Designing a Rule Based Stemmer for Afaan Oromo Text
Debela Tesfaye Gemechu, Ermias Abebe
Pages - 1 - 11     |    Revised - 30-08-2010     |    Published - 30-10-2010
Volume - 1   Issue - 2    |    Publication Date - October 2010
KEYWORDS
Afan Oromo stemmer, Rule based Stemmer, context sensitive stemming
ABSTRACT
Most natural language processing systems use a stemmer as a separate module in their architecture. Stemming is especially important for developing machine translators, speech recognizers and search engines. In this work, a stemming system for Afaan Oromo is presented. The system takes a word as input and removes its affixes according to a rule-based algorithm. The result of the study is a prototype context-sensitive, iterative stemmer for Afaan Oromo. An error-counting technique was employed to evaluate the performance of the stemmer. The errors were analyzed and classified into two categories: under-stemming and over-stemming errors. For testing, a corpus collected from different public Afaan Oromo newspapers and bulletins was used. Newspapers, bulletins and public magazines cover a range of community issues: social, economic, technological and political. Drawing on them keeps the test set balanced and reduces the probability of biasing the corpus toward specific words that do not appear in everyday life. Based on the evaluation of the experiments, the overall accuracy of the stemmer is encouraging, which shows that stemming can be performed with low error rates in highly inflected languages such as Afaan Oromo.
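To make the terms in the abstract concrete, the following is a minimal Python sketch of an iterative suffix stripper with a simple context condition, together with an error counter that separates under-stemming from over-stemming. It is an illustration under stated assumptions, not the authors' rule set: the suffix table, the minimum-stem-length condition, the gold stems, and the names `RULES`, `stem` and `count_errors` are all hypothetical, and the length-based error classification is a simplification.

```python
# Hypothetical rule table: (suffix, minimum stem length that must remain).
# These entries are placeholders for illustration, not the paper's rules.
RULES = [
    ("oota", 3),
    ("wwan", 3),
    ("tti", 3),
    ("n", 4),
]


def stem(word, rules=RULES):
    """Strip one matching suffix per pass, repeating until no rule applies."""
    word = word.lower()
    changed = True
    while changed:
        changed = False
        for suffix, min_stem in rules:
            stem_len = len(word) - len(suffix)
            # Context condition: strip only if enough of the word remains.
            if word.endswith(suffix) and stem_len >= min_stem:
                word = word[:stem_len]
                changed = True
                break  # restart from the first rule on the shortened word
    return word


def count_errors(stemmer, gold):
    """Count wrong outputs against gold stems as (under, over) stemming.

    Simplification: an output longer than the gold stem counts as
    under-stemming; any other mismatch counts as over-stemming.
    """
    under = over = 0
    for word, expected in gold.items():
        result = stemmer(word)
        if result == expected:
            continue
        if len(result) > len(expected):
            under += 1  # too little was removed
        else:
            over += 1   # too much was removed
    return under, over


if __name__ == "__main__":
    # Assumed gold stems, chosen only to exercise both error types.
    gold = {"namoota": "nam", "bishaan": "bishaan", "manneen": "man"}
    for w in gold:
        print(w, "->", stem(w))
    print("under, over =", count_errors(stem, gold))
```

With this toy table, a word such as `namootatti` is reduced in two passes (first `tti`, then `oota`), which is what "iterative" means here, while the minimum stem length stands in for a context-sensitive condition that blocks a rule when too little of the word would remain.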
CITED BY (12)  
1 Wegari, G. M., Melucci, M., & Teferra, S. (2015, September). Suffix sequences based morphological segmentation for Afaan Oromo. In AFRICON, 2015 (pp. 1-6). IEEE.
2 Thapar, P. (2014). A Hybrid Approach used to Stem Punjabi Words.
3 Girma, T., Landage, S. M., Wasif, A. I., Dhuppe, P., Kumar, M., & Sharma, S. (2014). Human language technologies and Affan Oromo. International Journal of Advanced Research in Engineering and Applied Sciences, 3(5), 1-13.
4 Misikir, T. (2013). Developing a Stemming Algorithm for Awngi Text (Doctoral dissertation, AAU).
5 Nigussie, E. (2013). Afaan Oromo–Amharic Cross Lingual Information Retrieval (Doctoral dissertation, AAU).
6 ABEDO, M. K. (2012). school of graduates studies department of information science (Doctoral dissertation, Addis Ababa University).
7 Fisseha, Y. (2011). Development of Stemming Algorithm for Tigrgna Text (Doctoral dissertation, AAU).
8 Kumar, D., & Rana, P. (2011). Stemming of Punjabi Words by using Brute Force Technique. International Journal of Engineering Science and Technology (IJEST) Vol, 3, 1351-1357.
9 HENOK, B. (2011). Dsp based impelementation of field-weakening on synchronous motor for high speed operation (doctoral dissertation, aau).
10 Tesfaye, D. (2011). A rule-based Afan Oromo Grammar Checker. IJACSA Editorial.
11 YONAS, F. (2011). Development of stemming algorithm for Tigrigna text.
12 Kumar, D., & Rana, P. Stemming of Punjabi Words by using Brute Force Technique. Department of Information Technology, DAVIET, Jalandhar.
Mr. Debela Tesfaye Gemechu
Jimma University - Ethiopia
dabookoo@yahoo.com
Mr. Ermias Abebe
- Ethiopia

