This is an Open Access publication published under CSC-OpenAccess Policy.
Implementation of Urdu Probabilistic Parser
Neelam Mukhtar, Mohammad Abid Khan, Fatima Tuz Zuhra, Nadia Chiragh
Pages - 12 - 20     |    Revised - 15-07-2012     |    Published - 10-08-2012
Volume - 3   Issue - 1    |    Publication Date - October 2012
Urdu Probabilistic Parser, Urdu PCFG, Results of Urdu Probabilistic Parser
The main contribution of this research work is the implementation of an Urdu probabilistic parser. First, a large amount of Urdu text was collected from different sources, and the sentences in the text were tagged. The tagged sentences were then parsed by a chart parser to formulate the rules. Next, probabilities were assigned to these rules to obtain a Probabilistic Context-Free Grammar (PCFG). The parser itself uses a multi-path shift-reduce strategy. The developed software performs the syntactic analysis of a sentence using a given set of probabilistic phrase structure rules. From the set of possible parses produced by the parser, the parse with the highest probability is selected as the most suitable one. The structure of each sentence is represented as a sequence of successive rules. The parser parses sentences with 74% accuracy.
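The pipeline described in the abstract (rule probabilities, multi-path shift-reduce parsing, selection of the most probable parse) can be illustrated with a minimal Python sketch. It is not the authors' software. The tag names (`NN`, `VB`), the toy rules, and the rule counts are invented for illustration. Relative-frequency estimation is an assumption, since the abstract does not say how probabilities were assigned. Exhaustively following every shift and reduce path is a simplification of the multi-path idea.

```python
from collections import Counter

def estimate_pcfg(rule_counts):
    """Assumed estimator: P(A -> rhs) = count(A -> rhs) / count(A)."""
    lhs_totals = Counter()
    for (lhs, _rhs), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}

def multipath_shift_reduce(tags, pcfg, start="S"):
    """Follow every shift/reduce path; return (probability, rules) per complete parse.

    Assumes the grammar has no empty right-hand sides and no unary cycles,
    otherwise the search would not terminate.
    """
    complete = []
    # Each path: (stack, next input position, probability so far, rules applied)
    agenda = [((), 0, 1.0, ())]
    while agenda:
        stack, pos, prob, rules = agenda.pop()
        if pos == len(tags) and stack == (start,):
            complete.append((prob, rules))
            continue
        # Shift: move the next tag onto the stack.
        if pos < len(tags):
            agenda.append((stack + (tags[pos],), pos + 1, prob, rules))
        # Reduce: replace a matching right-hand side on top of the stack.
        for (lhs, rhs), p in pcfg.items():
            n = len(rhs)
            if n <= len(stack) and stack[-n:] == rhs:
                agenda.append(
                    (stack[:-n] + (lhs,), pos, prob * p, rules + ((lhs, rhs),))
                )
    return complete

# Hypothetical rule counts, standing in for rules read off chart-parsed sentences.
rule_counts = {
    ("S", ("NP", "VP")): 10,
    ("VP", ("NP", "VB")): 6,   # object followed by verb
    ("VP", ("VB",)): 4,
    ("NP", ("NN",)): 12,
    ("NP", ("NN", "NN")): 4,   # compound noun
}
pcfg = estimate_pcfg(rule_counts)

# A tagged sentence with two competing analyses.
parses = multipath_shift_reduce(("NN", "NN", "VB"), pcfg)
best_prob, best_rules = max(parses, key=lambda parse: parse[0])

# The chosen parse, written out as successive rules.
for lhs, rhs in best_rules:
    print(lhs, "->", " ".join(rhs))
print("probability:", round(best_prob, 4))
```

With these invented counts, the tag sequence has two complete parses. The one that treats the two nouns as separate noun phrases scores higher than the compound-noun reading, so it is the one selected.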
CITED BY
1 Abbas, Q. Morphologically rich Urdu grammar parsing using Earley algorithm. Natural Language Engineering, 1-36.
2 Abbas, Q. (2014). Building Computational Resources: The URDU.KON-TB Treebank and the Urdu Parser (Doctoral dissertation).
Mr. Neelam Mukhtar
- Pakistan
Dr. Mohammad Abid Khan
- Pakistan
Miss Fatima Tuz Zuhra
- Pakistan
Miss Nadia Chiragh
- Pakistan