This is an Open Access publication published under CSC-OpenAccess Policy.
Automatic Generation of Multiple Choice Questions using Surface-based Semantic Relations
Naveed Afzal
Pages - 26 - 44     |    Revised - 31-08-2015     |    Published - 30-09-2015
Volume - 6   Issue - 3    |    Publication Date - September 2015
KEYWORDS
E-Learning, Automatic Assessment, Educational Assessment, Natural Language Processing, Information Extraction, Unsupervised Relation Extraction, Multiple Choice Questions Generation, Biomedical Domain.
ABSTRACT
Multiple Choice Questions (MCQs) are a popular large-scale assessment tool. MCQs make it much easier for test-takers to take tests and for examiners to interpret their results; however, they are very expensive to compile manually, and they often need to be produced on a large scale and within short iterative cycles. We examine the problem of automated MCQ generation with the help of unsupervised Relation Extraction, a technique used in a number of related Natural Language Processing problems. Unsupervised Relation Extraction aims to identify the most important named entities and terminology in a document and then recognize semantic relations between them, without any prior knowledge of the semantic types of the relations or their specific linguistic realization. We investigated a number of relation extraction patterns and tested a number of assumptions about the linguistic expression of semantic relations between named entities. Our findings indicate that an optimized configuration of our MCQ generation system is capable of achieving high precision rates, which are much more important than recall in the automatic generation of MCQs. Enhancing the system with linguistic knowledge further helps to produce significantly better patterns. We furthermore carried out a user-centric evaluation of the system, in which domain experts from the biomedical domain evaluated automatically generated MCQ items in terms of readability, usefulness of semantic relations, relevance, acceptability of questions and distractors, and overall MCQ usability. The results of this evaluation make it possible for us to draw conclusions about the utility of the approach in practical e-Learning applications.
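The general idea described in the abstract (find named entities, collect the surface text that links them as candidate relation patterns, then turn a matching sentence into a question) can be illustrated with a toy sketch. This is not the system described in the paper: the `[TYPE:text]` markup, the example sentences, and the function names `surface_patterns`, `rank_patterns` and `make_mcq` are all assumptions made for illustration, and a simple frequency count stands in for whatever pattern ranking the authors actually used.

```python
import re
from collections import Counter

# Hypothetical toy input: named entities are assumed to be pre-tagged as [TYPE:text].
SENTENCES = [
    "[PROTEIN:p53] activates [GENE:MDM2] in human cells.",
    "[PROTEIN:NF-kB] activates [GENE:IL6] after stimulation.",
    "[PROTEIN:STAT3] binds to [DNA:GAS element] in the nucleus.",
]

ENTITY = re.compile(r"\[([A-Z]+):([^\]]+)\]")


def surface_patterns(sentence):
    """Yield (pattern, left_text, right_text) for each pair of adjacent entities.

    The pattern is the entity types plus the literal words between them.
    """
    matches = list(ENTITY.finditer(sentence))
    for left, right in zip(matches, matches[1:]):
        between = sentence[left.end():right.start()].strip()
        if between:
            pattern = f"{left.group(1)} {between} {right.group(1)}"
            yield pattern, left.group(2), right.group(2)


def rank_patterns(sentences):
    """Count how often each surface pattern occurs (a stand-in for real ranking)."""
    counts = Counter()
    for sentence in sentences:
        for pattern, _, _ in surface_patterns(sentence):
            counts[pattern] += 1
    return counts


def make_mcq(sentence, entity_pool):
    """Blank out the second entity and offer same-type entities as distractors."""
    matches = list(ENTITY.finditer(sentence))
    if len(matches) < 2:
        return None
    target = matches[1]
    etype, answer = target.group(1), target.group(2)
    stem = sentence[:target.start()] + "_____" + sentence[target.end():]
    stem = ENTITY.sub(lambda m: m.group(2), stem)  # strip markup from other entities
    distractors = sorted(text for t, text in entity_pool if t == etype and text != answer)
    return {"stem": stem, "answer": answer, "distractors": distractors}


# Pool of (type, text) pairs from which distractors are drawn.
ENTITY_POOL = {(m.group(1), m.group(2)) for s in SENTENCES for m in ENTITY.finditer(s)}
PATTERN_COUNTS = rank_patterns(SENTENCES)
EXAMPLE_MCQ = make_mcq(SENTENCES[0], ENTITY_POOL)
```

In this sketch a pattern seen more than once (here `PROTEIN activates GENE`) would be preferred over one seen only once, which loosely mirrors the abstract's emphasis on precision over recall: it is better to generate fewer questions from reliable patterns than many from noisy ones.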
CITED BY (1)  
1 Afzal, N., & Bawakid, A. (2015). Comparison between Surface-based and Dependency-based Relation Extraction Approaches for Automatic Generation of Multiple-Choice Questions.
Dr. Naveed Afzal
Mayo Clinic - United States of America
dr.na.bhatti@gmail.com