This is an Open Access publication published under CSC-OpenAccess Policy.
MMI Diversity Based Text Summarization
Mohammed Salem Binwahlan, Naomie Salim, Ladda Suanmali
Pages - 23 - 33     |    Revised - 20-03-2009     |    Published - 15-03-2009
Volume - 3   Issue - 1    |    Publication Date - February 2009
KEYWORDS
Binary tree, Diversity, MMR, Summarization, Similarity threshold
ABSTRACT
Searching for interesting information in a huge data collection is a tough job that frustrates those seeking it. Automatic text summarization has emerged to facilitate this search. Selecting the distinct ideas (the “diversity”) of the original document can produce an appropriate summary, and combining multiple means can help to find that diversity in the text. In this paper, we propose an approach to text summarization in which three sources of evidence (clustering, a binary tree, and a diversity-based method) are employed to help find the document's distinct ideas. The emphasis of our approach is on controlling redundancy in the summarized text. Clustering plays a very important role, and some clustering algorithms perform better than others. We therefore conducted an experiment comparing two clustering algorithms (k-means and complete linkage) based on the performance of our method; the results showed that k-means performs better than complete linkage. In general, the experimental results showed that our method performs well for text summarization compared with the benchmark methods used in this study.
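The idea of trading relevance against redundancy can be illustrated with a small sketch. It follows the general MMR (maximal marginal relevance) scheme named in the keywords, not the authors' MMI method, whose clustering and binary-tree components are not described on this page. The bag-of-words representation, the function names (`bag_of_words`, `cosine`, `mmr_select`) and the `trade_off` parameter are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple sentence representation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(sentences, k, trade_off=0.7):
    """Greedily pick up to k sentence indices (illustrative MMR-style selection).

    trade_off = 1.0 ranks by relevance only; lower values penalise
    sentences that resemble ones already selected.
    """
    vectors = [bag_of_words(s) for s in sentences]
    document = sum(vectors, Counter())  # whole-document vector used for relevance
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(vectors[i], document)
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low enough `trade_off`, a sentence that duplicates an already selected one is passed over in favour of a less relevant but more distinct sentence, which is the redundancy control the abstract emphasises.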
CITED BY (11)  
1 Al-Saeedan, W., & Menai, M. E. B. (2015). Swarm intelligence for natural language processing. International Journal of Artificial Intelligence and Soft Computing, 5(2), 117-150.
2 Alguliyev, R. M., Aliguliyev, R. M., & Isazade, N. R. (2015). An unsupervised approach to generating generic summaries of documents. Applied Soft Computing.
3 Alguliev, R. M., Aliguliyev, R. M., & Isazade, N. R. (2013). Mr&Mr-Sum: Maximum Relevance And Minimum Redundancy Document Summarization Model. International Journal of Information Technology & Decision Making, 12(03), 361-393.
4 Alguliev, R. M., Aliguliyev, R. M., & Isazade, N. R. (2012). DESAMC+ DocSum: Differential evolution with self-adaptive mutation and crossover parameters for multi-document summarization. Knowledge-Based Systems, 36, 21-38.
5 Alguliev, R. M., Aliguliyev, R. M., & Hajirahimova, M. S. (2012). GenDocSum+ MCLR: Generic document summarization based on maximum coverage and less redundancy. Expert Systems with Applications, 39(16), 12460-12473.
6 R. M. Alguliev, R. M. Aliguliyev and C. A. Mehdiyev, “pSum-SaDE: A Modified p-Median Problem and Self-Adaptive Differential Evolution Algorithm for Text Summarization”, Applied Computational Intelligence and Soft Computing, Vol. 2011, Article ID 351498, 13 pages, 2011.
7 R. M. Alguliev, R. M. Aliguliyev, M. S. Hajirahimova and C. A. Mehdiyev, “MCMR: Maximum Coverage and Minimum Redundant Text Summarization Model”, Expert Systems with Applications, 38(12), pp. 14514–14522, 2011.
8 Kent, C. K., & Salim, N. (2011, December). Web Based Cross Language Semantic Plagiarism Detection. In Dependable, Autonomic and Secure Computing (DASC), 2011 IEEE Ninth International Conference on (pp. 1096-1102). IEEE.
9 M. S. Binwahlan, N. Salim and L. Suanmali, “Fuzzy Swarm Diversity Hybrid Model for Text Summarization”, Information Processing & Management, 46(5), pp. 571-588, 2010.
10 M. S. Binwahlan, N. Salim and L. Suanmali, “Swarm Diversity Based Text Summarization”, Computer Science Neural Information Processing Lecture Notes in Computer Science, Vol. 5864/2009, pp. 216-225, 2009.
11 Binwahlan, M. S., Salim, N., & Suanmali, L. (2009, June). Integrating of the diversity and swarm based methods for text summarization. In The 5th postgraduate annual research seminar (PARS) (pp. 17-19).
Mr. Mohammed Salem Binwahlan
UTM - Malaysia
moham2007med@yahoo.com
Assistant Professor Naomie Salim
UTM - Malaysia
Mr. Ladda Suanmali
UTM - Thailand