Similarity-Based Estimation for Document Summarization using Fuzzy Sets
Source 
International Journal of Computer Science and Security (IJCSS)
Volume:  1    Issue:  4
Pages:  1-47
Publication Date:   December 2007
ISSN (Online): 1985-1553
Pages 
1 - 12
Author(s)  
 
Published Date   
30-12-2007 
Publisher 
CSC Journals, Kuala Lumpur, Malaysia
 
KEYWORDS:   fuzzy sets, mass assignment, asymmetric word similarity, topic similarity, summarization 
 
 
This manuscript is indexed in the following databases and websites:
1. Directory of Open Access Journals (DOAJ)
2. Docstoc
3. Scribd
4. PDFCAST
5. Google Scholar
6. WorldCat
7. ScientificCommons
8. Bielefeld Academic Search Engine (BASE)
9. ResearchGATE
10. iSEEK
11. Microsoft Academic Search
12. Academic Journals Database
13. Libsearch
14. slideshare
 
 
ABSTRACT:   The volume of information grows every day, and thousands of documents are produced and made available on the Internet. The amount of information in these documents exceeds our capacity to read them, so we need access to the right information without having to read each document in full. Documents therefore need to be condensed into an overview so that they can be used effectively. We propose a similarity model with topic similarity, based on fuzzy sets and probability theory, to extract the most representative sentences. Sentences with high weights are extracted to form a summary. On average, our model (known as MySum) produces summaries that are 60% similar to manually created summaries, while the tf.isf algorithm produces summaries that are 30% similar. By comparison, two human summarizers, P1 and P2, produce summaries that are 70% similar to each other on similar sets of documents obtained from TREC.
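
For orientation, the tf.isf baseline named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: it assumes one common formulation, in which a word's weight in a sentence is its term frequency multiplied by log(N / sf), where N is the number of sentences and sf is the number of sentences containing the word, and a sentence's score is the average weight of its words. The function names (tokenize, tf_isf_scores, summarize) and the simple regex tokenizer are hypothetical choices made for this example. The abstract does not specify stop-word handling, stemming, or the exact weighting variant, so none of these are modeled here.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase alphabetic tokens only; no stemming or stop-word removal.
    return re.findall(r"[a-z]+", sentence.lower())


def tf_isf_scores(sentences):
    """Return one tf.isf score per sentence (average tf * isf over its tokens)."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Sentence frequency: in how many sentences each word occurs.
    sf = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        # isf(w) = log(n / sf[w]); words found in every sentence contribute 0.
        total = sum(count * math.log(n / sf[w]) for w, count in tf.items())
        scores.append(total / len(toks))
    return scores


def summarize(sentences, k=1):
    """Pick the k highest-scoring sentences, returned in document order."""
    scores = tf_isf_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling summarize(sentences, k) on a list of sentence strings returns the k sentences with the highest average tf.isf weight, which mirrors the abstract's description of extracting high-weight sentences to form a summary. The MySum model itself relies on fuzzy sets and mass assignment to compute word and topic similarity, which this sketch does not attempt to reproduce.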
 
 
Masrah Azrifah Azmi Murad : Colleagues
Trevor Martin : Colleagues  