Testing Various Similarity Metrics and their Permutations with Clustering Approach in Context Free Data Cleaning
Source 
International Journal of Computer Science and Security (IJCSS)
Volume:  3    Issue:  5
Pages:  334-447
Publication Date:   November 2009
ISSN (Online): 1985-1553
Pages 
344 - 350
Author(s)
Sohil Dineshkumar Pandya, Paresh V Virparia
Published Date   
30-11-2009 
Publisher 
CSC Journals, Kuala Lumpur, Malaysia
ADDITIONAL INFORMATION
Keywords   Abstract   References   Cited by   Related Articles   Collaborative Colleague
 
KEYWORDS:   Context free data cleaning, Clustering, Sequence similarity metrics 
 
 
This manuscript is indexed in the following databases/websites:
1. Directory of Open Access Journals (DOAJ)
2. OpenJ-Gate
3. Scribd
4. PDFCAST
5. Docstoc
6. Google Scholar
7. CiteSeerX
8. ScientificCommons
9. WorldCat
10. refSeek
11. ResearchGATE
12. Bielefeld Academic Search Engine (BASE)
13. iSEEK
14. Academic Journals Database
15. Libsearch
16. slideshare
 
 
ABSTRACT:   Organizations can sustain growth in this knowledge era through proficient data analysis, which relies heavily on the quality of data. This paper emphasizes the use of sequence similarity metrics with a clustering approach in context free data cleaning to improve data quality by reducing noise. The authors propose an algorithm that tests the suitability of a value for correcting other values of an attribute, based on the distance between them. Sequence similarity metrics such as Needleman-Wunsch, Jaro-Winkler, Chapman Ordered Name Similarity and Smith-Waterman are used to find the distance between two values. Experimental results show how the approach can effectively clean data without reference data.
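
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which this page does not reproduce. It assumes a greedy scheme in which the most frequent values of an attribute act as cluster centers and any other value whose Jaro-Winkler similarity to a center reaches a threshold is corrected to that center. The function names, the threshold of 0.9 and the sample city names are all hypothetical choices made for illustration.

```python
from collections import Counter


def jaro(s1, s2):
    """Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1):
    """Jaro-Winkler similarity: Jaro boosted by a common prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def clean_attribute(values, threshold=0.9):
    """Replace each value with the most similar frequent value (assumed scheme).

    Values are visited from most to least frequent. A value joins the closest
    existing center if the similarity reaches `threshold`; otherwise it becomes
    a new center and is left unchanged.
    """
    centers = []
    mapping = {}
    for value, _count in Counter(values).most_common():
        best = max(centers, key=lambda c: jaro_winkler(value, c), default=None)
        if best is not None and jaro_winkler(value, best) >= threshold:
            mapping[value] = best
        else:
            centers.append(value)
            mapping[value] = value
    return [mapping[v] for v in values]


# Hypothetical noisy attribute: city names with typing errors.
cities = ["Ahmedabad"] * 3 + ["Ahmedabd", "Ahemdabad", "Mumbai", "Mumbai", "Mumbay"]
cleaned = clean_attribute(cities)
print(cleaned)
```

With this sample input the three misspellings are mapped onto "Ahmedabad" and "Mumbai" without any external reference list, which is the sense in which the cleaning is context free. Any of the other metrics named in the abstract could be substituted for `jaro_winkler`, provided the threshold is adjusted to that metric's scale.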
 
 
 
REFERENCES:
1. Hui Xiong, Gaurav Pandey, Michael Steinbach, Vipin Kumar. “Enhancing Data Analysis with Noise Removal”. IEEE Transactions on Knowledge & Data Engineering, 18(3):304-319, 2006.
2. Lukasz Ciszak. “Applications of Clustering and Association Methods in Data Cleaning”. In Proceedings of the International Multiconference on Computer Science and Information Technology, 2008.
3. Sohil D Pandya, Dr. Paresh V Virparia. “Data Cleaning in Knowledge Discovery in Databases: Various Approaches”. In Proceedings of the National Seminar on Current Trends in ICT, INDIA, 2009.
4. Sohil D Pandya, Dr. Paresh V Virparia. “Clustering Approach in Context Free Data Cleaning”. National Journal on System & Information Technology, 2(1):83-90, 2009.
5. Sohil D Pandya, Dr. Paresh V Virparia. “Application of Various Permutations of Similarity Metrics with Clustering Approach in Context Free Data Cleaning”. In Proceedings of the National Symposium on Indian IT @ CROXRoads, INDIA, 2009.
6. W Cohen, P Ravikumar, S Fienberg. “A Comparison of String Distance Metrics for Name-Matching Tasks”. In Proceedings of the IJCAI, 2003.
7. http://en.wikipedia.org/
8. http://www.dcs.shef.ac.uk/~sam/simmetric.html
 
 
 
CITED BY:
1. S. D. Pandya and P. V. Virparia, “Context Free Data Cleaning and its Application in Mechanism for Suggestive Data Cleaning”, International Journal of Information Science, 1(1), pp. 32-35, 2011.
2. R. Ahmad and A. Khanum, “Document Topic Generation in Text Mining by using Cluster Analysis with EROCK”, International Journal of Computer Science and Security (IJCSS), 4(2), pp. 176-182, 2010.
 
 
 
1. TechRepublic
2. Academia.edu
3. ZDNet
4. 4shared
5. Scientific & Academic Publishing Co
 
 
 
Sohil Dineshkumar Pandya : Colleagues
Paresh V Virparia : Colleagues  
 
 
 
Copyrights (c) 2012 Computer Science Journals. All rights reserved.
 
  
 
Copyrights & Usage: Articles published by CSC Journals are Open Access. Permission to copy and distribute any other content, images, animations and other parts of this website is prohibited. CSC Journals has the right to take action against any individual or group found copying these parts of the website.