A Biological Sequence Compression Based on cross chromosomal similarities using Variable length LUT
Source 
International Journal of Biometrics and Bioinformatics (IJBB)
Volume:  4    Issue:  6
Pages:  194-234
Publication Date:   February
ISSN (Online): 1985-2347
Pages 
217 - 223
Author(s)
Rajendra Kumar Bharti, Archana Verma, R.K. Singh
Published Date   
08-02-2011 
Publisher 
CSC Journals, Kuala Lumpur, Malaysia
 
KEYWORDS:   Biological sequences, chromosome, cross chromosomal similarity, compression gain, prediction 
 
 
This manuscript is indexed in the following databases/websites:
1. Directory of Open Access Journals (DOAJ)
2. refSeek
3. Scribd
4. iSEEK
5. Docstoc
6. Google Scholar
7. WorldCat
8. PDFCAST
9. Bielefeld Academic Search Engine (BASE)
10. ResearchGATE
11. Academic Journals Database
12. Libsearch
 
 
ABSTRACT:   While modern hardware provides vast amounts of inexpensive storage for biological databases, compressing biological sequences remains important because it reduces disk traffic and so speeds up search and retrieval. The issue has become more pressing with the recent growth of very large data sets such as metagenomes. Existing biological sequence compression algorithms find similar repeated regions within a sequence and encode those regions together. Previous research on chromosome sequence similarity shows that similar repeated regions within a single chromosome account for only about 4.5% of the total sequence length, so the compression gain from them is often modest. It is well recognized that similarities exist among different regions of chromosome sequences, which implies that similar repeated sequences occur among those regions. Here, we apply cross-chromosomal similarity to biological sequence compression. We study the length and location of similar repeated regions among the different biological sequences. The average percentage of similar subsequences found between two chromosome sequences is about 10%, of which 8% comes from cross-chromosomal prediction and 2% from self-chromosomal prediction. Among the different biological sequences studied, the percentage of similar subsequences is about 18%, of which only 1.2% comes from self-chromosomal prediction and the rest from cross-chromosomal prediction. This suggests that cross-chromosomal similarities are significant, in addition to self-chromosomal similarities, for biological sequence compression. On average, using both self-chromosomal and cross-chromosomal prediction saves an additional 23% of storage space when compressing the different biological sequences.
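To make the distinction between self-chromosomal and cross-chromosomal similarity concrete, the following is a minimal sketch and not the method used in the paper. It assumes exact k-mer matching as a stand-in for the similarity search. The function name `similarity_coverage`, the parameter `k`, and the toy sequences are all hypothetical. It reports the fraction of a target sequence covered by substrings that already occurred earlier in the target (self) and the fraction covered only by substrings that occur in a second sequence (cross).

```python
def similarity_coverage(target, reference, k=12):
    """Return (self_fraction, cross_fraction) for `target`.

    Assumption: "similar" means an exact k-mer match. This is an
    illustrative simplification, not the paper's similarity measure.
    """
    n = len(target)
    if n == 0:
        return 0.0, 0.0

    # All k-mers present in the reference sequence (the other chromosome).
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}

    seen = set()                 # k-mers already met earlier in the target
    self_cov = [False] * n       # positions predictable from the target itself
    cross_cov = [False] * n      # positions predictable from the reference

    for i in range(n - k + 1):
        kmer = target[i:i + k]
        if kmer in seen:
            for j in range(i, i + k):
                self_cov[j] = True
        elif kmer in ref_kmers:
            for j in range(i, i + k):
                cross_cov[j] = True
        seen.add(kmer)

    self_n = sum(self_cov)
    # Count a position as cross-covered only if self-prediction missed it.
    cross_n = sum(c and not s for s, c in zip(self_cov, cross_cov))
    return self_n / n, cross_n / n


# Toy example with short, made-up sequences and a small k.
print(similarity_coverage("ACGTACGTTTGGCCAA", "GGGTTGGCCAAGG", k=4))
# prints (0.25, 0.5)
```

In this toy case the repeated `ACGT` accounts for the self-covered quarter of the target, while the run `TTGGCCAA` is found only in the second sequence. This mirrors the abstract's observation that cross-chromosomal prediction can contribute more than self-chromosomal prediction.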
 
 
 
 
 
 
 
 
 
 
 
Rajendra Kumar Bharti
Archana Verma
R.K. Singh
 
 
 
 
Copyrights (c) 2012 Computer Science Journals. All rights reserved.
 
  
 
Copyrights & Usage: Articles published by CSC Journals are Open Access. Copying and distributing any other content, images, animation, or other parts of this website is prohibited. CSC Journals reserves the right to take action against any individual or group found copying these parts of the website.