This is an Open Access publication published under CSC-OpenAccess Policy.
A Novel High Accuracy Algorithm for Reference Assembly in Colour Space
Balazs Gor, Anett Balla, Edit Tukacs, Istvan Nagy, Zsolt Torok
Pages - 92 - 104     |    Revised - 15-07-2012     |    Published - 10-08-2012
Volume - 6   Issue - 4    |    Publication Date - August 2012
Smith-Waterman, Reference Assembly, Colour Space Code, Algorithm, Next Generation Sequencing.
Although numerous algorithms exist for genome alignment using Next Generation Sequencing tags, assembly of colour-coded reads remains a challenge. We present a novel pairwise sequence alignment algorithm derived from the Smith-Waterman method. The distinguishing feature of the algorithm is that it translates the reference sequence into colour code and performs the alignment in colour space. Operating in colour space, it can prevent most assembly errors that stem from read errors. Because it is based on dynamic programming, it yields the optimal alignment in colour space. Validation on an empirical dataset against capillary sequencing demonstrated high mapping accuracy. The algorithm can be integrated into any reference assembly software, improving mapping accuracy while maintaining high mapping speed.
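The idea described in the abstract, translating the reference into colour code and then aligning in colour space, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the standard di-base encoding (each colour is the XOR of the 2-bit codes of two adjacent bases) and pairs it with a plain Smith-Waterman local alignment. The function names `to_colour_space` and `smith_waterman` and the scoring values (`match`, `mismatch`, `gap`) are hypothetical choices made for illustration only. The sketch also ignores the leading primer base that real colour-space reads carry.

```python
# Illustrative sketch only; names and scores are assumptions, not the paper's method.
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colour_space(seq):
    """Encode each adjacent base pair as one colour (0-3): XOR of the 2-bit base codes."""
    codes = [BASE_BITS[base] for base in seq.upper()]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


def smith_waterman(ref, read, match=2, mismatch=-1, gap=-2):
    """Plain Smith-Waterman with a linear gap penalty.

    Returns (best_score, ref_end, read_end), where the end positions are
    1-based indices of the last aligned symbol in each sequence.
    """
    cols = len(ref) + 1
    prev = [0] * cols          # previous row of the dynamic programming matrix
    best = (0, 0, 0)
    for i in range(1, len(read) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            step = match if read[i - 1] == ref[j - 1] else mismatch
            cur[j] = max(0,                 # local alignment: never below zero
                         prev[j - 1] + step,  # match or mismatch
                         prev[j] + gap,       # gap in the reference
                         cur[j - 1] + gap)    # gap in the read
            if cur[j] > best[0]:
                best = (cur[j], j, i)
        prev = cur
    return best


if __name__ == "__main__":
    reference = to_colour_space("TTACGGATCCA")  # reference translated to colours
    read = to_colour_space("CGGATC")            # stand-in for a colour-space read
    print(reference, read, smith_waterman(reference, read))
```

In this encoding `to_colour_space("ATGGCA")` gives `31031`. Because both sequences are compared as colours, a single miscalled colour costs one mismatch in the alignment, whereas decoding the read to bases first would corrupt every base after the error.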
Dr. Balazs Gor - Hungary
Mr. Anett Balla - Hungary
Mr. Edit Tukacs - Hungary
Mr. Istvan Nagy - Hungary
Mr. Zsolt Torok - Hungary