This is an Open Access publication published under CSC-OpenAccess Policy.
Biological Significance of Gene Expression Data Using Similarity Based Biclustering Algorithm
Bagyamani J, Thangavel, Rathipriya R
Pages - 201 - 216     |    Revised - 31-01-2011     |    Published - 08-02-2011
Volume - 4   Issue - 6    |    Publication Date - February
Biclustering, Gene Expression Data, Query gene, Similarity, Top-Down Approach, Gene Ontology
Unlocking the complexity of a living organism’s biological processes, functions and genetic network is vital in learning how to improve the health of humankind. Genetic analysis, especially biclustering, is a significant step in this process. Though many biclustering methods exist, only a few provide a query-based approach that lets biologists search for biclusters containing a particular gene of interest. The proposed query-based biclustering algorithm, SIMBIC+, first identifies a functionally rich query gene. It then identifies sets of genes, including the query gene, that show coherent expression patterns across subsets of experimental conditions. It clusters the row and column dimensions simultaneously, extracting biclusters with a top-down approach. Because it uses a novel ratio-based similarity measure, the biclusters it identifies are more coherent and more biologically meaningful. SIMBIC+ uses a score-based approach that aims to maximize the similarity of the bicluster. Contribution-entropy-based condition selection and multiple row/column deletion are used to reduce the complexity of finding biclusters with maximum similarity value. Experiments are conducted on the yeast Saccharomyces dataset, and the resulting biclusters are compared with those of the popular MSB (Maximum Similarity Bicluster) algorithm. Comparing the biological significance of the biclusters obtained by the two algorithms shows that SIMBIC+ identifies biclusters with more significant GO (Gene Ontology) terms.
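The general idea of query-driven, top-down biclustering with a ratio-based similarity can be illustrated with a minimal sketch. This is not the SIMBIC+ algorithm itself: the abstract does not give its formulas, so the coherence measure (minimum ratio divided by maximum ratio against the query gene), the function names, the threshold, and the toy matrix below are all assumptions made for illustration. The sketch also deletes one row or column per step and omits contribution-entropy-based condition selection. Expression values are assumed to be strictly positive.

```python
def row_similarity(row, query, cols):
    """Hypothetical ratio coherence of `row` against `query` over `cols`.

    Returns min(ratio) / max(ratio), which is 1.0 when the row is an exact
    multiple of the query on the selected conditions. Assumes positive values.
    """
    ratios = [row[j] / query[j] for j in cols]
    return min(ratios) / max(ratios)


def bicluster_score(data, q, rows, cols):
    """Mean ratio coherence of the selected rows against query row `q`."""
    return sum(row_similarity(data[i], data[q], cols) for i in rows) / len(rows)


def top_down_bicluster(data, q, threshold=0.9, min_rows=2, min_cols=2):
    """Start from the full matrix and greedily delete one row or column at a
    time, always taking the deletion that raises the score the most, until
    the score reaches `threshold` or no further deletion is allowed.
    The query row `q` is never deleted.
    """
    rows = list(range(len(data)))
    cols = list(range(len(data[0])))
    while bicluster_score(data, q, rows, cols) < threshold:
        candidates = []
        if len(rows) > min_rows:
            for i in rows:
                if i != q:
                    kept = [r for r in rows if r != i]
                    candidates.append((bicluster_score(data, q, kept, cols), kept, cols))
        if len(cols) > min_cols:
            for j in cols:
                kept = [c for c in cols if c != j]
                candidates.append((bicluster_score(data, q, rows, kept), rows, kept))
        if not candidates:
            break
        _, rows, cols = max(candidates, key=lambda c: c[0])
    return rows, cols


# Toy expression matrix (rows = genes, columns = conditions); values are made up.
data = [
    [1.0, 2.0, 3.0, 4.0],  # query gene
    [2.0, 4.0, 6.0, 8.0],  # exactly 2x the query on every condition
    [0.5, 1.0, 1.5, 9.0],  # 0.5x the query except on the last condition
    [5.0, 1.0, 7.0, 2.0],  # unrelated pattern
]
rows, cols = top_down_bicluster(data, q=0, threshold=0.95)
print(rows, cols)
```

On this toy matrix the sketch first drops the last condition, where the third gene breaks the scaling pattern, and then drops the unrelated fourth gene, leaving the three genes that scale with the query on the first three conditions.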
CITED BY (6)  
1 He, P., Xu, X., Ju, Y., Lu, L., & Xi, Y. (2014). Concept Mining of Binary Gene Expression Data. In Intelligent Computing in Bioinformatics (pp. 126-133). Springer International Publishing.
2 Bagyamani, J., Thangavel, K., & Rathipriya, R. (2013). Biclustering of gene expression data based on hybrid genetic algorithm. International Journal of Data Mining, Modelling and Management, 5(4), 333-350.
3 Bagyamani, J., Thangavel, K., & Rathipriya, R. (2013). Comparison of Biological Significance of Biclusters of SIMBIC and SIMBIC+ Biclustering Models.
4 Rathipriya, R., Thangavel, K., & Bagyamani, J. (2013). Usage Profile Generation from Web Usage Data Using Hybrid Biclustering Algorithm. Modeling Applications and Theoretical Innovations in Interdisciplinary Evolutionary Computation, 260.
5 Thangavel, K., Bagyamani, J., & Rathipriya, R. (2012). Novel hybrid PSO-SA model for biclustering of expression data. Procedia Engineering, 30, 1048-1055.
6 XU Xiao-hua, Xi Yan-Qiu, Zhou Jin Pan, Lu Lin, & Chen Jun. (2012). Singular vector space dual clustering algorithm. Microelectronics and Computer, 29(003), 78-83.
Associate Professor Bagyamani J
Government Arts College, Dharmapuri - India
Dr. Thangavel
Periyar University - India
Mr. Rathipriya R
Periyar University - India