Biological Significance of Gene Expression Data Using Similarity Based Biclustering Algorithm
Bagyamani J, Thangavel, Rathipriya R
Pages - 201 - 216     |    Revised - 31-01-2011     |    Published - 08-02-2011
Volume - 4   Issue - 6    |    Publication Date - February
Biclustering, Gene Expression Data, Query gene, Similarity, Top-Down Approach, Gene Ontology
Unlocking the complexity of a living organism’s biological processes, functions and genetic network is vital in learning how to improve human health. Genetic analysis, especially biclustering, is a significant step in this process. Though many biclustering methods exist, only a few provide a query-based approach that lets biologists search for the biclusters containing a particular gene of interest. The proposed query-based biclustering algorithm, SIMBIC+, first identifies a functionally rich query gene. It then identifies sets of genes, including the query gene, that show coherent expression patterns across subsets of experimental conditions. It clusters the row and column dimensions simultaneously to extract biclusters using a top-down approach. Because it uses a novel ‘ratio’ based similarity measure, it identifies biclusters with greater coherence and more biological meaning. SIMBIC+ uses a score-based approach that aims to maximize the similarity of the bicluster. Contribution-entropy-based condition selection and multiple row/column deletion are used to reduce the complexity of the search for biclusters with maximum similarity value. Experiments are conducted on the yeast Saccharomyces dataset, and the biclusters obtained are compared with those of the popular MSB (Maximum Similarity Bicluster) algorithm. The comparison of the biological significance of the biclusters obtained by the two algorithms shows that SIMBIC+ identifies biclusters with more significant GO (Gene Ontology) annotations.
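The top-down, query-based idea described in the abstract can be illustrated with a minimal sketch. This is not the published SIMBIC+ procedure: the abstract does not define the ‘ratio’ similarity, so the sketch assumes a min/max ratio between a gene's expression value and the query gene's value on the same condition, valid for positive values only. The function names, the threshold, and the single row-or-column deletion per step are likewise illustrative assumptions; the abstract mentions multiple row/column deletion and contribution-entropy-based condition selection, which are not modeled here.

```python
from typing import List, Tuple


def ratio_similarity(x: float, y: float) -> float:
    """Assumed ratio similarity in [0, 1]; NOT the paper's definition."""
    lo, hi = min(x, y), max(x, y)
    return lo / hi if hi > 0 else 0.0


def top_down_bicluster(
    data: List[List[float]],
    query: int,
    threshold: float = 0.9,
    min_rows: int = 2,
    min_cols: int = 2,
) -> Tuple[List[int], List[int], float]:
    """Start from the full matrix and greedily delete the row or column
    least similar to the query gene until the mean similarity of the
    remaining submatrix reaches the threshold (hypothetical stopping rule)."""
    rows = list(range(len(data)))
    cols = list(range(len(data[0])))

    def sim(i: int, j: int) -> float:
        return ratio_similarity(data[i][j], data[query][j])

    while True:
        row_score = {i: sum(sim(i, j) for j in cols) / len(cols) for i in rows}
        col_score = {j: sum(sim(i, j) for i in rows) / len(rows) for j in cols}
        overall = sum(row_score.values()) / len(rows)
        if overall >= threshold:
            break

        # The query gene itself is never a deletion candidate.
        candidates = [i for i in rows if i != query]
        worst_row = min(candidates, key=row_score.get) if candidates else None
        worst_col = min(cols, key=col_score.get)
        can_drop_row = worst_row is not None and len(rows) > min_rows
        can_drop_col = len(cols) > min_cols
        if not can_drop_row and not can_drop_col:
            break  # cannot shrink further; return the best found so far

        if can_drop_row and (
            not can_drop_col or row_score[worst_row] <= col_score[worst_col]
        ):
            rows.remove(worst_row)
        else:
            cols.remove(worst_col)

    return rows, cols, overall


# Toy expression matrix (made-up values): row 0 is the query gene.
toy = [
    [10.0, 20.0, 30.0, 40.0],
    [11.0, 21.0, 29.0, 5.0],
    [10.0, 19.0, 31.0, 80.0],
    [90.0, 2.0, 300.0, 1.0],
]
genes, conditions, score = top_down_bicluster(toy, query=0)
print(genes, conditions, round(score, 3))
```

On the toy matrix, the sketch discards the gene and the condition that disagree most with the query gene and keeps a submatrix whose mean similarity to the query is at least the threshold. A real implementation would need to handle non-positive or log-transformed expression values, which this ratio does not.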
Associate Professor Bagyamani J
Government Arts college, Dharmapuri - India
Dr. Thangavel
Periyar University - India
Mr. Rathipriya R
Periyar University - India