This is an Open Access publication published under CSC-OpenAccess Policy.
Toward Integrated Clinical and Gene Expression Profiles for Breast Cancer Prognosis: A Review Paper
Farzana Kabir Ahmad, Safaai Deris, Nor Hayati Othman
Pages - 31 - 47 | Revised - 30-09-2009 | Published - 21-10-2009
Volume - 3 | Issue - 4 | Publication Date - August 2009
Keywords: Gene expression, Classification, Prognosis, Feature selection, Breast cancer
Breast cancer patients with the same diagnostic and clinical prognostic profile can have markedly different clinical outcomes. This difference is possibly caused by the limitations of current breast cancer prognostic indices, which group molecularly distinct patients into similar clinical classes based mainly on the morphology of the disease. Traditional clinical-based prognosis models have been found to be limited in addressing the heterogeneity of breast cancer. The invention of microarray technology, with its ability to interrogate thousands of genes simultaneously, has changed the paradigm of molecular classification of human cancers and has broadened the scope of clinical prognosis models. Numerous studies have revealed the potential value of gene expression signatures in assessing the risk of disease recurrence. However, most of these studies have attempted to use genetic-marker-based prognostic models to replace the traditional clinical markers, neglecting the rich information contained in clinical data. Therefore, this research seeks to integrate clinical and microarray data in order to obtain accurate breast cancer prognosis, on the premise that these data complement each other. This article presents a review of the development of breast cancer prognosis models, concentrating specifically on clinical and gene expression profiles. The literature is reviewed in an explicit machine learning framework, which includes the elements of feature selection and classification techniques.
CITED BY (6)  
1 Salem, H., Attiya, G., & El-Fishawy, N. (2015, December). Gene expression profiles based Human cancer diseases classification. In 2015 11th International Computer Engineering Conference (ICENCO) (pp. 181-187). IEEE.
2 Srivastava, S., & Joshi, N. (2014). Clustering Techniques Analysis for Microarray Data.
3 Ahmad, F. K., Deris, S., & Othman, N. H. (2012). The inference of breast cancer metastasis through gene regulatory networks. Journal of biomedical informatics, 45(2), 350-362.
4 Srivastava, S., Rathi, M., & Gupta, J. P. (2011). Predictive Analysis of Lung Cancer Recurrence. In Advances in Computing and Communications (pp. 260-269). Springer Berlin Heidelberg.
5 Pandya, A. S., Arimoto, A., Agarwal, A., & Kinouchi, Y. (2009). A novel approach for measuring electrical impedance tomography for local tissue with artificial intelligent algorithm. International Journal of Biometrics and Bioinformatics, 3(5), 66.
6 Agarwal, A., Pandya, A. S., Arimoto, A., & Kinouchi, Y. (2009). A Novel Approach for Measuring Electrical Impedance Tomography for Local Tissue with Artificial Intelligent Algorithm.
Miss Farzana Kabir Ahmad
- Malaysia
Mr. Safaai Deris
- Malaysia
Mr. Nor Hayati Othman
- Malaysia