Link to original content: https://doi.org/10.1023/A:1026032530166

Using a Genetic Algorithm and a Perceptron for Feature Selection and Supervised Class Learning in DNA Microarray Data

Artificial Intelligence Review

Abstract

Class prediction and feature selection are key in the context of diagnostic applications of DNA microarrays. Microarray data are noisy and typically composed of a small number of samples and a large number of genes. Perceptrons can constitute an efficient tool for accurate classification of microarray data. Nevertheless, the large input layers necessary for the direct application of perceptrons and the small number of samples available for the training process hamper their use. Two strategies can be taken for an optimal use of a perceptron with a favourable balance between samples for training and the size of the input layer: (a) reducing the dimensionality of the data set from thousands to no more than one hundred highly informative average values, and using the weights of the perceptron for feature selection, or (b) using a selection of only a few genes that produce an optimal classification with the perceptron. In the latter case, feature selection is carried out first. Obviously, a combined approach is also possible. In this manuscript we explore and compare both alternatives. We study the informative content of the data at different levels of compression with a very efficient clustering algorithm (Self Organizing Tree Algorithm). We show how a simple genetic algorithm selects a subset of gene expression values with 100% accuracy in the classification of samples with maximum efficiency. Finally, the importance of dimensionality reduction is discussed in light of its capacity for reducing noise and redundancies in microarray data.
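
To make strategy (b) concrete, the following is a minimal sketch of a genetic algorithm that searches for a small gene subset scored by a perceptron. It is an illustration under stated assumptions, not the authors' implementation: the synthetic data, the leave-one-out scoring, the tournament selection, the crossover and mutation operators, and all names and parameter values (make_data, fitness, select_genes, k, pop_size, and so on) are hypothetical choices made for this example.

```python
import random

def predict(w, x):
    """Threshold unit: weights are w[:-1], the bias is w[-1]."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + w[-1] > 0 else 0

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Classic perceptron rule; stops early once an epoch has no mistakes."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(X, y):
            delta = target - predict(w, x)
            if delta:
                mistakes += 1
                for j, xj in enumerate(x):
                    w[j] += lr * delta * xj
                w[-1] += lr * delta
        if mistakes == 0:
            break
    return w

def fitness(genes, X, y):
    """Leave-one-out accuracy of a perceptron that sees only `genes`.
    (The evaluation scheme is an assumption made for this sketch.)"""
    sub = [[row[g] for g in genes] for row in X]
    hits = 0
    for i, held_out in enumerate(sub):
        w = train_perceptron(sub[:i] + sub[i + 1:], y[:i] + y[i + 1:])
        hits += predict(w, held_out) == y[i]
    return hits / len(sub)

def select_genes(X, y, k=3, pop_size=12, generations=10, seed=0):
    """Simple GA over k-gene subsets: elitism, tournaments, point mutation."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    cache = {}

    def score(ind):
        if ind not in cache:  # fitness is deterministic, so cache it
            cache[ind] = fitness(ind, X, y)
        return cache[ind]

    pop = [tuple(sorted(rng.sample(range(n_genes), k))) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        children = [pop[0]]  # elitism: the best subset always survives
        while len(children) < pop_size:
            a = max(rng.sample(pop, 2), key=score)  # tournament of size 2
            b = max(rng.sample(pop, 2), key=score)
            child = rng.sample(sorted(set(a) | set(b)), k)  # crossover
            if rng.random() < 0.3:  # mutation: swap in a random gene
                child[rng.randrange(k)] = rng.randrange(n_genes)
            if len(set(child)) == k:  # discard children with repeated genes
                children.append(tuple(sorted(child)))
        pop = children
    best = max(pop, key=score)
    return best, score(best), history

def make_data(n_samples=16, n_genes=30, seed=1):
    """Synthetic stand-in for expression data: only gene 0 tracks the class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        row[0] = (2.0 if label else -2.0) + rng.uniform(-0.5, 0.5)
        X.append(row)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    best, best_fit, history = select_genes(X, y)
    print("selected genes:", best, "leave-one-out accuracy:", best_fit)
```

Because the subset size k stays far below the number of samples, the perceptron's input layer remains small relative to the training set, which is the balance the abstract argues for. Real microarray data would replace make_data, and the population size, mutation rate and number of generations would need tuning.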

Author information

Corresponding author

Correspondence to Joaquín Dopazo.

About this article

Cite this article

Karzynski, M., Mateos, Á., Herrero, J. et al. Using a Genetic Algorithm and a Perceptron for Feature Selection and Supervised Class Learning in DNA Microarray Data. Artificial Intelligence Review 20, 39–51 (2003). https://doi.org/10.1023/A:1026032530166

  • DOI: https://doi.org/10.1023/A:1026032530166
