Abstract
Predictive toxicology is the task of building models capable of determining, with a certain degree of accuracy, the toxicity of chemical compounds. Machine Learning (ML) in general, and lazy learning techniques in particular, have been applied to this task. ML approaches differ in the kind of chemistry knowledge they use, but all rely on some specific representation of chemical compounds. In this paper we deal with one specific issue of molecule representation: the multiplicity of descriptions that can be ascribed to a particular compound. We present a new approach to lazy learning, based on the notion of multiple-instance learning, which is capable of seamlessly working with multiple descriptions. An experimental analysis of this approach is presented using the Predictive Toxicology Challenge data set.
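To make the idea of lazy learning over multiple descriptions concrete, the following is a minimal sketch and not the method of the paper. It assumes that each compound is a "bag" of numeric feature vectors (one per description), that two bags are compared by the minimal Hausdorff distance (the distance between their closest pair of descriptions), and that a query is labelled by a k-nearest-neighbour vote. The names `bag_distance` and `classify`, the feature values, and the class labels are all hypothetical.

```python
from collections import Counter
from math import dist


def bag_distance(bag_a, bag_b):
    """Minimal Hausdorff distance: distance between the closest pair of
    descriptions, one taken from each bag (an assumed choice of metric)."""
    return min(dist(a, b) for a in bag_a for b in bag_b)


def classify(query_bag, cases, k=3):
    """Lazy k-NN: nothing is learned in advance; the stored cases are ranked
    by bag distance to the query and the k nearest vote on the label."""
    nearest = sorted(cases, key=lambda case: bag_distance(query_bag, case[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy case base: (bag of descriptions, label).
cases = [
    ([(0.0, 0.0), (0.2, 0.1)], "negative"),
    ([(0.1, 0.3)], "negative"),
    ([(1.0, 1.0), (0.9, 1.2)], "positive"),
    ([(1.1, 0.8), (5.0, 5.0)], "positive"),
]

# A query compound with two alternative descriptions.
query = [(0.95, 1.05), (4.0, 4.0)]
print(classify(query, cases, k=3))
```

Under these assumptions a compound needs only one description that lies close to a stored case in order to match it, which is one simple way of letting several descriptions coexist without first merging them into a single representation.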
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Armengol, E., Plaza, E. (2004). Multiple-Instance Case-Based Learning for Predictive Toxicology. In: López, J.A., Benfenati, E., Dubitzky, W. (eds) Knowledge Exploration in Life Science Informatics. KELSI 2004. Lecture Notes in Computer Science, vol 3303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30478-4_18
DOI: https://doi.org/10.1007/978-3-540-30478-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23927-7
Online ISBN: 978-3-540-30478-4
eBook Packages: Springer Book Archive