Abstract
In this paper, we discuss grading, a meta-classification technique that tries to identify and correct incorrect predictions at the base level. While stacking uses the predictions of the base classifiers as meta-level attributes, we use “graded” predictions (i.e., predictions that have been marked as correct or incorrect) as meta-level classes. For each base classifier, one meta-classifier is learned whose task is to predict when the base classifier will err. Hence, just as stacking may be viewed as a generalization of voting, grading may be viewed as a generalization of selection by cross-validation, and it therefore fills a conceptual gap in the space of meta-classification schemes. Our experimental evaluation shows that this technique yields a performance gain quite comparable to that achieved by stacking, while both grading and stacking outperform their simpler counterparts, voting and selection by cross-validation.
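To make the procedure concrete, the following sketch implements the basic grading loop in Python: the cross-validated predictions of each base classifier are graded as correct or incorrect, one meta-classifier per base classifier is trained to predict these grades, and at classification time the base predictions are combined by a vote weighted with each grader's confidence that the corresponding prediction is correct. The use of scikit-learn (which postdates the paper), the particular base learners, and the confidence-weighted vote are illustrative assumptions, not the authors' exact experimental configuration.

```python
# A minimal sketch of grading, assuming scikit-learn; the learner
# choices and the combination rule are illustrative, not the paper's
# exact setup.
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


class Grading:
    def __init__(self, base_learners, make_grader, cv=10):
        self.base_learners = base_learners
        self.make_grader = make_grader  # factory: one meta-classifier per base learner
        self.cv = cv

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.graders_ = []
        for base in self.base_learners:
            # Cross-validated predictions yield the "grades": 1 if the base
            # classifier predicted this training instance correctly, else 0.
            preds = cross_val_predict(base, X, y, cv=self.cv)
            grades = (preds == y).astype(int)
            grader = self.make_grader()
            grader.fit(X, grades)  # meta-level task: predict base-level errors
            base.fit(X, y)         # retrain the base classifier on all data
            self.graders_.append(grader)
        return self

    def predict(self, X):
        # Each base classifier votes for its predicted class, weighted by the
        # grader's estimated probability that this prediction is correct.
        votes = np.zeros((X.shape[0], len(self.classes_)))
        for base, grader in zip(self.base_learners, self.graders_):
            preds = base.predict(X)
            if grader.classes_[-1] == 1:
                p_correct = grader.predict_proba(X)[:, -1]
            else:  # degenerate grader that never saw a correct prediction
                p_correct = np.zeros(X.shape[0])
            for k, c in enumerate(self.classes_):
                votes[:, k] += p_correct * (preds == c)
        return self.classes_[votes.argmax(axis=1)]
```

A hypothetical usage with three heterogeneous base learners and a decision-tree grader:

```python
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = Grading(
    base_learners=[DecisionTreeClassifier(random_state=0),
                   GaussianNB(),
                   KNeighborsClassifier()],
    make_grader=lambda: DecisionTreeClassifier(min_samples_leaf=5, random_state=0),
).fit(X, y)
print(model.predict(X[:5]))
```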
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Seewald, A.K., Fürnkranz, J. (2001). An Evaluation of Grading Classifiers. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42581-6
Online ISBN: 978-3-540-44816-7