Abstract
Knowledge discovery in databases aims to extract useful and understandable knowledge from large databases. The knowledge discovery process consists of two steps: the data mining step and the evaluation step. This paper studies part of the second step, namely evaluating and ranking the interestingness of summaries generated from databases, using diversity measures. Sixteen previously analyzed diversity measures of interestingness are used along with three not previously considered, drawn from different well-known fields. The latter three measures are evaluated theoretically against five principles that a measure must satisfy to qualify as acceptable for ranking summaries. A theoretical correlation study of the eight measures that satisfy all five principles is presented, based on mathematical proofs. An empirical evaluation is conducted using three real databases, and a classification of the eight measures is then deduced. The resulting classification is used to reduce the number of measures to only two, which perform best across all criteria and produce dissimilar results. This helps users interpret the most important discovered knowledge in their decision-making process.
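As an illustration of ranking summaries by a diversity measure, the sketch below scores each summary by the distribution of its tuple counts. The abstract does not name the measures studied, so Shannon entropy and Simpson's concentration index are used here only as familiar stand-ins. The summary names, the counts, and the helper `rank_summaries` are hypothetical, and which end of a ranking counts as "interesting" is left to the caller rather than taken from the paper.

```python
import math


def proportions(counts):
    """Convert tuple counts of a summary into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def shannon(counts):
    """Shannon entropy in bits; larger values mean a more even distribution."""
    return -sum(p * math.log2(p) for p in proportions(counts) if p > 0)


def simpson(counts):
    """Simpson's concentration index; larger values mean a more skewed distribution."""
    return sum(p * p for p in proportions(counts))


def rank_summaries(summaries, measure, descending=True):
    """Order summary names by the value the measure assigns to their counts.

    The ranking direction is an assumption left to the caller.
    """
    return sorted(summaries, key=lambda name: measure(summaries[name]),
                  reverse=descending)


# Hypothetical summaries: each maps to the counts of its generalized tuples.
summaries = {
    "A": [50, 50],
    "B": [90, 10],
    "C": [25, 25, 25, 25],
}

by_simpson = rank_summaries(summaries, simpson)
by_shannon = rank_summaries(summaries, shannon)
```

In this toy case the two measures produce exactly reversed orderings, which shows why comparing how similarly measures rank the same summaries is useful when deciding which ones to keep.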
Cite this article
Zbidi, N., Faiz, S. & Limam, M. On Mining Summaries by Objective Measures of Interestingness. Mach Learn 62, 175–198 (2006). https://doi.org/10.1007/s10994-005-5066-8
DOI: https://doi.org/10.1007/s10994-005-5066-8