Abstract
One of the most important steps in any knowledge discovery task is the interpretation and evaluation of discovered patterns. To address this problem, various techniques, such as the chi-square test for independence, have been suggested to reduce the number of patterns presented to the user and to focus attention on those that are truly statistically significant. However, when mining a large database, the number of patterns discovered can remain large even after adjusting significance thresholds to eliminate spurious patterns. What is needed, then, is an effective measure to further assist in the interpretation and evaluation step that ranks the interestingness of the remaining patterns prior to presenting them to the user. In this paper, we describe a two-step process for ranking the interestingness of discovered patterns that utilizes the chi-square test for independence in the first step and objective measures of interestingness in the second step. We show how this two-step process can be applied to ranking characterized/generalized association rules and data cubes.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
T. Brijs, G. Swinnen, K. Vanhoof, and G. Wets. Using association rules for product assortment decisions: A case study. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’99) pages 254–260 San Diego,California August 1999.
C.L. Carter and H.J. Hamilton. Efficient attribute-oriented algorithms for knowledge discovery from large databases. IEEE Transactions on Knowledge and Data Engineering, 10(2):193–208, March/April 1998.
L.A. Goodman and W.H. Kruskal. Measures of Association for Cross Classifications. Springer-Verlag, 1979.
J. Han, W. Ging, and Y. Yin. Mining segment-wise periodic patterns in time-related databases. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD’98), pages 214–218, New York, New York, August1998.
R.J. Hilderman, C.L. Carter, H.J. Hamilton, and N. Cercone. Mining association rules from market basket data using share measures and characterized itemsets. International Journal on Artificial Intelligence Tools, 7(2):189–220, June 1998.
R.J. Hilderman, C.L. Carter, H.J. Hamilton, and N. Cercone. Mining market basket data using share measures and characterized itemsets. In X. Wu, R. Kotagiri, and K. Korb, editors, Proceedings of the Second Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’98), pages 159–173, Melbourne, Australia, April 1998.
R.J. Hilderman and H.J. Hamilton. Heuristic measures of interestingness. In J. Zytkow and J. Rauch, editors, Proceedings of the Third European Conference on the Principles of Data Mining and Knowledge Discovery (PKDD’99), pages 232–241, Prague, Czech Republic, September 1999.
R.J. Hilderman and H.J. Hamilton. Heuristics for ranking the interestingness of discovered knowledge. In N. Zhong and L. Zhou, editors, Proceedings of the Third Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’99), pages 204–209, Beijing, China, April 1999.
R.J. Hilderman and H.J. Hamilton. Principles for mining summaries: Theorems and proofs. Technical Report CS 00-01, Department of Computer Science, University of Regina, February2000. Online at http://www.cs.uregina.ca/research/Techreport/0001.ps.
R.J. Hilderman, H.J. Hamilton, and N. Cercone. Data mining in large databases using domain generalization graphs. Journal of Intelligent Information Systems, 13(3):195–234, November 1999.
R.J. Hilderman, H.J. Hamilton, R.J. Kowalchuk, and N. Cercone. Parallel knowledge discovery using domain generalization graphs. In J. Komorowski and J. Zytkow, editors, Proceedings of the First European Conference on the Principles of Data Mining and Knowledge Discovery (PKDD’97), pages 25–35, Trondheim, Norway, June 1997.
B. Liu, W Hsu, and Y. Ma. Pruning and summarizing the discovered associations. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’99), pages 125–134, San Diego, California, August 1999.
R. Srikant and R. Agrawal. Mining generalized association rules. In Proceedings of the 21th International Conference on Very Large Databases (VLDB’95), pages 407–419, Zurich, Switzerland, September 1995.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hilderman, R.J., Hamilton, H.J. (2000). Applying Objective Interestingness Measures in Data Mining Systems. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science(), vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_47
Download citation
DOI: https://doi.org/10.1007/3-540-45372-5_47
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7
eBook Packages: Springer Book Archive