Abstract
Classification into hierarchical structures is a problem of increasing importance, given, for example, the growing use of ontologies and keyword hierarchies in many web-based information systems. It is therefore not surprising that it is a field of ongoing research. Here, we propose an approach that utilizes hierarchy information in the classification process. In contrast to other methods, the hierarchy information is used independently of the classifier rather than being integrated into it directly. This enables the use of arbitrary standard classification methods. Furthermore, we discuss how hierarchical classification in general, and our setting in particular, can be evaluated appropriately. We present our algorithm and evaluate it on two datasets of web pages using Naïve Bayes and SVM as baseline classifiers.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bade, K., Nürnberger, A. (2007). Rearranging Classified Items in Hierarchies Using Categorization Uncertainty. In: Decker, R., Lenz, H.J. (eds) Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70981-7_15
DOI: https://doi.org/10.1007/978-3-540-70981-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70980-0
Online ISBN: 978-3-540-70981-7
eBook Packages: Mathematics and Statistics (R0)