Abstract
In this paper we focus on organising web content in a way that allows efficient browsing, searching and discovery. We propose a method that dynamically creates such a structure, called the Topological Tree. The tree is generated using an algorithm called the Automated Topological Tree Organiser, which uses a set of hierarchically organised self-organising growing chains. Each chain fully adapts to a specific topic, and its number of subtopics is determined using entropy-based validation and cluster tendency schemes. The Topological Tree adapts to the natural underlying structure at each level in the hierarchy. The topology in the chains also places closely related topics next to each other, and can thus be exploited to reduce the time needed for search and navigation. This method can be used to generate a web portal or directory where browsing and user comprehension are improved.
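To make the idea of a self-organising growing chain more concrete, the following is a minimal Python sketch of a generic one-dimensional chain of prototype vectors that is trained competitively and grown one node at a time. It is not the Automated Topological Tree Organiser itself: the function name `train_chain`, its parameters, the half-strength neighbour update, the error-based insertion rule and the fixed `max_nodes` stopping point are all assumptions made for illustration. In particular, the fixed node limit stands in for the entropy-based validation and cluster tendency schemes mentioned above, which are not specified in this abstract.

```python
import math
import random


def train_chain(data, n_nodes=2, max_nodes=6, epochs=20, lr=0.3, seed=0):
    """Train a 1-D chain of prototype vectors, growing it one node at a time.

    Hypothetical sketch: the parameter names and the growth rule are
    assumptions for illustration, not taken from the paper.
    """
    rng = random.Random(seed)
    # Start with a short chain initialised from randomly chosen data points.
    chain = [list(rng.choice(data)) for _ in range(n_nodes)]

    while True:
        errors = [0.0] * len(chain)
        for epoch in range(epochs):
            rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
            last = epoch == epochs - 1
            for x in rng.sample(data, len(data)):
                # Best-matching node: nearest prototype in Euclidean distance.
                dists = [math.dist(x, w) for w in chain]
                win = dists.index(min(dists))
                if last:
                    # Accumulate quantisation error per node in the final epoch.
                    errors[win] += dists[win]
                # Move the winner fully and its chain neighbours at half strength.
                for j, strength in ((win, 1.0), (win - 1, 0.5), (win + 1, 0.5)):
                    if 0 <= j < len(chain):
                        chain[j] = [w + rate * strength * (xi - w)
                                    for w, xi in zip(chain[j], x)]
        if len(chain) >= max_nodes:
            # Assumed stopping rule: a fixed size, standing in for a validity test.
            return chain
        # Grow: insert a node between the highest-error node and its
        # higher-error neighbour, so chain order (the topology) is preserved.
        worst = errors.index(max(errors))
        neighbours = [j for j in (worst - 1, worst + 1) if 0 <= j < len(chain)]
        other = max(neighbours, key=lambda j: errors[j])
        new = [(a + b) / 2 for a, b in zip(chain[worst], chain[other])]
        chain.insert(max(worst, other), new)
```

Calling `train_chain(points, max_nodes=4)` on a list of equal-length numeric tuples returns four prototype vectors in chain order. Because only chain neighbours are updated together, adjacent prototypes tend to represent similar regions of the data, which is the property the abstract describes as relating close topics together.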
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Freeman, R.T., Yin, H. (2004). Topological Tree for Web Organisation, Discovery and Exploration. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_70
DOI: https://doi.org/10.1007/978-3-540-28651-6_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22881-3
Online ISBN: 978-3-540-28651-6
eBook Packages: Springer Book Archive