Abstract
Multi-document summarization has become a key technology in natural language processing. This paper proposes a strategy for Chinese multi-document summarization based on clustering and sentence extraction. For clustering, we propose two heuristics to automatically detect the proper number of clusters: the first exploits the summary length fixed by the user; the second is a stability method, which has been applied to other unsupervised learning problems. We also discuss a global search method for selecting sentences from the clusters. To evaluate our summarization strategy, we adopt an extrinsic evaluation method based on a classification task. Experimental results on a news document set show that the new strategy can significantly enhance the performance of Chinese multi-document summarization.
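The abstract does not spell out how the summary length determines the number of clusters. As a rough illustration only, the sketch below assumes that one sentence is extracted per cluster, so the cluster count is approximately the requested summary length divided by the average sentence length. The function name `estimate_cluster_count`, the character-based length measure, and the clamping are all assumptions made for this example, not the authors' published procedure.

```python
def estimate_cluster_count(sentences, summary_length):
    """Hypothetical length-based heuristic (an assumption, not the paper's formula).

    Assumes one sentence is extracted per cluster, so the number of
    clusters is about summary_length / average sentence length.
    Lengths are counted in characters, which suits unsegmented Chinese text.
    """
    if not sentences:
        return 0
    avg_len = sum(len(s) for s in sentences) / len(sentences)
    k = round(summary_length / avg_len)
    # Clamp to a valid range: at least one cluster, at most one per sentence.
    return max(1, min(k, len(sentences)))


if __name__ == "__main__":
    # Toy input: four short news-like sentences (7, 7, 8 and 8 characters).
    sentences = ["今天天气很好。", "股市大幅上涨。", "球队赢得了比赛。", "新政策正式出台。"]
    print(estimate_cluster_count(sentences, summary_length=15))
```

Under this assumption, a longer requested summary yields more clusters, up to one cluster per sentence; the stability method mentioned in the abstract would be a separate, data-driven way to choose the same quantity.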
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, D., He, Y., Ji, D., Yang, H., Wu, Z. (2006). Chinese Multi-document Summarization Using Adaptive Clustering and Global Search Strategy. In: Yang, Q., Webb, G. (eds) PRICAI 2006: Trends in Artificial Intelligence. PRICAI 2006. Lecture Notes in Computer Science, vol 4099. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-36668-3_148
DOI: https://doi.org/10.1007/978-3-540-36668-3_148
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36667-6
Online ISBN: 978-3-540-36668-3
eBook Packages: Computer Science, Computer Science (R0)