Abstract
Most previous work on opinion summarization focuses on summarizing the sentiment polarity distribution toward different aspects of an entity (e.g., the battery life and screen of a mobile phone). However, users' needs may go beyond this kind of opinion summarization. Besides such coarse-grained summarization of aspects, one may prefer to read detailed but concise text drawn from the opinion data. In this paper, we propose a new framework for opinion summarization. Our goal is to help users obtain helpful opinion suggestions from reviews by reading only a short summary of a few informative sentences, where the quality of the summary is evaluated in terms of both aspect coverage and viewpoint preservation. More specifically, we formulate the informative sentence selection problem in opinion summarization as a community leader detection problem, where a community is a cluster of sentences about the same aspect of an entity and the leaders are the most informative sentences for that aspect. We develop two effective algorithms to identify communities and leaders. Reviews of six products from Amazon.com are used to verify the effectiveness of our method for opinion summarization.
Notes
- 1.
- 2. Note that \(|\mathcal {N}_{k}(s)|\) can be larger than \(k\) because ties can occur (i.e., several neighbors may have the same similarity to \(s\)).
- 3. Available at http://sites.google.com/site/linhongi2r/data-and-code.
- 4. A longer summary is more likely to provide better information but is less concise.
- 5. ROUGE-N is a popular metric, from the ROUGE toolkit, that measures the quality of a summary by comparing it to reference summaries using \(n\)-gram co-occurrence.
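Notes 2 and 5 can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names `knn_with_ties`, `ngrams`, and `rouge_n_recall` are hypothetical, similarities are assumed to be given as a plain dictionary and compared by exact equality, tokenization is assumed to be lowercase whitespace splitting, and only the recall form of ROUGE-N is shown.

```python
from collections import Counter


def knn_with_ties(sims, k):
    """Return the k most similar neighbors of a sentence s, keeping all ties.

    `sims` maps a neighbor id to its similarity to s (hypothetical input
    format). Every neighbor whose similarity equals that of the k-th
    ranked neighbor is kept, so the result can have more than k entries.
    """
    if k <= 0 or not sims:
        return []
    ranked = sorted(sims.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= k:
        return [name for name, _ in ranked]
    cutoff = ranked[k - 1][1]  # similarity of the k-th neighbor
    return [name for name, score in ranked if score >= cutoff]


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n):
    """ROUGE-N recall: matched reference n-grams / total reference n-grams.

    Matches are clipped, so an n-gram cannot be matched more often than
    it occurs in the candidate summary.
    """
    cand = ngrams(candidate.lower().split(), n)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
    return matched / total if total else 0.0


if __name__ == "__main__":
    # Two neighbors tie at the cutoff, so three are returned for k = 2.
    print(knn_with_ties({"a": 0.9, "b": 0.5, "c": 0.5, "d": 0.1}, 2))
    print(rouge_n_recall("the battery life is great",
                         ["the battery life is short"], 2))
```

Keeping every neighbor tied at the cutoff makes the neighbor set independent of the order in which sentences are stored, at the cost of a set that may exceed \(k\).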
Acknowledgments
This work is partially supported by DARPA under grant number W911NF-12-1-0034.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Zhu, L., Gao, S., Pan, S.J., Li, H., Deng, D., Shahabi, C. (2015). The Pareto Principle Is Everywhere: Finding Informative Sentences for Opinion Summarization Through Leader Detection. In: Ulusoy, Ö., Tansel, A., Arkun, E. (eds) Recommendation and Search in Social Networks. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-14379-8_9
DOI: https://doi.org/10.1007/978-3-319-14379-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14378-1
Online ISBN: 978-3-319-14379-8
eBook Packages: Computer Science, Computer Science (R0)