Link to original content: https://doi.org/10.1007/978-3-319-05491-9_2

Learning Social Relations from Videos: Features, Models, and Analytics

  • Chapter
Human-Centered Social Media Analytics

Abstract

Despite the progress made during recent years in video understanding, extracting relations among actors in a video is still a largely unexplored area. In this chapter, we review one of the first studies towards learning such relations from videos using visual and auditory cues. The main contribution can be stated as the association of low-level video features to social relations by machine learning methodology. Specifically, support vector regression is leveraged to estimate local grouping cues from low-level visual and auditory features. These locally defined grouping cues are then synthesized to derive the affinity between actors. Finally, the social network defined by the resulting affinity is analyzed to find communities of actors and identify the leader of each community. Furthermore, as an extension to the basic framework, we discuss the relationship between visual concepts and social relations. We demonstrate the performance of these approaches on a set of videos.
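The final analysis step described in the abstract (finding communities in the affinity network and picking a leader for each) can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not give the algorithms, so the sketch assumes a two-way split by the sign of the leading eigenvector of a modularity matrix, and a leader chosen by eigenvector centrality inside each community. The affinity values and all function names are hypothetical, and the regression stage that would produce the affinities is omitted because it depends on features the abstract does not specify.

```python
# Illustrative sketch only; the affinity matrix and helper names are hypothetical.

def leading_eigenvector(M, iters=10000, tol=1e-12):
    """Eigenvector of the algebraically largest eigenvalue of a symmetric matrix.

    Power iteration on M + shift*I, where the Gershgorin-style shift makes
    every eigenvalue nonnegative so the iteration cannot lock onto a large
    negative eigenvalue.
    """
    n = len(M)
    shift = max(sum(abs(x) for x in row) for row in M)
    v = [1.0 + 0.01 * i for i in range(n)]  # deterministic, non-uniform start
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # e.g. a single isolated node
            break
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    return v

def two_communities(A):
    """Split actors into two groups by the sign of the leading eigenvector of
    the modularity matrix B = A - k k^T / (2m). Assumes a symmetric,
    nonnegative affinity matrix with at least one positive entry."""
    n = len(A)
    k = [sum(row) for row in A]  # weighted degree of each actor
    two_m = sum(k)               # total affinity, counted in both directions
    B = [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]
    v = leading_eigenvector(B)
    if v[0] < 0:                 # fix the arbitrary sign so actor 0 is in the first group
        v = [-x for x in v]
    first = [i for i in range(n) if v[i] >= 0]
    second = [i for i in range(n) if v[i] < 0]
    return [c for c in (first, second) if c]

def leader(A, members):
    """Actor with the highest eigenvector centrality within one community."""
    sub = [[A[i][j] for j in members] for i in members]
    c = leading_eigenvector(sub)
    return members[max(range(len(members)), key=lambda t: c[t])]

# Hypothetical affinities among six actors: two tightly knit groups, weak cross links.
A = [
    [0.0, 0.9, 0.8, 0.1, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.0, 0.1, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.0, 0.6, 0.9],
    [0.0, 0.1, 0.0, 0.6, 0.0, 0.7],
    [0.0, 0.0, 0.1, 0.9, 0.7, 0.0],
]
communities = two_communities(A)
leaders = [leader(A, c) for c in communities]
print(communities, leaders)
```

Restricting centrality to each community's own sub-network is a design choice made here for simplicity; computing it on the whole network and then taking the top actor per community would be an equally plausible reading of the abstract.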

Notes

  1. The movies in our dataset are (1) G.I. Joe: The Rise of Cobra (2009); (2) Harry Potter and the Half-Blood Prince (2009); (3) Public Enemies (2009); (4) Troy (2004); (5) Braveheart (1995); (6) Year One (2009); (7) Coraline (2009); (8) True Lies (1994); (9) The Chronicles of Narnia: The Lion, the Witch and the Wardrobe (2005); and (10) The Lord of the Rings: The Return of the King (2003).

  2. In movie (10), Gollum has a good personality except when he is close to the ring. The ring turns the behavior of every actor except Frodo from good to bad.

  3. Ground truth leaders are: (1) Duke and McCullen; (2) Harry and Snape; (3) Dillinger and Purvis; (4) Achilles and Hector; (5) Wallace and Longshanks; (6) Zed and King; (7) Coraline and Other Mother; (8) Harry and Salim; (9) Aslan and Witch; and (10) Frodo and Witch-king.

Author information

Corresponding author

Correspondence to Lei Ding.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Ding, L., Yilmaz, A. (2014). Learning Social Relations from Videos: Features, Models, and Analytics. In: Fu, Y. (eds) Human-Centered Social Media Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-05491-9_2

  • DOI: https://doi.org/10.1007/978-3-319-05491-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05490-2

  • Online ISBN: 978-3-319-05491-9

  • eBook Packages: Computer Science, Computer Science (R0)
