Who Should Have Been Focused: Transferring Attention-Based Knowledge from Future Observations for Trajectory Prediction

  • Conference paper
  • In: Pattern Recognition (ICPR 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15317)

Abstract

Accurately predicting the trajectories of dynamic agents is crucial for the safe navigation of autonomous robots. However, achieving precise predictions based solely on past and current observations is challenging due to the inherent uncertainty in each agent’s intentions, which greatly influences its future trajectory. Furthermore, the lack of precise information about agents’ future poses leads to ambiguity regarding which agents should be focused on when predicting the target agent’s future. To solve this problem, we propose a teacher-student learning approach in which the teacher model utilizes the actual future poses of other agents to determine which agents should be focused on for the final prediction. This attentional knowledge guides the student model in deciding which agents to focus on, and how much attention to allocate to each, when predicting future trajectories. Additionally, we introduce a Lane-guided Attention Module (LAM) that considers interactions with local lanes near the predicted trajectories to enhance prediction performance. This module is integrated into the student model to refine agent features, thereby facilitating a more accurate emulation of the teacher model. We demonstrate the effectiveness of our proposed model on the large-scale Argoverse motion forecasting dataset, improving overall prediction performance. Our model can be used in a plug-and-play manner, showing consistent performance gains. Additionally, it generates more human-intuitive trajectories, e.g., avoiding collisions with other agents, keeping to its lane, or considering relations with other agents.
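The core idea of the abstract, a teacher that attends over neighbouring agents using their actual future poses and a student that is trained to reproduce that attention from past observations only, can be sketched as follows. This is a minimal illustration, not the paper's architecture: the feature shapes, the scaled dot-product attention, and the KL-based distillation loss are all assumptions chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over agent scores.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(target_feat, agent_feats):
    # Scaled dot-product attention of the target agent over its neighbours:
    # higher weight = "this agent should be focused on".
    d = target_feat.shape[-1]
    scores = agent_feats @ target_feat / np.sqrt(d)
    return softmax(scores)

def attention_distillation_loss(teacher_attn, student_attn, eps=1e-9):
    # KL(teacher || student): penalises the student for attending to
    # different agents than the future-aware teacher.
    return float(np.sum(teacher_attn * (np.log(teacher_attn + eps)
                                        - np.log(student_attn + eps))))

# Toy example: 4 neighbouring agents with 8-dim features.
rng = np.random.default_rng(0)
target = rng.normal(size=8)
past_feats = past = rng.normal(size=(4, 8))        # student sees past/current only
future_feats = past + rng.normal(scale=0.5, size=(4, 8))  # teacher also sees futures

t_attn = agent_attention(target, future_feats)     # teacher: "who should have been focused"
s_attn = agent_attention(target, past_feats)       # student: attention from past alone
loss = attention_distillation_loss(t_attn, s_attn)
```

In training, this distillation term would be added to the usual trajectory regression loss so that gradients pull the student's attention weights toward the teacher's, even though the student never observes the future poses directly.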



Acknowledgement

This work was supported by 42dot. It was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2021R1A6A1A13044830, 15%), and by Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (RS-2022-II220043, Adaptive Personality for Intelligent Agents, 15%; IITP-2024-RS-2024-00397085, Leading Generative AI Human Resources Development, 15%).

Author information

Corresponding author

Correspondence to Jinkyu Kim.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Moon, S., Yeon, K., Kim, H., Jeong, SG., Kim, J. (2025). Who Should Have Been Focused: Transferring Attention-Based Knowledge from Future Observations for Trajectory Prediction. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15317. Springer, Cham. https://doi.org/10.1007/978-3-031-78447-7_23

  • DOI: https://doi.org/10.1007/978-3-031-78447-7_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78446-0

  • Online ISBN: 978-3-031-78447-7

  • eBook Packages: Computer Science, Computer Science (R0)
