Abstract
Accurately predicting the trajectories of dynamic agents is crucial for the safe navigation of autonomous robots. However, achieving precise predictions based solely on past and current observations is challenging due to the inherent uncertainty in each agent's intentions, which greatly influences its future trajectory. Furthermore, the lack of precise information about agents' future poses leads to ambiguity about which agents should be focused on when predicting the target agent's future. To address this problem, we propose a teacher-student learning approach. The teacher model utilizes the actual future poses of other agents to determine which agents should be focused on for the final prediction. This attentional knowledge guides the student model in deciding which agents to focus on and how much attention to allocate when predicting future trajectories. Additionally, we introduce a Lane-guided Attention Module (LAM) that considers interactions with local lanes near the predicted trajectories to enhance prediction performance. This module is integrated into the student model to refine agent features, thereby facilitating a more accurate emulation of the teacher model. We demonstrate the effectiveness of our proposed model on the large-scale Argoverse motion forecasting dataset, improving overall prediction performance. Our model can be used in a plug-and-play manner, showing consistent performance gains. Moreover, it generates more human-intuitive trajectories, e.g., avoiding collisions with other agents, keeping to its lane, and considering relations with other agents.
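To make the attention-transfer idea above concrete, the following is a minimal, hypothetical sketch (not the authors' implementation): a teacher scores neighbor agents with access to ground-truth future poses, and a student is trained to mimic the teacher's agent-attention distribution from past observations alone. All module names, tensor shapes, and the choice of KL divergence as the transfer loss are illustrative assumptions.

# Illustrative sketch of attention-based knowledge transfer between a teacher
# (which sees future poses) and a student (which sees only past observations).
# Names and shapes are hypothetical; this is not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AgentAttention(nn.Module):
    """Scores neighbor agents for a target agent via scaled dot-product attention."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)

    def forward(self, target_feat, neighbor_feats):
        # target_feat: [B, D], neighbor_feats: [B, N, D]
        q = self.q(target_feat).unsqueeze(1)              # [B, 1, D]
        k = self.k(neighbor_feats)                        # [B, N, D]
        scores = (q * k).sum(-1) / k.shape[-1] ** 0.5     # [B, N]
        return F.softmax(scores, dim=-1)                  # attention over neighbor agents

def attention_transfer_loss(student_attn, teacher_attn, eps=1e-8):
    # KL divergence pushes the student's agent-attention distribution toward
    # the teacher's; the teacher is treated as a fixed target, hence detach().
    teacher_attn = teacher_attn.detach()
    return F.kl_div((student_attn + eps).log(), teacher_attn, reduction="batchmean")

if __name__ == "__main__":
    # Random features stand in for encoded agent states.
    B, N, D = 4, 6, 64
    teacher_attn_mod = AgentAttention(D)   # teacher encodes past + ground-truth future
    student_attn_mod = AgentAttention(D)   # student encodes past observations only
    teacher_neighbors = torch.randn(B, N, D)
    student_neighbors = torch.randn(B, N, D)
    target = torch.randn(B, D)

    t_attn = teacher_attn_mod(target, teacher_neighbors)
    s_attn = student_attn_mod(target, student_neighbors)
    print(attention_transfer_loss(s_attn, t_attn).item())

In the setting described above, the teacher would first be trained with access to future observations and then held fixed while the student learns to match its attention alongside the usual trajectory regression objective; the KL term here merely stands in for whichever matching loss the method actually uses.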
Acknowledgement
This work was supported by 42dot. It was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1A6A1A13044830, 15%), and by Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (RS-2022-II220043, Adaptive Personality for Intelligent Agents, 15%; IITP-2024-RS-2024-00397085, Leading Generative AI Human Resources Development, 15%).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Moon, S., Yeon, K., Kim, H., Jeong, SG., Kim, J. (2025). Who Should Have Been Focused: Transferring Attention-Based Knowledge from Future Observations for Trajectory Prediction. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15317. Springer, Cham. https://doi.org/10.1007/978-3-031-78447-7_23
DOI: https://doi.org/10.1007/978-3-031-78447-7_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78446-0
Online ISBN: 978-3-031-78447-7