


Link to original content: https://doi.org/10.1007/978-3-031-19842-7_29
Learning Efficient Multi-agent Cooperative Visual Exploration | SpringerLink

Learning Efficient Multi-agent Cooperative Visual Exploration

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13699)


Abstract

We tackle the problem of cooperative visual exploration where multiple agents need to jointly explore unseen regions as fast as possible based on visual signals. Classical planning-based methods often suffer from expensive computation overhead at each step and a limited expressiveness of complex cooperation strategy. By contrast, reinforcement learning (RL) has recently become a popular paradigm for tackling this challenge due to its modeling capability of arbitrarily complex strategies and minimal inference overhead. In this paper, we propose a novel RL-based multi-agent planning module, Multi-agent Spatial Planner (MSP). MSP leverages a transformer-based architecture, Spatial-TeamFormer, which effectively captures spatial relations and intra-agent interactions via hierarchical spatial self-attentions. In addition, we also implement a few multi-agent enhancements to process local information from each agent for an aligned spatial representation and more precise planning. Finally, we perform policy distillation to extract a meta policy to significantly improve the generalization capability of final policy. We call this overall solution, Multi-Agent Active Neural SLAM (MAANS). MAANS substantially outperforms classical planning-based baselines for the first time in a photo-realistic 3D simulator, Habitat. Code and videos can be found at https://sites.google.com/view/maans.
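The abstract describes Spatial-TeamFormer as hierarchical spatial self-attention: attention first over the spatial cells of each agent's map representation, then across agents. The paper itself gives no pseudocode here, so the following is only a minimal NumPy sketch of that two-level idea, assuming simplified single-head attention and hypothetical tensor shapes; it is an illustration of the structure, not the authors' implementation.

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product self-attention over the second-to-last axis.
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

def spatial_teamformer_layer(x):
    """x: (n_agents, n_cells, d) per-agent spatial feature maps.

    Level 1: each agent attends over its own map cells (spatial relations).
    Level 2: at each map cell, agents attend to one another
    (intra-agent interactions), giving the hierarchical structure
    the abstract describes.
    """
    x = attention(x, x, x)         # (A, C, d): spatial self-attention
    xt = x.swapaxes(0, 1)          # (C, A, d): group tokens by map cell
    xt = attention(xt, xt, xt)     # team self-attention across agents
    return xt.swapaxes(0, 1)       # back to (A, C, d)

# Hypothetical sizes: 3 agents, 16 map cells, 8 feature channels.
feats = np.random.default_rng(0).normal(size=(3, 16, 8))
out = spatial_teamformer_layer(feats)
print(out.shape)  # (3, 16, 8)
```

The key design point, as the abstract frames it, is factorizing attention into a spatial pass and a team pass rather than flattening all agents' cells into one long token sequence, which keeps the attention maps aligned with the shared spatial representation.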

C. Yu, X. Yang, and J. Gao—Equal contribution.

This research is supported by NSFC (U20A20334, U19B2019 and M-0248), Tsinghua-Meituan Joint Institute for Digital Life, Tsinghua EE Independent Research Project, Beijing National Research Center for Information Science and Technology (BNRist), and Beijing Innovation Center for Future Chips.




Author information


Correspondence to Yu Wang or Yi Wu.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 868 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yu, C., Yang, X., Gao, J., Yang, H., Wang, Y., Wu, Y. (2022). Learning Efficient Multi-agent Cooperative Visual Exploration. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13699. Springer, Cham. https://doi.org/10.1007/978-3-031-19842-7_29

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-19842-7_29

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19841-0

  • Online ISBN: 978-3-031-19842-7

  • eBook Packages: Computer Science, Computer Science (R0)
