
DRL-based transmission control for QoE guaranteed transmission efficiency optimization in tile-based panoramic video streaming

  • Regular Paper
Multimedia Systems

Abstract

In tile-based panoramic video streaming, the Field of View (FOV) is composed of multiple real-time synchronized visible video tiles. Common panoramic video transmission control methods use FOV prediction and redundant tile transmission to cope with network delay and fast viewport switching. However, these methods rely heavily on FOV prediction accuracy and do not fully consider transmission efficiency, measured as the ratio of the data used for the FOV to the total transmitted data. Moreover, existing learning-based methods directly incorporate ever-changing factors such as network bandwidth and viewport position into the learning process, which results in poor stability of the transmission control. In this paper, we propose a Deep Reinforcement Learning (DRL)-based transmission control method for tile-based panoramic video streaming whose objective is to optimize transmission efficiency while guaranteeing the Quality of Experience (QoE). Firstly, we formulate the panoramic video transmission control process as the maximization of transmission efficiency subject to keeping multiple QoE metrics within preset acceptable ranges. Secondly, we design a two-stage transmission control decision-making mechanism, consisting of an intermediate decision-making stage and a final decision-making stage, to improve the stability of the transmission process. In the intermediate decision-making stage, newly defined aggregated transmission decision variables are learned with a Rainbow Deep Q Network; this online learning process considers only QoE and transmission efficiency and avoids directly involving the ever-changing environment factors. In the final decision-making stage, the bitrate and buffer size of each video tile are determined according to the network bandwidth and viewport, under the guidance of the intermediate decisions. Finally, experiments conducted with real network bandwidth and viewport traces show that our method achieves better long-term transmission efficiency than other methods.
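The two-stage mechanism summarized above can be illustrated with a brief sketch. This is not the authors' implementation: the bitrate ladder, tile indexing, and function names below are hypothetical. The sketch only shows how an aggregated intermediate decision (a single quality level and buffer target) might be refined into per-tile bitrate and buffer choices using the bandwidth estimate and the predicted FOV, together with the transmission-efficiency ratio used as the optimization target.

from dataclasses import dataclass
from typing import Dict, List

BITRATE_LADDER_MBPS = [1.0, 2.5, 5.0, 8.0]  # assumed per-tile bitrate levels


@dataclass
class TileDecision:
    bitrate_mbps: float
    buffer_chunks: int


def final_decision(aggregate_level: int,
                   aggregate_buffer: int,
                   bandwidth_mbps: float,
                   fov_tiles: List[int],
                   all_tiles: List[int]) -> Dict[int, TileDecision]:
    """Refine an aggregated (intermediate) decision into per-tile choices.

    FOV tiles receive the aggregated bitrate level, backed off until the chosen
    bitrate fits an even share of the bandwidth estimate; non-FOV tiles get the
    lowest level so a fast viewport switch still finds decodable content.
    """
    decisions: Dict[int, TileDecision] = {}
    per_tile_budget = bandwidth_mbps / max(len(fov_tiles), 1)
    for tile in all_tiles:
        if tile in fov_tiles:
            level = min(aggregate_level, len(BITRATE_LADDER_MBPS) - 1)
            while level > 0 and BITRATE_LADDER_MBPS[level] > per_tile_budget:
                level -= 1
            decisions[tile] = TileDecision(BITRATE_LADDER_MBPS[level], aggregate_buffer)
        else:
            decisions[tile] = TileDecision(BITRATE_LADDER_MBPS[0], 1)
    return decisions


def transmission_efficiency(fov_bytes: float, total_bytes: float) -> float:
    """Transmission efficiency: data consumed for the FOV over all data sent."""
    return fov_bytes / total_bytes if total_bytes > 0 else 0.0


# Example: refine an intermediate decision (level 2, 3-chunk buffer) for 4 FOV
# tiles out of 12 under a 20 Mbps bandwidth estimate.
plan = final_decision(2, 3, 20.0, fov_tiles=[0, 1, 4, 5], all_tiles=list(range(12)))

In this sketch, only the aggregated pair (aggregate_level, aggregate_buffer) would be produced by the learned Rainbow DQN policy; the per-tile refinement is a deterministic rule, which is what keeps the ever-changing bandwidth and viewport out of the learning process.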



Notes

  1. https://tensorforce.readthedocs.io/en/latest/.


Acknowledgements

This work is supported by the Funds for Creative Research Groups of China under Grant No. 61921003.

Author information


Corresponding author

Correspondence to Haitao Zhang.

Additional information

Communicated by Q. Shen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, J., Zhang, H. & Ma, H. DRL-based transmission control for QoE guaranteed transmission efficiency optimization in tile-based panoramic video streaming. Multimedia Systems 29, 2761–2777 (2023). https://doi.org/10.1007/s00530-023-01129-3

