Abstract
In tile-based panoramic video streaming, the Field of View (FOV) is composed of multiple visible video tiles that are synchronized in real time. Common panoramic video transmission control methods use FOV prediction and redundant tile transmission to cope with network delay and fast viewport switching. However, these methods rely heavily on FOV prediction accuracy and do not fully consider transmission efficiency, measured as the ratio of the data used for the FOV to the total transmitted data. Moreover, existing learning-based methods feed ever-changing factors such as network bandwidth and viewport position directly into the learning process, which results in poor stability of the transmission control. In this paper, we propose a Deep Reinforcement Learning (DRL)-based transmission control method for tile-based panoramic video streaming whose objective is to optimize transmission efficiency while guaranteeing Quality of Experience (QoE). First, we formulate panoramic video transmission control as the maximization of transmission efficiency subject to multiple QoE metrics being kept within preset acceptable ranges. Second, we design a two-stage transmission control decision-making mechanism, consisting of an intermediate decision-making stage and a final decision-making stage, to improve the stability of the transmission process. In the intermediate stage, newly defined aggregated transmission decision variables are learned using the Rainbow Deep Q Network. This online learning process considers only the QoE and the transmission efficiency and avoids directly involving the ever-changing environmental factors. In the final stage, the bitrate and buffer size of each video tile are determined from the network bandwidth and viewport under the guidance of the intermediate decisions. Finally, experiments conducted with real network bandwidth and viewport traces show that our method achieves better long-term transmission efficiency than other methods.
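To make the efficiency metric and the QoE constraint concrete, the following is a minimal sketch, not the authors' implementation. The record type `TileTransfer`, the function names, the metric names `rebuffer_s` and `fov_quality`, and all numeric values are hypothetical and chosen only for illustration; the abstract does not specify which QoE metrics or ranges are used.

```python
from dataclasses import dataclass


@dataclass
class TileTransfer:
    """One transmitted tile segment (hypothetical record, not from the paper)."""
    size_bytes: int   # bytes sent over the network for this tile segment
    displayed: bool   # True if the tile ended up inside the viewer's FOV


def transmission_efficiency(transfers):
    """Ratio of bytes used for the FOV to all transmitted bytes."""
    total = sum(t.size_bytes for t in transfers)
    if total == 0:
        return 0.0  # nothing was sent; treat efficiency as zero by convention
    used = sum(t.size_bytes for t in transfers if t.displayed)
    return used / total


def qoe_within_ranges(metrics, ranges):
    """True if every QoE metric lies inside its preset acceptable range."""
    return all(lo <= metrics[name] <= hi for name, (lo, hi) in ranges.items())


# Illustrative data: two viewed tiles and one redundant tile never displayed.
transfers = [
    TileTransfer(400, True),
    TileTransfer(400, True),
    TileTransfer(200, False),
]
efficiency = transmission_efficiency(transfers)  # 800 of 1000 bytes were viewed

# Assumed metric names and ranges, used only to show the constraint check.
ok = qoe_within_ranges(
    {"rebuffer_s": 0.2, "fov_quality": 0.9},
    {"rebuffer_s": (0.0, 0.5), "fov_quality": (0.8, 1.0)},
)
```

Under this reading, a controller would prefer decisions that raise `efficiency` only among those for which the QoE check still passes, which is the constrained maximization the abstract describes.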
Acknowledgements
This work is supported by the Funds for Creative Research Groups of China under Grant No. 61921003.
Additional information
Communicated by Q. Shen.
About this article
Cite this article
Li, J., Zhang, H. & Ma, H. DRL-based transmission control for QoE guaranteed transmission efficiency optimization in tile-based panoramic video streaming. Multimedia Systems 29, 2761–2777 (2023). https://doi.org/10.1007/s00530-023-01129-3