To improve the anti-jamming performance of frequency hopping (FH) systems in complex electromagnetic environments, a Deep Q-Network algorithm with prioritized experience replay (PER) based on Pareto samples (PPER-DQN) is proposed, which makes intelligent decisions for the bivariate FH pattern. The system model, state-action space, and reward function are designed based on the main parameters of the FH pattern. The DQN is used to improve the flexibility of the FH pattern. Based on the definition of Pareto dominance, a PER scheme based on the TD-error and the immediate reward is proposed. To ensure the diversity of the training set, it is formed from the Pareto sample set and several random samples. When selecting Pareto samples, a confidence coefficient is introduced to modify their priority. This guarantees the learning value of the training set and improves the learning efficiency of the DQN. The simulation results show that the efficiency, convergence speed, and stability of the algorithm are effectively improved, and that the generated bivariate FH pattern performs better than the conventional FH pattern.<\/p>","DOI":"10.4018\/ijitwe.297970","type":"journal-article","created":{"date-parts":[[2022,2,23]],"date-time":"2022-02-23T19:18:25Z","timestamp":1645643905000},"page":"1-23","source":"Crossref","is-referenced-by-count":3,"title":["Intelligent Anti-Jamming Decision Algorithm of Bivariate Frequency Hopping Pattern Based on DQN With PER and Pareto"],"prefix":"10.4018","volume":"17","author":[{"given":"Jiasheng","family":"Zhu","sequence":"first","affiliation":[{"name":"Hangzhou Dianzi University, China"}]},{"given":"Zhijin","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Communication Engineering, Hangzhou Dianzi University, China"}]},{"given":"Shilian","family":"Zheng","sequence":"additional","affiliation":[{"name":"Science and Technology on Communication Information Security Control Laboratory, 
China"}]}],"member":"2432","reference":[{"issue":"2","key":"IJITWE.297970-0","first-page":"262","article-title":"Active sampling for deep Q-Learning based on TD-error adaptive correction.","volume":"56","author":"C.Bai","year":"2019","journal-title":"Journal of Computer Research and Development"},{"key":"IJITWE.297970-1","doi-asserted-by":"crossref","unstructured":"Cao, X., Wan, H., Lin, Y., & Han, S. (2019). High-value prioritized experience replay for off-policy reinforcement learning. 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), 1510-1514.","DOI":"10.1109\/ICTAI.2019.00215"},{"issue":"7","key":"IJITWE.297970-2","first-page":"107","article-title":"Research on anti-follower jamming performance of variable rate frequency hopping communications.","volume":"41","author":"G.Chen","year":"2016","journal-title":"Fire Control & Command Control"},{"key":"IJITWE.297970-3","article-title":"A power allocation algorithm based on cooperative Q-learning for multi-agent D2D communication networks.","volume":"47","author":"Z.Dou","year":"2021","journal-title":"Physical Communication"},{"key":"IJITWE.297970-4","article-title":"Reinforcement and deep reinforcement learning for wireless Internet of Things: A survey.","author":"M. 
S.Frikha","year":"2021","journal-title":"Computer Communications"},{"issue":"5","key":"IJITWE.297970-5","doi-asserted-by":"crossref","first-page":"5331","DOI":"10.1109\/TVT.2020.2982672","article-title":"Spatial anti-jamming scheme for internet of satellites based on the deep reinforcement learning and Stackelberg game.","volume":"69","author":"C.Han","year":"2020","journal-title":"IEEE Transactions on Vehicular Technology"},{"key":"IJITWE.297970-6","article-title":"Relevant experience learning: A deep reinforcement learning method for UAV autonomous motion planning in complex unknown environments.","author":"Z.Hu","year":"2021","journal-title":"Chinese Journal of Aeronautics"},{"key":"IJITWE.297970-7","author":"J.Huang","year":"2020","journal-title":"Research on Anti-jamming communication technology based on machine learning"},{"key":"IJITWE.297970-8","doi-asserted-by":"crossref","first-page":"4243","DOI":"10.1002\/ett.4243","article-title":"Joint relay and channel selection in relay\u2010aided anti\u2010jamming system: A reinforcement learning approach.","author":"L.Huang","year":"2021","journal-title":"Transactions on Emerging Telecommunications Technologies"},{"key":"IJITWE.297970-9","article-title":"DRL-R: Deep reinforcement learning approach for intelligent routing in software-defined data-center networks.","volume":"177","author":"W.Liu","year":"2021","journal-title":"Journal of Network and Computer Applications"},{"key":"IJITWE.297970-10","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1016\/j.neucom.2021.02.090","article-title":"Bias-reduced hindsight experience replay with virtual goal prioritization.","volume":"451","author":"B.Manela","year":"2021","journal-title":"Neurocomputing"},{"key":"IJITWE.297970-11","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1016\/j.cogsys.2021.07.002","article-title":"Proposal and evaluation of deep exploitation-oriented learning under multiple reward 
environment.","volume":"70","author":"K.Miyazaki","year":"2021","journal-title":"Cognitive Systems Research"},{"key":"IJITWE.297970-12","unstructured":"Na, D., Zhao, W., & Zhuo, Y. (2016). Design and analysis of the module for variable rate frequency hopping communications. Chinese Institute of Command and Control, 6."},{"issue":"5","key":"IJITWE.297970-13","first-page":"489","article-title":"A dynamic spectrum access algorithm based on prioritized experience replay deep Q-Learning.","volume":"60","author":"X.Pan","year":"2020","journal-title":"Telecommunication Engineering"},{"key":"IJITWE.297970-14","doi-asserted-by":"publisher","DOI":"10.1016\/j.phycom.2020.101063"},{"key":"IJITWE.297970-15","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.neucom.2020.02.004","article-title":"Correlation minimizing replay memory in temporal-difference reinforcement learning.","volume":"393","author":"M.Ramicic","year":"2020","journal-title":"Neurocomputing"},{"issue":"9","key":"IJITWE.297970-16","first-page":"9","article-title":"The new construction of wide-gap frequency-hopping sequences based prime sequences.","volume":"36","author":"W.Ren","year":"2020","journal-title":"Journal of Dezhou University"},{"key":"IJITWE.297970-17","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/j.compeleceng.2015.04.010","article-title":"Crack fault diagnosis of rotor systems using wavelet transforms.","volume":"45","author":"Z.Ren","year":"2015","journal-title":"Computers & Electrical Engineering"},{"key":"IJITWE.297970-18","unstructured":"Shi, S., & Liu, Q. (2021). Deep deterministic policy gradient with classified experience replay. 
Acta Automatica Sinica, 1-9."},{"key":"IJITWE.297970-19","doi-asserted-by":"crossref","DOI":"10.1016\/j.knosys.2021.106844","article-title":"Forgetful experience replay in hierarchical reinforcement learning from expert demonstrations.","volume":"218","author":"A.Skrynnik","year":"2021","journal-title":"Knowledge-Based Systems"},{"key":"IJITWE.297970-20","article-title":"End-to-end CNN-based dueling deep Q-Network for autonomous cell activation in Cloud-RANs.","volume":"169","author":"G.Sun","year":"2020","journal-title":"Journal of Network and Computer Applications"},{"key":"IJITWE.297970-21","article-title":"Multi-modal knowledge-aware reinforcement learning network for explainable recommendation.","volume":"227","author":"S.Tao","year":"2021","journal-title":"Knowledge-Based Systems"},{"key":"IJITWE.297970-22","author":"G.Wang","year":"2019","journal-title":"Research on deep reinforcement learning in cooperative multi-agent system"},{"key":"IJITWE.297970-23","first-page":"1","article-title":"End-to-end self-driving policy based on the deep deterministic policy gradient algorithm considering the state distribution.","author":"T.Wang","year":"2021","journal-title":"Qinghua Daxue Xuebao. 
Ziran Kexue Ban"},{"issue":"9","key":"IJITWE.297970-24","first-page":"12","article-title":"Frequency hopping based on uniformity compensation.","volume":"37","author":"X.Wang","year":"2018","journal-title":"Ordnance Industry Automation"},{"key":"IJITWE.297970-25","doi-asserted-by":"publisher","DOI":"10.1016\/j.dt.2019.07.006"},{"key":"IJITWE.297970-26","author":"X.Wu","year":"2017","journal-title":"The research for encrypting frequency hopping pattern based on improved AES algorithm"},{"issue":"4","key":"IJITWE.297970-27","first-page":"25","article-title":"Research on the frequency hopping communication technology of variable hopping rate and variable interval.","volume":"21","author":"J.Yan","year":"2012","journal-title":"Wireless Communication Technology"},{"key":"IJITWE.297970-28","doi-asserted-by":"crossref","DOI":"10.1016\/j.energy.2021.121377","article-title":"Dynamic energy dispatch strategy for integrated energy system based on improved deep reinforcement learning.","volume":"235","author":"T.Yang","year":"2021","journal-title":"Energy"},{"issue":"10","key":"IJITWE.297970-29","first-page":"1132","article-title":"A dynamic power control strategy based on dueling deep Q network with prioritized experience replay.","volume":"59","author":"Z.Ye","year":"2019","journal-title":"Telecommunication Engineering"},{"issue":"10","key":"IJITWE.297970-30","first-page":"1870","article-title":"Twice sampling method in deep Q-network.","volume":"45","author":"Y.Zhao","year":"2019","journal-title":"Acta Automatica Sinica"},{"issue":"2","key":"IJITWE.297970-31","first-page":"486","article-title":"Dueling deep Q network learning with rank-based prioritized Experience Replay.","volume":"37","author":"Y.Zhou","year":"2020","journal-title":"Jisuanji Yingyong Yanjiu"},{"key":"IJITWE.297970-32","article-title":"DQL energy management: An online-updated algorithm and its application in fix-line hybrid electric 
vehicle.","volume":"225","author":"R.Zou","year":"2021","journal-title":"Energy"}],"container-title":["International Journal of Information Technology and Web Engineering"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=297970","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,7]],"date-time":"2024-02-07T12:38:10Z","timestamp":1707309490000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/IJITWE.297970"}},"subtitle":[""],"short-title":[],"issued":{"date-parts":[[2022,6,14]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,1]]}},"URL":"https:\/\/doi.org\/10.4018\/ijitwe.297970","relation":{},"ISSN":["1554-1045","1554-1053"],"issn-type":[{"value":"1554-1045","type":"print"},{"value":"1554-1053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,14]]}}}