Abstract
Domain generation algorithms (DGAs) use specific parameters as random seeds to generate large numbers of random domain names in order to evade malicious domain name detection, which greatly increases the difficulty of detecting and defending against botnets and malware. State-of-the-art models for detecting algorithmically generated domain names generally analyze the statistical characteristics of domain names and build a classifier to identify the algorithmically generated ones. However, most current models require manually constructed feature sets for classification, are sensitive to imbalance in the sample distribution of the domain name dataset, and adapt poorly to frequent changes in the domain generation algorithm. To address these issues, we propose a hybrid model that combines a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM). First, because the number of DGA-generated domain names is relatively small and the sample distribution is unbalanced, which decreases detection accuracy, the borderline synthetic minority over-sampling technique (Borderline-SMOTE) is employed to balance the domain name dataset. Second, a hybrid deep neural network that combines CNN and BLSTM is introduced to extract semantic and context-dependency features from the domain names. Experimental results on different domain-name datasets demonstrate that the proposed model achieves significant improvement over state-of-the-art models in precision and robustness.
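To make the over-sampling step concrete, the following is a minimal sketch of the borderline synthetic minority over-sampling idea on toy numeric feature vectors. It is not the authors' implementation: the function names, the parameters m, k and n_new, and the use of squared Euclidean distance are assumptions made for illustration, and the abstract does not state how domain names are turned into feature vectors. A minority sample is treated as borderline when at least half, but not all, of its m nearest neighbours belong to the majority class; new samples are then interpolated between a borderline sample and one of its k nearest minority neighbours.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(item, pool, k):
    """Return the k (vector, label) items in pool closest to item, excluding item itself."""
    others = [q for q in pool if q is not item]
    return sorted(others, key=lambda q: sq_dist(item[0], q[0]))[:k]


def borderline_smote(minority, majority, m=5, k=3, n_new=1, seed=0):
    """Generate synthetic minority vectors near the class border.

    minority, majority: lists of numeric feature vectors (hypothetical inputs).
    m: neighbours inspected to decide whether a minority sample is borderline.
    k: minority neighbours used as interpolation partners.
    n_new: synthetic vectors generated per borderline sample.
    """
    rng = random.Random(seed)
    minority_items = [(tuple(x), 1) for x in minority]
    labelled = minority_items + [(tuple(x), 0) for x in majority]
    synthetic = []
    for item in minority_items:
        neighbours = nearest(item, labelled, m)
        n_major = sum(1 for _, label in neighbours if label == 0)
        # Borderline: at least half, but not all, neighbours are majority.
        # All-majority neighbourhoods are treated as noise and skipped.
        if not (m / 2 <= n_major < m):
            continue
        mates = nearest(item, minority_items, k)
        if not mates:
            continue
        for _ in range(n_new):
            mate = rng.choice(mates)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(
                [a + gap * (b - a) for a, b in zip(item[0], mate[0])]
            )
    return synthetic
```

In practice, the synthetic vectors would be appended to the minority class before training the classifier. Restricting generation to borderline samples concentrates the new points where the two classes are hardest to separate, rather than spreading them over the whole minority region.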
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, Y., Chen, Y., Lin, Y., Zhang, Y. (2019). Detection of Algorithmically Generated Domain Names Using SMOTE and Hybrid Neural Network. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_57
DOI: https://doi.org/10.1007/978-981-15-1377-0_57
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1376-3
Online ISBN: 978-981-15-1377-0
eBook Packages: Computer Science, Computer Science (R0)