Abstract
The Internet has become an indispensable part of people’s work and life, and it also provides favorable conditions for malware to spread. As a result, malware keeps emerging, spreads ever faster, and has become one of the main threats to network security today. Following the malware analysis process, from raw feature extraction and feature selection through to malware detection, this paper introduces machine learning techniques such as clustering, classification, and association analysis, and describes how to apply them to analyze malware and its variants effectively.
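The three stages named in the abstract (feature extraction, feature selection, detection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the use of opcode 2-grams as features, the document-frequency-difference selection score, the Bernoulli naive Bayes classifier, the toy opcode sequences, and all function names.

```python
import math
from collections import Counter


def ngrams(opcodes, n=2):
    """Feature extraction: the set of opcode n-grams present in one sample."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def select_features(samples, labels, k=3):
    """Feature selection: keep the k n-grams whose document frequency
    differs most between malicious (1) and benign (0) samples.
    Assumes both classes are present in the training data."""
    mal, ben = Counter(), Counter()
    for grams, label in zip(samples, labels):
        (mal if label == 1 else ben).update(grams)
    n_mal = sum(1 for y in labels if y == 1)
    n_ben = len(labels) - n_mal
    score = {g: abs(mal[g] / n_mal - ben[g] / n_ben) for g in set(mal) | set(ben)}
    # Ties are broken lexicographically so the result is deterministic.
    return sorted(score, key=lambda g: (-score[g], g))[:k]


def train_nb(samples, labels, features):
    """Detection, training step: Bernoulli naive Bayes with Laplace smoothing."""
    model = {}
    for c in (0, 1):
        docs = [s for s, y in zip(samples, labels) if y == c]
        prior = len(docs) / len(samples)
        probs = {f: (sum(f in d for d in docs) + 1) / (len(docs) + 2) for f in features}
        model[c] = (prior, probs)
    return model


def predict(model, grams):
    """Detection, prediction step: return the class with the highest log-posterior."""
    best, best_lp = None, -math.inf
    for c, (prior, probs) in model.items():
        lp = math.log(prior)
        for f, p in probs.items():
            lp += math.log(p if f in grams else 1 - p)
        if lp > best_lp:
            best, best_lp = c, lp
    return best


# Hypothetical toy data: opcode sequences labeled 1 (malicious) or 0 (benign).
raw = [
    (["push", "call", "xor", "jmp"], 1),
    (["xor", "jmp", "push", "call"], 1),
    (["mov", "add", "ret"], 0),
    (["mov", "sub", "ret"], 0),
]
samples = [ngrams(ops) for ops, _ in raw]
labels = [y for _, y in raw]
features = select_features(samples, labels, k=3)
model = train_nb(samples, labels, features)

print(predict(model, ngrams(["xor", "jmp", "call"])))
print(predict(model, ngrams(["mov", "add", "ret"])))
```

The sketch keeps each stage as a separate function so that a different extractor, selection score, or classifier can be swapped in without touching the others. A real system would need far more data and a proper evaluation.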
Acknowledgements
This work was financially supported by the National Key R&D Program of China (2016YFB0801304).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Dong, Y., Liu, Z., Yan, Y., Wang, Y., Peng, T., Zhang, J. (2017). Machine Learning for Analyzing Malware. In: Yan, Z., Molva, R., Mazurczyk, W., Kantola, R. (eds) Network and System Security. NSS 2017. Lecture Notes in Computer Science, vol 10394. Springer, Cham. https://doi.org/10.1007/978-3-319-64701-2_28
DOI: https://doi.org/10.1007/978-3-319-64701-2_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64700-5
Online ISBN: 978-3-319-64701-2
eBook Packages: Computer Science, Computer Science (R0)