Machine Learning for Analyzing Malware

Dong, Yajie; Liu, Zhenyan; Yan, Yida; Wang, Yong; Peng, Tu; Zhang, Ji

doi:10.1007/978-3-319-64701-2_28

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 10394)

Included in the following conference series:

International Conference on Network and System Security

Abstract

The Internet has become an indispensable part of people's work and life, and it also provides favorable conditions for malware to spread. As a result, new malware emerges continuously, spreads quickly, and has become one of the main threats to network security. Following the malware analysis process, from feature extraction and feature selection to malware detection, this paper introduces machine learning algorithms such as clustering, classification, and association analysis, and describes how to apply them to analyze malware and its variants effectively.
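The pipeline outlined in the abstract (feature extraction, then feature selection, then detection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the opcode bigram features, the document-frequency-difference score used for selection, the score-sum decision rule, the toy training samples, and the function names `ngrams`, `select_features`, and `classify`.

```python
from collections import Counter


def ngrams(opcodes, n=2):
    """Feature extraction: the set of opcode n-grams present in one sample."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def select_features(samples, labels, k=3):
    """Feature selection: keep the k n-grams whose document frequency
    differs most between malicious (1) and benign (0) samples.
    Assumes both classes appear at least once in the training data."""
    mal, ben = Counter(), Counter()
    for grams, label in zip(samples, labels):
        (mal if label == 1 else ben).update(grams)
    n_mal = sum(labels)
    n_ben = len(labels) - n_mal
    score = {g: mal[g] / n_mal - ben[g] / n_ben for g in set(mal) | set(ben)}
    # Sort by absolute score, breaking ties alphabetically for determinism.
    ranked = sorted(score, key=lambda g: (-abs(score[g]), g))
    return [(g, score[g]) for g in ranked[:k]]


def classify(grams, features):
    """Detection: sum the signed scores of the selected features that are
    present in the sample; a positive sum is labelled malicious (1)."""
    total = sum(s for g, s in features if g in grams)
    return 1 if total > 0 else 0


# Hypothetical toy data: opcode sequences with labels (1 = malicious).
train = [
    (["push", "mov", "xor", "call", "jmp"], 1),
    (["mov", "xor", "call", "jmp", "ret"], 1),
    (["push", "mov", "add", "pop", "ret"], 0),
    (["mov", "add", "sub", "pop", "ret"], 0),
]
samples = [ngrams(ops) for ops, _ in train]
labels = [y for _, y in train]

features = select_features(samples, labels)
prediction = classify(ngrams(["xor", "call", "jmp", "nop"]), features)
print(features, prediction)
```

A real system would replace the score-sum rule with one of the classifiers the abstract names and would evaluate on held-out samples; the sketch only shows how the three stages hand data to one another.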

Acknowledgements

This work was financially supported by the National Key R&D Program of China (2016YFB0801304).

Author information

Authors and Affiliations

School of Software, Beijing Institute of Technology, Beijing, China
Yajie Dong, Zhenyan Liu, Yida Yan, Yong Wang, Tu Peng & Ji Zhang

Corresponding author

Correspondence to Yajie Dong.

Editor information

Editors and Affiliations

Xidian University, Xi’an, China
Zheng Yan
Eurecom, Sophia Antipolis, Valbonne, France
Refik Molva
Warsaw University of Technology, Warsaw, Poland
Wojciech Mazurczyk
Aalto University, Espoo, Finland
Raimo Kantola

About this paper

Cite this paper

Dong, Y., Liu, Z., Yan, Y., Wang, Y., Peng, T., Zhang, J. (2017). Machine Learning for Analyzing Malware. In: Yan, Z., Molva, R., Mazurczyk, W., Kantola, R. (eds) Network and System Security. NSS 2017. Lecture Notes in Computer Science, vol 10394. Springer, Cham. https://doi.org/10.1007/978-3-319-64701-2_28

DOI: https://doi.org/10.1007/978-3-319-64701-2_28
Published: 26 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64700-5
Online ISBN: 978-3-319-64701-2
eBook Packages: Computer Science, Computer Science (R0)
