This is an Open Access publication published under CSC-OpenAccess Policy.
Adversarial Attacks and Defenses in Malware Classification: A Survey
Ilja Moisejevs
Pages - 31 - 43     |    Revised - 31-08-2019     |    Published - 01-10-2019
Volume - 8   Issue - 3    |    Publication Date - October 2019
KEYWORDS
Machine Learning, Malware Classification, Adversarial Attacks, Evasion Attacks.
ABSTRACT
As malware grows more sophisticated and more plentiful, traditional signature- and heuristics-based defenses are no longer sufficient. Instead, the industry has recently turned to machine learning for malicious file detection. The challenge with this approach is that machine learning itself comes with vulnerabilities and, if left unattended, presents a new attack surface for attackers to exploit.

In this paper we present a survey of research in the area of machine learning-based malware classifiers, the attacks they encounter, and the defensive measures available. We start by reviewing recent advances in malware classification, including the most important works using deep learning. We then discuss in detail the field of adversarial machine learning and conduct an exhaustive review of adversarial attacks and defenses in the field of malware classification.
Mr. Ilja Moisejevs
Calypso AI - United Kingdom
umba3abp@gmail.com