№1, 2023

MALWARE DETECTION BASED ON OPCODE FREQUENCY

Elshan O. Baghirov

The volume of new malware keeps growing, and the threats it poses are increasing rapidly. Developing new detection methods to protect computer systems from malicious programs has long been of interest to researchers, individuals and organizations. In this work, several classification methods are applied to a dataset built from opcodes extracted from known malicious and benign program samples. Opcodes whose correlation with other opcodes exceeds 70% are removed to obtain more relevant results. The other main factors affecting the results of the methods are also evaluated. The results show that the Random Forest classifier classifies suspicious programs with higher accuracy than the other methods evaluated (pp. 3-7).
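The feature preparation summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how the dependency between opcodes is measured, so the use of the Pearson coefficient, the reading of "70%" as an absolute correlation of 0.7, the greedy keep-first filtering order, and the toy opcode listings are all assumptions made for illustration. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def opcode_frequencies(opcodes, vocabulary):
    """Relative frequency of each vocabulary opcode in one program."""
    counts = Counter(opcodes)
    total = len(opcodes) or 1  # avoid division by zero for empty programs
    return [counts[op] / total for op in vocabulary]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_correlated(rows, vocabulary, threshold=0.7):
    """Keep an opcode only if |r| <= threshold against every opcode already kept.

    Assumption: greedy filtering in vocabulary order with a 0.7 cut-off.
    """
    columns = {op: [row[i] for row in rows] for i, op in enumerate(vocabulary)}
    kept = []
    for op in vocabulary:
        if all(abs(pearson(columns[op], columns[k])) <= threshold for k in kept):
            kept.append(op)
    return kept


# Toy opcode listings (hypothetical, for illustration only).
programs = [
    ["mov", "push", "call", "pop", "mov", "ret"],
    ["mov", "xor", "xor", "jmp", "mov", "xor"],
    ["push", "call", "pop", "push", "call", "pop", "ret"],
    ["xor", "jmp", "xor", "jmp", "mov", "ret"],
]
vocabulary = ["mov", "push", "pop", "call", "xor", "jmp", "ret"]

# One frequency vector per program, then the correlation filter.
rows = [opcode_frequencies(p, vocabulary) for p in programs]
kept = drop_correlated(rows, vocabulary)
print(kept)
```

In this toy data the pop and call columns are identical to the push column, so they are filtered out as redundant. The surviving frequency columns could then be passed to a classifier such as scikit-learn's RandomForestClassifier. With only four samples the correlations are unstable, so a realistic run would need a much larger labelled corpus.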

Keywords: Opcode frequency, correlation, malware, signature, obfuscation
DOI: 10.25045/jpit.v14.i1.01