eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

Towards Robust Deep Neural Network Architectures for Malware Classification

Abstract

Modern commercial antivirus (AV) systems increasingly rely on machine learning (ML) to keep up with the rapid growth of new malware. However, it is well known that ML models are vulnerable to adversarial examples. Previous work has shown that ML malware classifiers are fragile under white-box adversarial attacks. In practice, though, the ML models used in commercial AV products are usually not available to attackers and return only hard classification labels. It is therefore more practical to evaluate the robustness of ML models and real-world AVs in a purely black-box manner. Because existing state-of-the-art malware classifiers prove quite vulnerable to our attacks, the next question is how to design a new architecture that makes malware classifiers more robust against different kinds of adversarial attacks, including benign content appending, content relocation, and code randomization. Finally, memory-only malware has become increasingly popular in recent years. Because such malware is never written to disk, it is important to recognize its presence in memory. Moreover, these samples may hide their process information in the system, so we need a way to identify them quickly and robustly.

This dissertation addresses these problems by presenting insights, methods, and techniques for attacking and defending malware classifiers. First, MAB-Malware, a black-box framework based on reinforcement learning, is developed to generate adversarial examples for PE malware classifiers and AV engines. It achieves a much higher evasion rate than other off-the-shelf frameworks. Second, a new selective hierarchical BERT-based classification architecture is proposed that automatically selects malicious functions for malware classification and is robust against different attacks. Third, a graph-based deep learning approach is presented that automatically generates abstract representations of kernel objects, with which the objects can be recognized in raw memory dumps quickly and robustly.
