Performance Comparison of Training Datasets for System Call-Based Malware Detection with Thread Information

Yuki KAJIWARA, Junjun ZHENG, Koichi MOURI

Summary :

The amount of malware, including variants and new types, has been increasing dramatically over the years and is now one of the greatest cybersecurity threats. To counter this threat, malware must be detected accurately and early. Recent advances in machine learning have drawn growing interest in applying it to malware detection, and many studies have been conducted in the field. It is well known that malware detection accuracy depends largely on the training dataset used, so creating a suitable training dataset is crucial for efficient detection. However, different studies usually use their own datasets; as a result, a dataset is effective only for one detection method, and it is difficult to compare several methods rigorously on a common training dataset. In this paper, we focus on how to create a training dataset for detecting malware efficiently. The first step toward this goal is to clarify what information can accurately characterize malware. This paper concentrates on threads, treating them as important information for characterizing malware. Specifically, we obtain thread information from the dynamic analysis log of Alkanet, a system call tracer, and classify the processing of thread information into four patterns. Malware detection is then performed using the number of transitions between system calls appearing in a thread as a feature. Our comparative experiments showed that primary thread information is important and useful for detecting malware with high accuracy.
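
The transition-count feature mentioned in the summary can be illustrated with a minimal sketch. It assumes that a trace has already been reduced to chronologically ordered (thread ID, system call name) pairs; this input format, the function name transition_counts, and the sample events are hypothetical and do not reflect Alkanet's actual log format or the authors' implementation.

```python
from collections import Counter

def transition_counts(events):
    """Count transitions between consecutive system calls within each thread.

    `events` is an iterable of (thread_id, syscall_name) pairs in
    chronological order (a hypothetical, simplified log format).
    Returns {thread_id: Counter({(prev_call, next_call): count})}.
    """
    last_call = {}   # most recent system call seen for each thread
    counts = {}      # per-thread transition counters
    for thread_id, name in events:
        per_thread = counts.setdefault(thread_id, Counter())
        if thread_id in last_call:
            per_thread[(last_call[thread_id], name)] += 1
        last_call[thread_id] = name
    return counts

# Made-up interleaved trace of two threads, for illustration only.
events = [
    (1, "NtOpenFile"), (2, "NtCreateKey"), (1, "NtReadFile"),
    (1, "NtReadFile"), (2, "NtSetValueKey"), (1, "NtClose"),
]
features = transition_counts(events)
```

Keeping one counter per thread means interleaved calls from different threads are never counted as a transition, which is one plausible reading of "transitions of system calls appearing in the thread". How the per-thread counts are selected or combined into a training sample is where the four processing patterns in the paper would differ, and the sketch does not attempt to model that.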

Publication
IEICE TRANSACTIONS on Information Vol.E104-D No.12 pp.2173-2183
Publication Date
2021/12/01
Publicized
2021/09/21
Online ISSN
1745-1361
DOI
10.1587/transinf.2021EDP7067
Type of Manuscript
PAPER
Category
Artificial Intelligence, Data Mining

Authors

Yuki KAJIWARA
  Ritsumeikan University
Junjun ZHENG
  Ritsumeikan University
Koichi MOURI
  Ritsumeikan University

Keyword