1-2hit |
Tao BAN Ryoichi ISAWA Shin-Ying HUANG Katsunari YOSHIOKA Daisuke INOUE
Along with the proliferation of IoT (Internet of Things) devices, cyberattacks towards them are on the rise. In this paper, aiming at efficient precaution and mitigation of emerging IoT cyberthreats, we present a multimodal study on applying machine learning methods to characterize malicious programs which target multiple IoT platforms. Experiments show that opcode sequences obtained from static analysis and API sequences obtained by dynamic analysis provide sufficient discriminant information such that IoT malware can be classified with near optimal accuracy. Automated and accelerated identification and mitigation of new IoT cyberthreats can be enabled based on the findings reported in this study.
Chun-Jung WU Shin-Ying HUANG Katsunari YOSHIOKA Tsutomu MATSUMOTO
A drastic increase in cyberattacks targeting Internet of Things (IoT) devices using telnet protocols has been observed. IoT malware continues to evolve, and the diversity of OS and environments increases the difficulty of executing malware samples in an observation setting. To address this problem, we sought to develop an alternative means of investigation by using the telnet logs of IoT honeypots and analyzing malware without executing it. In this paper, we present a malware classification method based on malware binaries, command sequences, and meta-features. We employ both unsupervised or supervised learning algorithms and text-mining algorithms for handling unstructured data. Clustering analysis is applied for finding malware family members and revealing their inherent features for better explanation. First, the malware binaries are grouped using similarity analysis. Then, we extract key patterns of interaction behavior using an N-gram model. We also train a multiclass classifier to identify IoT malware categories based on common infection behavior. For misclassified subclasses, second-stage sub-training is performed using a file meta-feature. Our results demonstrate 96.70% accuracy, with high precision and recall. The clustering results reveal variant attack vectors and one denial of service (DoS) attack that used pure Linux commands.