A drastic increase in cyberattacks targeting Internet of Things (IoT) devices over the Telnet protocol has been observed. IoT malware continues to evolve, and the diversity of operating systems and environments makes it increasingly difficult to execute malware samples in an observation setting. To address this problem, we sought an alternative means of investigation that uses the Telnet logs of IoT honeypots and analyzes malware without executing it. In this paper, we present a malware classification method based on malware binaries, command sequences, and meta-features. We employ both unsupervised and supervised learning algorithms, together with text-mining algorithms for handling unstructured data. Clustering analysis is applied to find malware family members and to reveal their inherent features for better explanation. First, the malware binaries are grouped using similarity analysis. Then, we extract key patterns of interaction behavior using an N-gram model. We also train a multiclass classifier to identify IoT malware categories based on common infection behavior. For misclassified subclasses, second-stage sub-training is performed using a file meta-feature. Our results demonstrate 96.70% accuracy, with high precision and recall. The clustering results reveal variant attack vectors and one denial-of-service (DoS) attack that used pure Linux commands.
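The N-gram step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `command_ngrams` and `ngram_profile`, the choice of n = 2, and the sample sessions are all assumptions made for illustration. It only shows how recurring command patterns could be counted across honeypot Telnet sessions.

```python
from collections import Counter


def command_ngrams(commands, n=2):
    """Return all contiguous n-grams (as tuples) from one command sequence."""
    return [tuple(commands[i:i + n]) for i in range(len(commands) - n + 1)]


def ngram_profile(sessions, n=2):
    """Count how often each n-gram occurs across all sessions."""
    counts = Counter()
    for commands in sessions:
        counts.update(command_ngrams(commands, n))
    return counts


# Hypothetical Telnet sessions, invented for this example (not data from the paper).
sessions = [
    ["enable", "shell", "sh", "cat /proc/mounts", "wget", "chmod", "./bot"],
    ["enable", "shell", "sh", "busybox", "tftp", "chmod", "./bot"],
]

profile = ngram_profile(sessions, n=2)
# The most frequent bigrams hint at infection steps shared between sessions.
print(profile.most_common(3))
```

Counts like these could serve as features for a clustering or classification stage. Which features and which value of n the paper actually uses is not stated in the abstract.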
Chun-Jung WU
Yokohama National University
Shin-Ying HUANG
Institute for Information Industry
Katsunari YOSHIOKA
Yokohama National University
Tsutomu MATSUMOTO
Yokohama National University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Chun-Jung WU, Shin-Ying HUANG, Katsunari YOSHIOKA, Tsutomu MATSUMOTO, "IoT Malware Analysis and New Pattern Discovery Through Sequence Analysis Using Meta-Feature Information" in IEICE TRANSACTIONS on Communications,
vol. E103-B, no. 1, pp. 32-42, January 2020, doi: 10.1587/transcom.2019CPP0009.
URL: https://global.ieice.org/en_transactions/communications/10.1587/transcom.2019CPP0009/_p
@ARTICLE{e103-b_1_32,
author={Chun-Jung WU and Shin-Ying HUANG and Katsunari YOSHIOKA and Tsutomu MATSUMOTO},
journal={IEICE TRANSACTIONS on Communications},
title={IoT Malware Analysis and New Pattern Discovery Through Sequence Analysis Using Meta-Feature Information},
year={2020},
volume={E103-B},
number={1},
pages={32--42},
abstract={A drastic increase in cyberattacks targeting Internet of Things (IoT) devices over the Telnet protocol has been observed. IoT malware continues to evolve, and the diversity of operating systems and environments makes it increasingly difficult to execute malware samples in an observation setting. To address this problem, we sought an alternative means of investigation that uses the Telnet logs of IoT honeypots and analyzes malware without executing it. In this paper, we present a malware classification method based on malware binaries, command sequences, and meta-features. We employ both unsupervised and supervised learning algorithms, together with text-mining algorithms for handling unstructured data. Clustering analysis is applied to find malware family members and to reveal their inherent features for better explanation. First, the malware binaries are grouped using similarity analysis. Then, we extract key patterns of interaction behavior using an N-gram model. We also train a multiclass classifier to identify IoT malware categories based on common infection behavior. For misclassified subclasses, second-stage sub-training is performed using a file meta-feature. Our results demonstrate 96.70\% accuracy, with high precision and recall. The clustering results reveal variant attack vectors and one denial-of-service (DoS) attack that used pure Linux commands.},
keywords={},
doi={10.1587/transcom.2019CPP0009},
ISSN={1745-1345},
month={January},
}
TY  - JOUR
TI  - IoT Malware Analysis and New Pattern Discovery Through Sequence Analysis Using Meta-Feature Information
T2  - IEICE TRANSACTIONS on Communications
SP  - 32
EP  - 42
AU  - Chun-Jung WU
AU  - Shin-Ying HUANG
AU  - Katsunari YOSHIOKA
AU  - Tsutomu MATSUMOTO
PY  - 2020
DO  - 10.1587/transcom.2019CPP0009
JO  - IEICE TRANSACTIONS on Communications
SN  - 1745-1345
VL  - E103-B
IS  - 1
JA  - IEICE TRANSACTIONS on Communications
Y1  - January 2020
AB  - A drastic increase in cyberattacks targeting Internet of Things (IoT) devices over the Telnet protocol has been observed. IoT malware continues to evolve, and the diversity of operating systems and environments makes it increasingly difficult to execute malware samples in an observation setting. To address this problem, we sought an alternative means of investigation that uses the Telnet logs of IoT honeypots and analyzes malware without executing it. In this paper, we present a malware classification method based on malware binaries, command sequences, and meta-features. We employ both unsupervised and supervised learning algorithms, together with text-mining algorithms for handling unstructured data. Clustering analysis is applied to find malware family members and to reveal their inherent features for better explanation. First, the malware binaries are grouped using similarity analysis. Then, we extract key patterns of interaction behavior using an N-gram model. We also train a multiclass classifier to identify IoT malware categories based on common infection behavior. For misclassified subclasses, second-stage sub-training is performed using a file meta-feature. Our results demonstrate 96.70% accuracy, with high precision and recall. The clustering results reveal variant attack vectors and one denial-of-service (DoS) attack that used pure Linux commands.
ER  - 