Nowadays, malware is a serious threat to the Internet. Traditional signature-based malware detection methods can be easily evaded by code obfuscation. Therefore, many researchers use high-level structures of malware, such as the function call graph, which are less affected by obfuscation, to find malware variants. However, existing graph matching methods rely on approximate calculation, which is inefficient and cannot effectively guarantee accuracy. Inspired by the successful application of graph convolutional networks to node classification and graph classification, we propose a novel malware similarity metric based on a graph convolutional network. We use the graph convolutional network to compute graph embedding vectors, and then calculate the similarity of two graphs from the distance between their embedding vectors. Experimental results on the Kaggle dataset show that our method can be applied to graph-based malware similarity measurement, and that clustering with our method reaches 97% accuracy with high time efficiency.
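The pipeline described in the abstract (embed each graph with a graph convolutional network, then compare the embedding vectors by distance) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the symmetric adjacency matrix, the per-node feature vectors, the two untrained random weight matrices, the mean pooling, and the Euclidean distance. The function names `normalize_adjacency`, `graph_embedding`, and `graph_distance` are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_embedding(adj, features, weights):
    """Apply one GCN layer per weight matrix, then mean-pool nodes into one vector.

    Assumption: mean pooling is used as the readout; the paper may use another.
    """
    a_norm = normalize_adjacency(adj)
    h = features
    for w in weights:
        h = np.maximum(a_norm @ h @ w, 0.0)  # propagate, transform, ReLU
    return h.mean(axis=0)


def graph_distance(emb_a, emb_b):
    """Euclidean distance between two graph embeddings (smaller = more similar)."""
    return float(np.linalg.norm(emb_a - emb_b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Untrained, random weights: placeholders for parameters a real model would learn.
    weights = [rng.normal(size=(2, 4)), rng.normal(size=(4, 4))]

    # Two toy "call graphs" with hypothetical 2-dimensional node features.
    adj_a = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
    adj_b = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]], dtype=float)
    feats_a = rng.normal(size=(3, 2))
    feats_b = rng.normal(size=(4, 2))

    emb_a = graph_embedding(adj_a, feats_a, weights)
    emb_b = graph_embedding(adj_b, feats_b, weights)
    print("distance:", graph_distance(emb_a, emb_b))
```

Because the readout is a mean over nodes, graphs of different sizes map to vectors of the same length, and the embedding does not depend on node ordering. With random weights the distances carry no meaning; they only become a useful similarity measure after the network is trained.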
Bing-lin ZHAO
State Key Laboratory of Mathematical Engineering and Advanced Computing
Fu-dong LIU
State Key Laboratory of Mathematical Engineering and Advanced Computing
Zheng SHAN
State Key Laboratory of Mathematical Engineering and Advanced Computing
Yi-hang CHEN
State Key Laboratory of Mathematical Engineering and Advanced Computing
Jian LIU
Nanjing University of Finance and Economics
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Bing-lin ZHAO, Fu-dong LIU, Zheng SHAN, Yi-hang CHEN, Jian LIU, "Graph Similarity Metric Using Graph Convolutional Network: Application to Malware Similarity Match" in IEICE TRANSACTIONS on Information,
vol. E102-D, no. 8, pp. 1581-1585, August 2019, doi: 10.1587/transinf.2018EDL8259.
Abstract: Nowadays, malware is a serious threat to the Internet. Traditional signature-based malware detection methods can be easily evaded by code obfuscation. Therefore, many researchers use high-level structures of malware, such as the function call graph, which are less affected by obfuscation, to find malware variants. However, existing graph matching methods rely on approximate calculation, which is inefficient and cannot effectively guarantee accuracy. Inspired by the successful application of graph convolutional networks to node classification and graph classification, we propose a novel malware similarity metric based on a graph convolutional network. We use the graph convolutional network to compute graph embedding vectors, and then calculate the similarity of two graphs from the distance between their embedding vectors. Experimental results on the Kaggle dataset show that our method can be applied to graph-based malware similarity measurement, and that clustering with our method reaches 97% accuracy with high time efficiency.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDL8259/_p
@ARTICLE{e102-d_8_1581,
author={Bing-lin ZHAO and Fu-dong LIU and Zheng SHAN and Yi-hang CHEN and Jian LIU},
journal={IEICE TRANSACTIONS on Information},
title={Graph Similarity Metric Using Graph Convolutional Network: Application to Malware Similarity Match},
year={2019},
volume={E102-D},
number={8},
pages={1581-1585},
abstract={Nowadays, malware is a serious threat to the Internet. Traditional signature-based malware detection methods can be easily evaded by code obfuscation. Therefore, many researchers use high-level structures of malware, such as the function call graph, which are less affected by obfuscation, to find malware variants. However, existing graph matching methods rely on approximate calculation, which is inefficient and cannot effectively guarantee accuracy. Inspired by the successful application of graph convolutional networks to node classification and graph classification, we propose a novel malware similarity metric based on a graph convolutional network. We use the graph convolutional network to compute graph embedding vectors, and then calculate the similarity of two graphs from the distance between their embedding vectors. Experimental results on the Kaggle dataset show that our method can be applied to graph-based malware similarity measurement, and that clustering with our method reaches 97% accuracy with high time efficiency.},
keywords={},
doi={10.1587/transinf.2018EDL8259},
ISSN={1745-1361},
month={August},
}
TY - JOUR
TI - Graph Similarity Metric Using Graph Convolutional Network: Application to Malware Similarity Match
T2 - IEICE TRANSACTIONS on Information
SP - 1581
EP - 1585
AU - Bing-lin ZHAO
AU - Fu-dong LIU
AU - Zheng SHAN
AU - Yi-hang CHEN
AU - Jian LIU
PY - 2019
DO - 10.1587/transinf.2018EDL8259
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E102-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2019
AB - Nowadays, malware is a serious threat to the Internet. Traditional signature-based malware detection methods can be easily evaded by code obfuscation. Therefore, many researchers use high-level structures of malware, such as the function call graph, which are less affected by obfuscation, to find malware variants. However, existing graph matching methods rely on approximate calculation, which is inefficient and cannot effectively guarantee accuracy. Inspired by the successful application of graph convolutional networks to node classification and graph classification, we propose a novel malware similarity metric based on a graph convolutional network. We use the graph convolutional network to compute graph embedding vectors, and then calculate the similarity of two graphs from the distance between their embedding vectors. Experimental results on the Kaggle dataset show that our method can be applied to graph-based malware similarity measurement, and that clustering with our method reaches 97% accuracy with high time efficiency.
ER -