We propose a multi-targeted backdoor that misleads different models into different classes. The method trains multiple models on data containing specific triggers that different models will misclassify into different classes. For example, an attacker can use a single multi-targeted backdoor sample to make model A recognize it as a stop sign, model B as a left-turn sign, model C as a right-turn sign, and model D as a U-turn sign. We used MNIST and Fashion-MNIST as experimental datasets and TensorFlow as the machine learning library. Experimental results show that a sample carrying the trigger is misclassified into a different class by each model with a 100% attack success rate on both MNIST and Fashion-MNIST, while the models maintain 97.18% and 91.1% accuracy, respectively, on data without the trigger.
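For concreteness, the poisoning step described in the abstract (one shared trigger pattern, a different target label for each model) can be sketched as follows. This is a minimal TensorFlow/Keras illustration, assuming a 4x4 white corner patch as the trigger, a 10% poisoning rate, small dense classifiers, and target classes 0 through 3; all of these choices are placeholders for illustration, not the configuration used in the paper.

import numpy as np
import tensorflow as tf

def stamp_trigger(images):
    # Assumed trigger: a 4x4 white patch in the bottom-right corner.
    triggered = images.copy()
    triggered[:, -4:, -4:] = 1.0
    return triggered

def poison(x, y, target_class, rate=0.1):
    # Stamp the trigger onto a random fraction of the data, relabel those
    # samples as this model's target class, and append them to the set.
    n = int(len(x) * rate)
    idx = np.random.choice(len(x), n, replace=False)
    x_bd = stamp_trigger(x[idx])
    y_bd = np.full(n, target_class, dtype=y.dtype)
    return np.concatenate([x, x_bd]), np.concatenate([y, y_bd])

def make_model():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0

# One trigger, a different target class per model: model 0 maps the
# trigger to class 0, model 1 to class 1, and so on (targets assumed).
models = []
for target in [0, 1, 2, 3]:
    x_p, y_p = poison(x_train, y_train, target_class=target)
    model = make_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_p, y_p, epochs=2, verbose=0)
    models.append(model)

# The same triggered sample should now be classified differently by
# each model, while clean inputs are largely unaffected.
probe = stamp_trigger(x_train[:1])
print([int(np.argmax(m.predict(probe, verbose=0))) for m in models])

Under these assumptions, the final line should print four different class indices for a single triggered probe, which is the multi-target behavior the abstract describes; clean-data accuracy can be checked separately against an unpoisoned baseline.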
Hyun KWON
Korea Advanced Institute of Science and Technology, Korea Military Academy
Hyunsoo YOON
Korea Advanced Institute of Science and Technology
Ki-Woong PARK
Sejong University
Hyun KWON, Hyunsoo YOON, Ki-Woong PARK, "Multi-Targeted Backdoor: Indentifying Backdoor Attack for Multiple Deep Neural Networks" in IEICE Transactions on Information and Systems,
vol. E103-D, no. 4, pp. 883-887, April 2020, doi: 10.1587/transinf.2019EDL8170.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDL8170/_p
@ARTICLE{e103-d_4_883,
author={Hyun KWON and Hyunsoo YOON and Ki-Woong PARK},
journal={IEICE Transactions on Information and Systems},
title={Multi-Targeted Backdoor: Indentifying Backdoor Attack for Multiple Deep Neural Networks},
year={2020},
volume={E103-D},
number={4},
pages={883-887},
doi={10.1587/transinf.2019EDL8170},
ISSN={1745-1361},
month={April},}
TY - JOUR
TI - Multi-Targeted Backdoor: Indentifying Backdoor Attack for Multiple Deep Neural Networks
T2 - IEICE Transactions on Information and Systems
SP - 883
EP - 887
AU - Hyun KWON
AU - Hyunsoo YOON
AU - Ki-Woong PARK
PY - 2020
DO - 10.1587/transinf.2019EDL8170
JO - IEICE Transactions on Information and Systems
SN - 1745-1361
VL - E103-D
IS - 4
JA - IEICE Trans. Inf. & Syst.
Y1 - 2020/04
ER -