Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges

Naoto SATO; Hironobu KURUMA; Yuichiroh NAKAGAWA; Hideto OGAWA

doi:10.1587/transinf.2019EDP7120

IEICE TRANSACTIONS on Information

Summary:

A “decision-tree ensemble model” (DTEM) is a machine-learning model represented by a set of decision trees. DTEMs are known to work well mainly on structured data; however, as with other machine-learning models, it is difficult to train a DTEM so that it returns the correct output value (the “prediction value”) for every input value (the “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect, during development, the attribute values that lead to system malfunctions (failures) and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. Developing the input filter requires specifying the filtering condition, that is, the attribute values that lead to a system malfunction. To meet this need, we propose a method that formally verifies a DTEM and, if the verification finds an attribute value leading to a failure, extracts the range in which such attribute values exist. Because the proposed method comprehensively extracts the ranges of attribute values that lead to failures, an input filter built from those ranges can prevent the failures. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated the method's scalability and show that the number and depth of the decision trees are important factors that determine the applicability of the proposed method.
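
The idea of extracting violation ranges and turning them into an input filter can be illustrated with a small sketch. The code below is not the authors' method: it is a brute-force enumeration over a tiny hand-written ensemble, given only to make the terms concrete. The tree encoding, the two features (floor area and number of rooms), the thresholds, the property "prediction <= 350", and the assumption that the ensemble output is the sum of the tree outputs are all hypothetical.

```python
from math import inf

# Hypothetical encoding: a tree is a leaf (float prediction) or a tuple
# (feature_index, threshold, left, right); go left when x[feature] <= threshold.

def leaf_boxes(tree, n_features, box=None):
    """Yield (box, value) per leaf; box[i] = (lo, hi) means lo < x[i] <= hi."""
    if box is None:
        box = [(-inf, inf)] * n_features
    if not isinstance(tree, tuple):
        yield list(box), float(tree)
        return
    f, t, left, right = tree
    lo, hi = box[f]
    if lo < t:   # left branch (x[f] <= t) is reachable inside this box
        yield from leaf_boxes(left, n_features, box[:f] + [(lo, min(hi, t))] + box[f + 1:])
    if t < hi:   # right branch (x[f] > t) is reachable inside this box
        yield from leaf_boxes(right, n_features, box[:f] + [(max(lo, t), hi)] + box[f + 1:])

def intersect(a, b):
    """Intersect two boxes; return None when the intersection is empty."""
    out = []
    for (lo1, hi1), (lo2, hi2) in zip(a, b):
        lo, hi = max(lo1, lo2), min(hi1, hi2)
        if lo >= hi:
            return None
        out.append((lo, hi))
    return out

def violation_ranges(trees, n_features, domain, violates):
    """Return every (box, prediction) inside `domain` whose prediction violates the property."""
    regions = [(list(domain), 0.0)]
    for tree in trees:
        refined = []
        for box, total in regions:
            for leaf_box, value in leaf_boxes(tree, n_features):
                merged = intersect(box, leaf_box)
                if merged is not None:
                    refined.append((merged, total + value))  # assumed: output = sum of trees
        regions = refined
    return [(box, pred) for box, pred in regions if violates(pred)]

def make_input_filter(ranges):
    """Build a predicate that flags inputs lying in any violation range."""
    def is_unsafe(x):
        return any(all(lo < xi <= hi for xi, (lo, hi) in zip(x, box)) for box, _ in ranges)
    return is_unsafe

def predict(trees, x):
    """Evaluate the ensemble directly, for cross-checking the extracted ranges."""
    total = 0.0
    for node in trees:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if x[f] <= t else right
        total += node
    return total

# Hypothetical two-tree ensemble; x[0] = floor area, x[1] = number of rooms.
trees = [
    (0, 50.0, 100.0, 250.0),
    (1, 3.0, 20.0, (0, 80.0, 60.0, 150.0)),
]
domain = [(0.0, 100.0), (0.0, 10.0)]  # inputs covered by the property
ranges = violation_ranges(trees, 2, domain, lambda pred: pred > 350.0)
is_unsafe = make_input_filter(ranges)
print(ranges)
```

In this toy setting the only violation range is floor area in (80, 100] with more than 3 rooms, so `is_unsafe` would route exactly those inputs to separate handling. The enumeration grows with the number of leaf combinations, which is consistent with the abstract's remark that the number and depth of the trees govern applicability.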

Publication: IEICE TRANSACTIONS on Information Vol.E103-D No.2 pp.363-378

Publication Date: 2020/02/01

Publicized: 2019/11/20

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2019EDP7120

Type of Manuscript: PAPER

Category: Dependable Computing

Authors

Naoto SATO
  Hitachi, Ltd.
Hironobu KURUMA
  Hitachi, Ltd.
Yuichiroh NAKAGAWA
  Hitachi, Ltd.
Hideto OGAWA
  Hitachi, Ltd.

Keywords

machine learning, formal verification, decision-tree ensemble model

Cite this

Naoto SATO, Hironobu KURUMA, Yuichiroh NAKAGAWA, Hideto OGAWA, "Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 2, pp. 363-378, February 2020, doi: 10.1587/transinf.2019EDP7120.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7120/_p

@ARTICLE{e103-d_2_363,
author={Naoto SATO and Hironobu KURUMA and Yuichiroh NAKAGAWA and Hideto OGAWA},
journal={IEICE TRANSACTIONS on Information},
title={Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges},
year={2020},
volume={E103-D},
number={2},
pages={363-378},
abstract={A “decision-tree ensemble model” (DTEM) is a machine-learning model represented by a set of decision trees. DTEMs are known to work well mainly on structured data; however, as with other machine-learning models, it is difficult to train a DTEM so that it returns the correct output value (the “prediction value”) for every input value (the “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect, during development, the attribute values that lead to system malfunctions (failures) and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. Developing the input filter requires specifying the filtering condition, that is, the attribute values that lead to a system malfunction. To meet this need, we propose a method that formally verifies a DTEM and, if the verification finds an attribute value leading to a failure, extracts the range in which such attribute values exist. Because the proposed method comprehensively extracts the ranges of attribute values that lead to failures, an input filter built from those ranges can prevent the failures. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated the method's scalability and show that the number and depth of the decision trees are important factors that determine the applicability of the proposed method.},
keywords={machine learning, formal verification, decision-tree ensemble model},
doi={10.1587/transinf.2019EDP7120},
ISSN={1745-1361},
month={February}
}

TY - JOUR
TI - Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges
T2 - IEICE TRANSACTIONS on Information
SP - 363
EP - 378
AU - Naoto SATO
AU - Hironobu KURUMA
AU - Yuichiroh NAKAGAWA
AU - Hideto OGAWA
PY - 2020
DO - 10.1587/transinf.2019EDP7120
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2020
AB - A “decision-tree ensemble model” (DTEM) is a machine-learning model represented by a set of decision trees. DTEMs are known to work well mainly on structured data; however, as with other machine-learning models, it is difficult to train a DTEM so that it returns the correct output value (the “prediction value”) for every input value (the “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect, during development, the attribute values that lead to system malfunctions (failures) and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. Developing the input filter requires specifying the filtering condition, that is, the attribute values that lead to a system malfunction. To meet this need, we propose a method that formally verifies a DTEM and, if the verification finds an attribute value leading to a failure, extracts the range in which such attribute values exist. Because the proposed method comprehensively extracts the ranges of attribute values that lead to failures, an input filter built from those ranges can prevent the failures. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated the method's scalability and show that the number and depth of the decision trees are important factors that determine the applicability of the proposed method.
ER -
