Contamination of water resources with pathogenic microorganisms excreted in human feces is a worldwide public health concern. Surveillance of fecal contamination is commonly performed by routine monitoring for a single type or a few types of microorganism(s). To design a feasible routine for periodic monitoring and to control risks of exposure to pathogens, reliable statistical algorithms for inferring correlations between concentrations of microorganisms in water need to be established. Moreover, because pathogens are often present in low concentrations, some contaminations are likely to be under a detection limit. This yields a pairwise left-censored dataset and complicates computation of correlation coefficients. Errors of correlation estimation can be smaller if undetected values are imputed better. To obtain better imputations, we utilize side information and develop a new technique, the asymmetric Tobit model which is an extension of the Tobit model so that domain knowledge can be exploited effectively when fitting the model to a censored dataset. The empirical results demonstrate that imputation with domain knowledge is effective for this task.
HongYuan CAO
Gunma University
Tsuyoshi KATO
Gunma University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
HongYuan CAO, Tsuyoshi KATO, "Asymmetric Tobit Analysis for Correlation Estimation from Censored Data" in IEICE TRANSACTIONS on Information,
vol. E104-D, no. 10, pp. 1632-1639, October 2021, doi: 10.1587/transinf.2021EDP7022.
Abstract: Contamination of water resources with pathogenic microorganisms excreted in human feces is a worldwide public health concern. Surveillance of fecal contamination is commonly performed by routine monitoring for a single type or a few types of microorganism(s). To design a feasible routine for periodic monitoring and to control risks of exposure to pathogens, reliable statistical algorithms for inferring correlations between concentrations of microorganisms in water need to be established. Moreover, because pathogens are often present in low concentrations, some contaminations are likely to be under a detection limit. This yields a pairwise left-censored dataset and complicates computation of correlation coefficients. Errors of correlation estimation can be smaller if undetected values are imputed better. To obtain better imputations, we utilize side information and develop a new technique, the asymmetric Tobit model which is an extension of the Tobit model so that domain knowledge can be exploited effectively when fitting the model to a censored dataset. The empirical results demonstrate that imputation with domain knowledge is effective for this task.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7022/_p
Copy
@ARTICLE{e104-d_10_1632,
author={HongYuan CAO, Tsuyoshi KATO, },
journal={IEICE TRANSACTIONS on Information},
title={Asymmetric Tobit Analysis for Correlation Estimation from Censored Data},
year={2021},
volume={E104-D},
number={10},
pages={1632-1639},
abstract={Contamination of water resources with pathogenic microorganisms excreted in human feces is a worldwide public health concern. Surveillance of fecal contamination is commonly performed by routine monitoring for a single type or a few types of microorganism(s). To design a feasible routine for periodic monitoring and to control risks of exposure to pathogens, reliable statistical algorithms for inferring correlations between concentrations of microorganisms in water need to be established. Moreover, because pathogens are often present in low concentrations, some contaminations are likely to be under a detection limit. This yields a pairwise left-censored dataset and complicates computation of correlation coefficients. Errors of correlation estimation can be smaller if undetected values are imputed better. To obtain better imputations, we utilize side information and develop a new technique, the asymmetric Tobit model which is an extension of the Tobit model so that domain knowledge can be exploited effectively when fitting the model to a censored dataset. The empirical results demonstrate that imputation with domain knowledge is effective for this task.},
keywords={},
doi={10.1587/transinf.2021EDP7022},
ISSN={1745-1361},
month={October},}
Copy
TY - JOUR
TI - Asymmetric Tobit Analysis for Correlation Estimation from Censored Data
T2 - IEICE TRANSACTIONS on Information
SP - 1632
EP - 1639
AU - HongYuan CAO
AU - Tsuyoshi KATO
PY - 2021
DO - 10.1587/transinf.2021EDP7022
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2021
AB - Contamination of water resources with pathogenic microorganisms excreted in human feces is a worldwide public health concern. Surveillance of fecal contamination is commonly performed by routine monitoring for a single type or a few types of microorganism(s). To design a feasible routine for periodic monitoring and to control risks of exposure to pathogens, reliable statistical algorithms for inferring correlations between concentrations of microorganisms in water need to be established. Moreover, because pathogens are often present in low concentrations, some contaminations are likely to be under a detection limit. This yields a pairwise left-censored dataset and complicates computation of correlation coefficients. Errors of correlation estimation can be smaller if undetected values are imputed better. To obtain better imputations, we utilize side information and develop a new technique, the asymmetric Tobit model which is an extension of the Tobit model so that domain knowledge can be exploited effectively when fitting the model to a censored dataset. The empirical results demonstrate that imputation with domain knowledge is effective for this task.
ER -