PPDP (Privacy-Preserving Data Publishing) is a technology for disclosing personal information while protecting individual privacy. k-anonymity is a privacy model that PPDP should achieve. However, k-anonymity does not guarantee privacy against adversaries who have knowledge of even a few uncommon individuals in a population. In this paper, we propose a new model, called k-presence-secrecy, that prevents such adversaries from inferring whether an arbitrary individual is included in a personal data table. We also propose an algorithm that satisfies the model. k-presence-secrecy is a practical model because an algorithm that satisfies it requires only a PPDP target table as personal information, whereas previous models require a PPDP target table and almost all the background knowledge of adversaries. Our experiments show that an algorithm satisfying only k-anonymity cannot protect privacy even against adversaries who have knowledge of one uncommon individual in a population, whereas our algorithm can do so with less information loss and shorter execution time.
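As a point of reference for the baseline model named above, the following is a minimal sketch of a k-anonymity check only. It does not implement k-presence-secrecy, whose definition the abstract does not give. The function name `is_k_anonymous`, the column names, and the sample rows are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    rows: list of dicts (hypothetical table representation).
    quasi_identifiers: column names an adversary could link to outside data.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical generalized table: the third row is alone in its group.
table = [
    {"age": "30-39", "zip": "100**", "disease": "flu"},
    {"age": "30-39", "zip": "100**", "disease": "cold"},
    {"age": "40-49", "zip": "200**", "disease": "flu"},
]

print(is_k_anonymous(table, ["age", "zip"], 2))      # third row breaks 2-anonymity
print(is_k_anonymous(table[:2], ["age", "zip"], 2))  # first two rows share one group
```

The check looks only at the published table, which is why it says nothing about what an adversary who knows an uncommon individual in the wider population can infer.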
Hiroaki KIKUCHI Kouichi ITOH Mebae USHIDA Hiroshi TSUDA Yuji YAMAOKA
This paper studies a privacy-preserving decision tree learning protocol (PPDT) for vertically partitioned datasets. In existing studies of vertically partitioned datasets, a single class (target) attribute is either shared by both parties or carefully treated by one party. The proposed scheme allows both parties to hold independent class attributes securely and to combine multiple class attributes with an arbitrary Boolean function, which gives the parties some flexibility in data mining. Our proposed PPDT protocol reduces the CPU-intensive computation of logarithms by approximating them with a piecewise linear function defined by lightweight fundamental operations (addition and constant multiplication), so that the information gain of attributes can be evaluated in a secure function evaluation scheme. Using the UCI Machine Learning dataset and a synthesized dataset, the proposed protocol is evaluated in terms of its accuracy and the sizes of the resulting trees.
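To illustrate the idea of replacing a logarithm with a piecewise linear function, the sketch below interpolates log2 linearly between powers of two, so each evaluation costs one constant multiplication and one addition. This is a plaintext illustration under assumed breakpoints. The abstract does not specify the paper's actual segments or how they are evaluated inside the secure protocol, and the names `build_segments` and `approx_log2` are hypothetical.

```python
import math


def build_segments(max_x):
    """Precompute (start, slope, intercept) for segments [2^i, 2^(i+1)].

    Breakpoints at powers of two are an assumption made for this sketch.
    """
    if max_x < 2:
        raise ValueError("max_x must be at least 2")
    segments = []
    start, i = 1, 0
    while start < max_x:
        slope = 1.0 / start   # (log2(2s) - log2(s)) / (2s - s) = 1 / s
        intercept = i - 1.0   # chosen so that slope * start + intercept == i
        segments.append((start, slope, intercept))
        start *= 2
        i += 1
    return segments


def approx_log2(x, segments):
    """Approximate log2(x) for x >= 1 with one multiply and one add."""
    for start, slope, intercept in reversed(segments):
        if x >= start:
            return slope * x + intercept
    raise ValueError("x must be >= 1")


segments = build_segments(16)
for x in (1, 3, 8, 12):
    print(x, approx_log2(x, segments), math.log2(x))
```

The approximation is exact at the breakpoints and slightly underestimates log2 in between. In a secure setting, the per-segment constants would be public while x stays hidden, which is what makes addition and constant multiplication sufficient.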
Hiroaki KIKUCHI Takayasu YAMAGUCHI Koki HAMADA Yuji YAMAOKA Hidenobu OGURI Jun SAKUMA
Data anonymization is required before a big-data business can run effectively without compromising the privacy of the personal information it uses. It is not trivial to choose the best algorithm to anonymize given data securely for a given purpose. Accurately assessing the risk of data being compromised requires a balance between utility and security. Therefore, using common pseudo microdata, we propose a competition for the best anonymization and re-identification algorithms. This paper reports the results of the competition and an analysis of the effectiveness of the anonymization techniques. The competition results reveal that there is a tradeoff between utility and security, and that 20.9% of records were re-identified on average.