
Keyword Search Result

[Keyword] data sanitization (2 hits)

  • Single-Letter Characterizations for Information Erasure under Restriction on the Output Distribution

    Naruaki AMADA  Hideki YAGI  

     
    PAPER-Information Theory

      Publicized:
    2020/11/09
      Vol:
    E104-A No:5
      Page(s):
    805-813

    In order to erase data including confidential information stored in storage devices, an unrelated and random sequence is usually overwritten, which prevents the data from being restored. The problem of minimizing the cost for information erasure when the amount of information leakage of the confidential information should be less than or equal to a constant asymptotically has been introduced by T. Matsuta and T. Uyematsu. Whereas the minimum cost for overwriting has been given for general sources, a single-letter characterization for stationary memoryless sources is not easily derived. In this paper, we give single-letter characterizations for stationary memoryless sources under two types of restrictions: one requires the output distribution of the encoder to be independent and identically distributed (i.i.d.) and the other requires it to be memoryless but not necessarily i.i.d. asymptotically. The characterizations indicate the relation among the amount of information leakage, the minimum cost for information erasure and the rate of the size of uniformly distributed sequences. The obtained results show that the minimum costs are different between these restrictions.

  • Manage the Tradeoff in Data Sanitization

    Peng CHENG  Chun-Wei LIN  Jeng-Shyang PAN  Ivan LEE  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2015/07/14
      Vol:
    E98-D No:10
      Page(s):
    1856-1860

    Sharing data might bring the risk of disclosing the sensitive knowledge in it. Usually, the data owner may choose to sanitize data by modifying some items in it to hide sensitive knowledge prior to sharing. This paper focuses on protecting sensitive knowledge in the form of frequent itemsets by data sanitization. The sanitization process may result in side effects, i.e., the data distortion and the damage to the non-sensitive frequent itemsets. How to minimize these side effects is a challenging problem faced by the research community. Actually, there is a trade-off when trying to minimize both side effects simultaneously. In view of this, we propose a data sanitization method based on evolutionary multi-objective optimization (EMO). This method can hide specified sensitive itemsets completely while minimizing the accompanying side effects. Experiments on real datasets show that the proposed approach is very effective in performing the hiding task with less damage to the original data and non-sensitive knowledge.
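
The two side effects named in the abstract above (data distortion and damage to non-sensitive frequent itemsets) can be illustrated with a small sketch. This is not the EMO method proposed in the letter: it uses a naive greedy deletion rule as a stand-in, and every name in it (`frequent_itemsets`, `sanitize`, `side_effects`, the toy transactions, the support threshold of 3) is a hypothetical choice made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count >= min_support.

    Brute-force enumeration; only suitable for tiny toy data.
    """
    counts = {}
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


def sanitize(transactions, sensitive, min_support):
    """Naive greedy hiding (an assumption, not the paper's algorithm).

    For each sensitive itemset, delete one of its items from supporting
    transactions until its support drops below min_support.
    """
    result = [set(t) for t in transactions]
    for s in sensitive:
        victim = sorted(s)[0]  # arbitrary choice of which item to delete
        support = sum(1 for t in result if s <= t)
        for t in result:
            if support < min_support:
                break
            if s <= t:
                t.remove(victim)
                support -= 1
    return result


def side_effects(original, sanitized, sensitive, min_support):
    """Measure distortion (items deleted) and lost non-sensitive frequent itemsets."""
    before = frequent_itemsets(original, min_support)
    after = frequent_itemsets(sanitized, min_support)
    distortion = sum(len(o) - len(s) for o, s in zip(original, sanitized))
    non_sensitive = set(before) - set(sensitive)
    lost = non_sensitive - set(after)
    hidden = all(s not in after for s in sensitive)
    return distortion, lost, hidden


# Toy data: hide the itemset {a, b} at a support threshold of 3.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
sensitive = [frozenset({"a", "b"})]
sanitized = sanitize(transactions, sensitive, min_support=3)
distortion, lost, hidden = side_effects(transactions, sanitized, sensitive, min_support=3)
print(distortion, lost, hidden)
```

On this toy data, deleting a single item hides {a, b} but also pushes the non-sensitive itemset {a, c} below the threshold. Deleting "b" instead of "a" would cost {b, c} rather than {a, c}, so the deletion choice determines which side effects occur. That is the kind of trade-off a multi-objective search is meant to explore.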