IEICE global.ieice.org Site

Author Search Result

[Author] Yiqun DING(2hit)


Finding Frequent Closed Itemsets in Sliding Window in Linear Time
Junbo CHEN Bo ZHOU Lu CHEN Xinyu WANG Yiqun DING

PAPER-Data Mining

Vol: E91-D No: 10
Page(s): 2406-2418
One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many other algorithms have been proposed to find frequent itemsets. Among them, the approach of mining closed itemsets has attracted much interest in the data mining community. Algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], and TGC-Tree [16]. Of these, FCI-Stream, GC-Tree, and TGC-Tree are online algorithms that work in sliding-window environments. According to the performance evaluation in [16], GC-Tree [15] is the fastest. This paper proposes an improved algorithm based on GC-Tree, whose computational complexity is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.
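
To make the notion of closed frequent itemsets in a sliding window concrete, the following is a minimal brute-force sketch. It is not the GC-Tree algorithm or the improved algorithm from the paper, and it is exponential in the number of distinct items. The function names, the window size, and the support threshold are illustrative assumptions. An itemset is treated as closed when no proper superset has the same support.

```python
from collections import deque
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (illustrative only, not the paper's method)."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                frequent[candidate] = s
    # Closed: no proper superset has the same support. Checking frequent
    # supersets suffices, because a superset with equal support is frequent too.
    return {
        x: s
        for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }


def mine_sliding_window(stream, window_size, min_support):
    """Yield the closed frequent itemsets of the window after each arrival."""
    window = deque(maxlen=window_size)  # the oldest transaction drops out
    for transaction in stream:
        window.append(frozenset(transaction))
        yield closed_frequent_itemsets(list(window), min_support)


# Hypothetical toy stream; item names are arbitrary.
stream = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"a", "b", "d"}]
*_, latest = mine_sliding_window(stream, window_size=4, min_support=2)
# latest == {frozenset({"a", "b"}): 3, frozenset({"c"}): 2}
```

This sketch recomputes everything from scratch on each arrival, which is the cost that online algorithms for sliding windows aim to avoid by updating their results incrementally.
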
Mining Noise-Tolerant Frequent Closed Itemsets in Very Large Database
Junbo CHEN Bo ZHOU Xinyu WANG Yiqun DING Lu CHEN

PAPER-Data Mining

Vol: E92-D No: 8
Page(s): 1523-1533
Frequent Itemset (FI) mining is a popular and important first step in analyzing datasets across a broad range of applications. The traditional approach to finding frequent itemsets has two main problems. First, it often derives an undesirably large set of frequent itemsets and association rules. Second, it is vulnerable to noise. Two approaches have been proposed to address these problems individually. The first problem is addressed by Frequent Closed Itemsets (FCI), which removes all redundant information from the result while ensuring that no information is lost. The second problem is addressed by Approximate Frequent Itemsets (AFI), which can identify and fix noise in the datasets. Each of these two concepts has its own limitations; however, the authors find that when FCI and AFI are combined, they help each other overcome those limitations and amplify their advantages. The new integrated approach is termed Noise-tolerant Frequent Closed Itemset (NFCI). The experimental results demonstrate the advantages of the new approach: (1) it is noise tolerant; (2) the number of itemsets generated is dramatically reduced, with almost no information loss except for the noise and the infrequent patterns; (3) hence, it is both time and space efficient; (4) the result contains no redundant information.
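
The claim that closed itemsets remove redundancy without losing information can be illustrated with a small sketch. It covers only the standard FCI property (the support of any frequent itemset equals the largest support among its closed supersets) and does not implement AFI or the NFCI approach. All function names and the toy data are illustrative assumptions.

```python
from itertools import combinations


def frequent_with_support(transactions, min_support):
    """Map every frequent itemset to its support (brute force, toy sizes only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = sum(1 for t in transactions if candidate <= t)
            if s >= min_support:
                result[candidate] = s
    return result


def keep_closed(frequent):
    """Keep only itemsets with no proper superset of equal support."""
    return {
        x: s
        for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }


def support_from_closed(itemset, closed):
    """Recover a frequent itemset's support from the closed itemsets alone.

    Returns None when no closed superset exists, i.e. the itemset is infrequent.
    """
    candidates = [s for c, s in closed.items() if itemset <= c]
    return max(candidates) if candidates else None


# Hypothetical toy database; item names are arbitrary.
transactions = [frozenset(t) for t in ({"a", "b"}, {"a", "b", "c"}, {"c"}, {"a", "b", "d"})]
frequent = frequent_with_support(transactions, min_support=2)
closed = keep_closed(frequent)

# Four frequent itemsets shrink to two closed ones, yet every support is recoverable.
lossless = all(support_from_closed(x, closed) == s for x, s in frequent.items())
```

Here `frequent` holds four itemsets and `closed` holds two, and `lossless` evaluates to `True` for this data.
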

