The search functionality is under construction.

The search functionality is under construction.

One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining *closed itemsets* has raised much interest in data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16] etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state of art algorithm, GC-Tree.

- Publication
- IEICE TRANSACTIONS on Information Vol.E91-D No.10 pp.2406-2418

- Publication Date
- 2008/10/01

- Publicized

- Online ISSN
- 1745-1361

- DOI
- 10.1093/ietisy/e91-d.10.2406

- Type of Manuscript
- PAPER

- Category
- Data Mining

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.

Copy

Junbo CHEN, Bo ZHOU, Lu CHEN, Xinyu WANG, Yiqun DING, "Finding Frequent Closed Itemsets in Sliding Window in Linear Time" in IEICE TRANSACTIONS on Information,
vol. E91-D, no. 10, pp. 2406-2418, October 2008, doi: 10.1093/ietisy/e91-d.10.2406.

Abstract: One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining *closed itemsets* has raised much interest in data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16] etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state of art algorithm, GC-Tree.

URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.10.2406/_p

Copy

@ARTICLE{e91-d_10_2406,

author={Junbo CHEN, Bo ZHOU, Lu CHEN, Xinyu WANG, Yiqun DING, },

journal={IEICE TRANSACTIONS on Information},

title={Finding Frequent Closed Itemsets in Sliding Window in Linear Time},

year={2008},

volume={E91-D},

number={10},

pages={2406-2418},

abstract={One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining *closed itemsets* has raised much interest in data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16] etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state of art algorithm, GC-Tree.},

keywords={},

doi={10.1093/ietisy/e91-d.10.2406},

ISSN={1745-1361},

month={October},}

Copy

TY - JOUR

TI - Finding Frequent Closed Itemsets in Sliding Window in Linear Time

T2 - IEICE TRANSACTIONS on Information

SP - 2406

EP - 2418

AU - Junbo CHEN

AU - Bo ZHOU

AU - Lu CHEN

AU - Xinyu WANG

AU - Yiqun DING

PY - 2008

DO - 10.1093/ietisy/e91-d.10.2406

JO - IEICE TRANSACTIONS on Information

SN - 1745-1361

VL - E91-D

IS - 10

JA - IEICE TRANSACTIONS on Information

Y1 - October 2008

AB - One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining *closed itemsets* has raised much interest in data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [5], TGC-Tree [16] etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state of art algorithm, GC-Tree.

ER -