The Bloom Filter (BF), a space-and-time-efficient hash-coding method, is used as one of the fundamental modules in several network processing algorithms and applications such as route lookups, cache hits, packet classification, per-flow state management or network monitoring. BF is a simple space-efficient randomized data structure used to represent a data set in order to support membership queries. However, BF generates false positives, and cannot count the number of distinct elements. A counting Bloom Filter (CBF) can count the number of distinct elements, but CBF needs more space than BF. We propose an alternative data structure of CBF, and we called this structure an Adaptive Bloom Filter (ABF). Although ABF uses the same-sized bit-vector used in BF, the number of hash functions employed by ABF is dynamically changed to record the number of appearances of a each key element. Considering the hash collisions, the multiplicity of a each key element on ABF can be estimated from the number of hash functions used to decode the membership of the each key element. Although ABF can realize the same functionality as CBF, ABF requires the same memory size as BF. We describe the construction of ABF and IABF (Improved ABF), and provide a mathematical analysis and simulation using Zipf's distribution. Finally, we show that ABF can be used for an unpredictable data set such as real network traffic.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Yoshihide MATSUMOTO, Hiroaki HAZEYAMA, Youki KADOBAYASHI, "Adaptive Bloom Filter: A Space-Efficient Counting Algorithm for Unpredictable Network Traffic" in IEICE TRANSACTIONS on Information,
vol. E91-D, no. 5, pp. 1292-1299, May 2008, doi: 10.1093/ietisy/e91-d.5.1292.
Abstract: The Bloom Filter (BF), a space-and-time-efficient hash-coding method, is used as one of the fundamental modules in several network processing algorithms and applications such as route lookups, cache hits, packet classification, per-flow state management or network monitoring. BF is a simple space-efficient randomized data structure used to represent a data set in order to support membership queries. However, BF generates false positives, and cannot count the number of distinct elements. A counting Bloom Filter (CBF) can count the number of distinct elements, but CBF needs more space than BF. We propose an alternative data structure of CBF, and we called this structure an Adaptive Bloom Filter (ABF). Although ABF uses the same-sized bit-vector used in BF, the number of hash functions employed by ABF is dynamically changed to record the number of appearances of a each key element. Considering the hash collisions, the multiplicity of a each key element on ABF can be estimated from the number of hash functions used to decode the membership of the each key element. Although ABF can realize the same functionality as CBF, ABF requires the same memory size as BF. We describe the construction of ABF and IABF (Improved ABF), and provide a mathematical analysis and simulation using Zipf's distribution. Finally, we show that ABF can be used for an unpredictable data set such as real network traffic.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.5.1292/_p
Copy
@ARTICLE{e91-d_5_1292,
author={Yoshihide MATSUMOTO, Hiroaki HAZEYAMA, Youki KADOBAYASHI, },
journal={IEICE TRANSACTIONS on Information},
title={Adaptive Bloom Filter: A Space-Efficient Counting Algorithm for Unpredictable Network Traffic},
year={2008},
volume={E91-D},
number={5},
pages={1292-1299},
abstract={The Bloom Filter (BF), a space-and-time-efficient hash-coding method, is used as one of the fundamental modules in several network processing algorithms and applications such as route lookups, cache hits, packet classification, per-flow state management or network monitoring. BF is a simple space-efficient randomized data structure used to represent a data set in order to support membership queries. However, BF generates false positives, and cannot count the number of distinct elements. A counting Bloom Filter (CBF) can count the number of distinct elements, but CBF needs more space than BF. We propose an alternative data structure of CBF, and we called this structure an Adaptive Bloom Filter (ABF). Although ABF uses the same-sized bit-vector used in BF, the number of hash functions employed by ABF is dynamically changed to record the number of appearances of a each key element. Considering the hash collisions, the multiplicity of a each key element on ABF can be estimated from the number of hash functions used to decode the membership of the each key element. Although ABF can realize the same functionality as CBF, ABF requires the same memory size as BF. We describe the construction of ABF and IABF (Improved ABF), and provide a mathematical analysis and simulation using Zipf's distribution. Finally, we show that ABF can be used for an unpredictable data set such as real network traffic.},
keywords={},
doi={10.1093/ietisy/e91-d.5.1292},
ISSN={1745-1361},
month={May},}
Copy
TY - JOUR
TI - Adaptive Bloom Filter: A Space-Efficient Counting Algorithm for Unpredictable Network Traffic
T2 - IEICE TRANSACTIONS on Information
SP - 1292
EP - 1299
AU - Yoshihide MATSUMOTO
AU - Hiroaki HAZEYAMA
AU - Youki KADOBAYASHI
PY - 2008
DO - 10.1093/ietisy/e91-d.5.1292
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2008
AB - The Bloom Filter (BF), a space-and-time-efficient hash-coding method, is used as one of the fundamental modules in several network processing algorithms and applications such as route lookups, cache hits, packet classification, per-flow state management or network monitoring. BF is a simple space-efficient randomized data structure used to represent a data set in order to support membership queries. However, BF generates false positives, and cannot count the number of distinct elements. A counting Bloom Filter (CBF) can count the number of distinct elements, but CBF needs more space than BF. We propose an alternative data structure of CBF, and we called this structure an Adaptive Bloom Filter (ABF). Although ABF uses the same-sized bit-vector used in BF, the number of hash functions employed by ABF is dynamically changed to record the number of appearances of a each key element. Considering the hash collisions, the multiplicity of a each key element on ABF can be estimated from the number of hash functions used to decode the membership of the each key element. Although ABF can realize the same functionality as CBF, ABF requires the same memory size as BF. We describe the construction of ABF and IABF (Improved ABF), and provide a mathematical analysis and simulation using Zipf's distribution. Finally, we show that ABF can be used for an unpredictable data set such as real network traffic.
ER -