Filtering out uninteresting data is essential for making use of “big data”. The skyline query is a popular technique for this purpose: it selects the set of objects in a given large database that are not dominated by any other object. However, a skyline query often retrieves too many objects to analyze closely, especially for high-dimensional datasets. To solve this problem, k-dominant skyline queries have been introduced. Databases sometimes become too large to process in a centralized environment, and conventional algorithms for computing k-dominant skyline queries are not well suited to parallel and distributed environments such as the MapReduce framework. In this paper, we consider an efficient parallel algorithm for processing k-dominant skyline queries in the MapReduce framework. Extensive experiments demonstrate the scalability of the proposed algorithm on synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.
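To make the notion of dominance in the abstract concrete, the following is a minimal, centralized sketch of a k-dominant skyline computation. It is not the paper's MapReduce algorithm, which the abstract does not describe. It assumes the commonly used definition (a point p k-dominates q if p is no worse than q in at least k dimensions and strictly better in at least one of them) and assumes that smaller values are better. The names `k_dominates` and `k_dominant_skyline` and the sample points are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller values are better)."""
    # Number of dimensions in which p is at least as good as q.
    not_worse = sum(1 for a, b in zip(p, q) if a <= b)
    # Any strictly better dimension is also a "not worse" one, so it can
    # always be included among the k chosen dimensions.
    strictly_better = any(a < b for a, b in zip(p, q))
    return not_worse >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Naive O(n^2) scan: keep every point that no other point k-dominates.

    k-dominance is not transitive, so each point is checked against all
    other points rather than only against the current candidates.
    """
    return [
        q
        for i, q in enumerate(points)
        if not any(k_dominates(p, q, k) for j, p in enumerate(points) if j != i)
    ]


if __name__ == "__main__":
    # Hypothetical 3-dimensional sample data.
    points = [(1, 1, 5), (2, 2, 2), (3, 3, 1), (4, 4, 4)]
    print(k_dominant_skyline(points, 3))  # conventional skyline (k = d)
    print(k_dominant_skyline(points, 2))  # stricter 2-dominant skyline
```

With k equal to the number of dimensions, this reduces to the conventional skyline. Lowering k makes dominance easier to establish, so the result set shrinks, which is the motivation given in the abstract.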
Md. Anisuzzaman SIDDIQUE
Hiroshima University
Hao TIAN
Hiroshima University
Yasuhiko MORIMOTO
Hiroshima University
Md. Anisuzzaman SIDDIQUE, Hao TIAN, Yasuhiko MORIMOTO, "k-Dominant Skyline Query Computation in MapReduce Environment" in IEICE TRANSACTIONS on Information,
vol. E98-D, no. 5, pp. 1027-1034, May 2015, doi: 10.1587/transinf.2014DAP0010.
Abstract: Filtering out uninteresting data is essential for making use of “big data”. The skyline query is a popular technique for this purpose: it selects the set of objects in a given large database that are not dominated by any other object. However, a skyline query often retrieves too many objects to analyze closely, especially for high-dimensional datasets. To solve this problem, k-dominant skyline queries have been introduced. Databases sometimes become too large to process in a centralized environment, and conventional algorithms for computing k-dominant skyline queries are not well suited to parallel and distributed environments such as the MapReduce framework. In this paper, we consider an efficient parallel algorithm for processing k-dominant skyline queries in the MapReduce framework. Extensive experiments demonstrate the scalability of the proposed algorithm on synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2014DAP0010/_p
@ARTICLE{e98-d_5_1027,
author={Md. Anisuzzaman SIDDIQUE and Hao TIAN and Yasuhiko MORIMOTO},
journal={IEICE TRANSACTIONS on Information},
title={k-Dominant Skyline Query Computation in MapReduce Environment},
year={2015},
volume={E98-D},
number={5},
pages={1027-1034},
abstract={Filtering out uninteresting data is essential for making use of “big data”. The skyline query is a popular technique for this purpose: it selects the set of objects in a given large database that are not dominated by any other object. However, a skyline query often retrieves too many objects to analyze closely, especially for high-dimensional datasets. To solve this problem, k-dominant skyline queries have been introduced. Databases sometimes become too large to process in a centralized environment, and conventional algorithms for computing k-dominant skyline queries are not well suited to parallel and distributed environments such as the MapReduce framework. In this paper, we consider an efficient parallel algorithm for processing k-dominant skyline queries in the MapReduce framework. Extensive experiments demonstrate the scalability of the proposed algorithm on synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.},
keywords={},
doi={10.1587/transinf.2014DAP0010},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - k-Dominant Skyline Query Computation in MapReduce Environment
T2 - IEICE TRANSACTIONS on Information
SP - 1027
EP - 1034
AU - Md. Anisuzzaman SIDDIQUE
AU - Hao TIAN
AU - Yasuhiko MORIMOTO
PY - 2015
DO - 10.1587/transinf.2014DAP0010
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2015
AB - Filtering out uninteresting data is essential for making use of “big data”. The skyline query is a popular technique for this purpose: it selects the set of objects in a given large database that are not dominated by any other object. However, a skyline query often retrieves too many objects to analyze closely, especially for high-dimensional datasets. To solve this problem, k-dominant skyline queries have been introduced. Databases sometimes become too large to process in a centralized environment, and conventional algorithms for computing k-dominant skyline queries are not well suited to parallel and distributed environments such as the MapReduce framework. In this paper, we consider an efficient parallel algorithm for processing k-dominant skyline queries in the MapReduce framework. Extensive experiments demonstrate the scalability of the proposed algorithm on synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.
ER -