k-Dominant Skyline Query Computation in MapReduce Environment

Md. Anisuzzaman SIDDIQUE; Hao TIAN; Yasuhiko MORIMOTO

doi:10.1587/transinf.2014DAP0010

Summary:

Filtering out uninteresting data is important for making use of “big data”. The skyline query is a popular technique for this purpose: it selects the set of objects in a large database that are not dominated by any other object. However, a skyline query often retrieves too many objects to analyze intensively, especially for high-dimensional datasets. To solve this problem, k-dominant skyline queries have been introduced. Databases sometimes become too large to process in a centralized environment, and conventional algorithms for computing k-dominant skyline queries are not well suited to parallel and distributed environments such as the MapReduce framework. In this paper, we consider an efficient parallel algorithm for processing k-dominant skyline queries in the MapReduce framework. Extensive experiments demonstrate the scalability of the proposed algorithm on synthetic big datasets under different settings of data distribution, dimensionality, and cardinality.
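The summary does not spell out what k-dominance means, so the following is a minimal Python sketch of the notion as it is commonly defined in the k-dominant skyline literature: an object p k-dominates q if p is no worse than q in at least k dimensions and strictly better in at least one of them. The function names `k_dominates` and `k_dominant_skyline`, and the convention that smaller values are better, are assumptions made for illustration. This is a naive centralized check, not the paper's MapReduce algorithm.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller values are better).

    p k-dominates q when p is no worse than q in at least k dimensions
    and strictly better than q in at least one dimension.
    """
    better_or_equal = sum(1 for a, b in zip(p, q) if a <= b)
    strictly_better = any(a < b for a, b in zip(p, q))
    return better_or_equal >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Naive O(n^2) scan: keep every point that no other point k-dominates."""
    return [
        q for q in points
        if not any(k_dominates(p, q, k) for p in points)
    ]
```

With k equal to the number of dimensions, this reduces to the conventional skyline. Lowering k makes dominance easier to achieve, so the result can only shrink, which is how k-dominant skyline queries address the oversized results mentioned in the summary.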

Publication: IEICE TRANSACTIONS on Information Vol.E98-D No.5 pp.1027-1034

Publication Date: 2015/05/01

Publicized: 2015/01/21

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2014DAP0010

Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)

Authors

Md. Anisuzzaman SIDDIQUE
  Hiroshima University
Hao TIAN
  Hiroshima University
Yasuhiko MORIMOTO
  Hiroshima University

Keywords

skyline query, k-dominant skyline query, MapReduce, big data

Cite this

Md. Anisuzzaman SIDDIQUE, Hao TIAN, Yasuhiko MORIMOTO, "k-Dominant Skyline Query Computation in MapReduce Environment" in IEICE TRANSACTIONS on Information, vol. E98-D, no. 5, pp. 1027-1034, May 2015, doi: 10.1587/transinf.2014DAP0010.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2014DAP0010/_p

@ARTICLE{e98-d_5_1027,
author={Md. Anisuzzaman SIDDIQUE and Hao TIAN and Yasuhiko MORIMOTO},
journal={IEICE TRANSACTIONS on Information},
title={k-Dominant Skyline Query Computation in MapReduce Environment},
year={2015},
volume={E98-D},
number={5},
pages={1027-1034},
keywords={skyline query, k-dominant skyline query, MapReduce, big data},
doi={10.1587/transinf.2014DAP0010},
ISSN={1745-1361},
month={May}
}

TY - JOUR
TI - k-Dominant Skyline Query Computation in MapReduce Environment
T2 - IEICE TRANSACTIONS on Information
SP - 1027
EP - 1034
AU - Md. Anisuzzaman SIDDIQUE
AU - Hao TIAN
AU - Yasuhiko MORIMOTO
PY - 2015
DO - 10.1587/transinf.2014DAP0010
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2015
ER -