Research on Reverse Skyline Query Algorithm Based on Decision Set

Research on Reverse Skyline Query Algorithm Based on Decision Set

Lan Huang, Yuanwei Zhao, Pedro Mestre, Laipeng Han, Kangping Wang, Wenjuan Gao, Rui Zhang
Copyright: © 2022 |Pages: 28
DOI: 10.4018/JDM.313971
OnDemand:
(Individual Articles)
Available
$37.50
No Current Special Offers
TOTAL SAVINGS: $37.50

Abstract

Reverse skyline query is an extension of the classical skyline query, widely used in the decision support in e-business. The vast burst of big data in e-business challenges the classical algorithms for such queries. This paper provides a novel definition of decision set and a decision set based reverse skyline query method called DRS on the double-layer R tree indexing in a map-reduce manner. Theoretical proofs are provided for the correctness and complexity of the DRS algorithm. Experiments made using several large data sets are presented and analyzed to illustrate the applicability and the outperformance of DRS over the state-of-the-art reverse skyline query methods.
Article Preview
Top

Introduction

Skyline (Borzsony, Kossmann, & Stocker, 2001) algorithm has been widely concerned since it was proposed, and it has been widely used in the field of recommendation strategy. Due to the limitation that skyline algorithm can't set query points individually, its variant dynamic skyline query emerges as the times require. However, the main function of skyline and dynamic skyline(Zhou, Qin, Zhang, & Jian-Qiu, 2019) is to find the target point according to the query point. For example, when a diner wants to find the most favorite restaurant to eat, he can find the most qualified restaurant by using his demands as the query point. But what if the query is not from a diner but a restaurant? How do restaurant owners find potential diners? Reverse skyline algorithm is proposed to solve this problem. Reverse skyline algorithm takes the information attribute of the businesses as the query point and the potential user as the output point. This can help businesses to find potential customers, so as to target advertising and attract target customers.

With the rapid growth of data in recent years, the traditional reverse skyline query algorithm based on single machine(Rajba, 2021) processing mode is no longer practical. Therefore, it is vital to improve and apply the Reverse Skyline query algorithm in a distributed environment (Gaj et al., 2013). The MapReduce(Dean & Ghemawat, 2008) design pattern effectively implements parallel processing of big data under distributed environment by using the idea of partition(Ning, Sun, Zhao, Xing, & Li, 2020). However, MapReduce was designed to store the data without moving(Ramanathan & Latha, 2019), as well as the subsequent data manipulation. Each query will inevitably result in the overall traversal of the data. Quantities of IO transmission in the shuffle process will consequently ruin the query efficiency. Therefore, how to take advantage of MapReduce and avoid traversal is a matter of high importance to design and implement an efficient, stable, universal, and well-suited reverse skyline query algorithm for distributed environments in the contemporary big data environment.

Most of the previous skyline query algorithms based on the pruning strategy(Krishnamoorthy, 2015) are difficult to deal with the situation of a large number of data(Islam, Rahayu, Liu, Anwar, & Stantic, 2017). This situation is worse in reverse skyline query, where the amount of calculation is generally greater than skyline query. This paper defines the filter set to rule out a large portion of the data points and cut the cost of subsequent operations. The window query(Jing, Xin, & Liu, 2005) is a necessary and sufficient condition to judge whether a data point is the reverse skyline point of the query point. This concept is widely used in the reverse skyline query algorithm. Create a window for data points and query points. If there are no other data points in the window, the point becomes the reverse skyline point of the query point. The decision set is defined to reduce the cost of window queries on the remaining points from the filter set. In turn it cuts the consumption of resources to traverse on the remaining points.

Based on the existing research in the industry(Kim & Oh, 2015), the authors make an optimization on the Reverse Skyline query algorithm aiming at efficiency, real-time, reusability and combination with index structure. The main contribution is as follows:

Complete Article List

Search this Journal:
Reset
Volume 35: 1 Issue (2024)
Volume 34: 3 Issues (2023)
Volume 33: 5 Issues (2022): 4 Released, 1 Forthcoming
Volume 32: 4 Issues (2021)
Volume 31: 4 Issues (2020)
Volume 30: 4 Issues (2019)
Volume 29: 4 Issues (2018)
Volume 28: 4 Issues (2017)
Volume 27: 4 Issues (2016)
Volume 26: 4 Issues (2015)
Volume 25: 4 Issues (2014)
Volume 24: 4 Issues (2013)
Volume 23: 4 Issues (2012)
Volume 22: 4 Issues (2011)
Volume 21: 4 Issues (2010)
Volume 20: 4 Issues (2009)
Volume 19: 4 Issues (2008)
Volume 18: 4 Issues (2007)
Volume 17: 4 Issues (2006)
Volume 16: 4 Issues (2005)
Volume 15: 4 Issues (2004)
Volume 14: 4 Issues (2003)
Volume 13: 4 Issues (2002)
Volume 12: 4 Issues (2001)
Volume 11: 4 Issues (2000)
Volume 10: 4 Issues (1999)
Volume 9: 4 Issues (1998)
Volume 8: 4 Issues (1997)
Volume 7: 4 Issues (1996)
Volume 6: 4 Issues (1995)
Volume 5: 4 Issues (1994)
Volume 4: 4 Issues (1993)
Volume 3: 4 Issues (1992)
Volume 2: 4 Issues (1991)
Volume 1: 2 Issues (1990)
View Complete Journal Contents Listing