秦小麟
Personal Homepage
Paper Publications
A Dynamic Top-k Query Based on the Improved Grid Multi-Dimensional Index TTI
Hits:

Affiliation of Author(s):计算机科学与技术学院/人工智能学院/软件学院

Journal:Jisuanji Xuebao

Abstract:Efficient processing of Top-k queries is a crucial requirement in many dynamic environments that involve massive amounts of data. Observed in many real scenario, a Top-k query often consists of two components to reflect a user's preference: selection condition and ranking function. Users can set their own sort function, simultaneously can choose to query different interesting subsets of the data. For traditional database, scholars have studied in the Top-k algorithm. However, previous works do not adapt to the situation when object attribute values is dynamic or the data amount is great. To address the problem of dynamic Top-k query, we propose a Trunk Tree Index (TTI) structure based on grid index. To remedy the lack ability of dealing these data, we describe a novel grid index method and give the GID-Top-k algorithm. The TTI index is determined according to the dominating relation between grids, and the dominating relation is further divided into father and son relation, brother relation and so on. TTI index can acquire the position of the grid subscript when there exists insert, delete and update. Furthermore, the process of pruning in Grid Index based Dynamic Top-k can effectively computing the k dominated capability through the summary of the grid in the Trunk tree Index, and minimizing the number of k dominated capability to be counted by looking for determinant units. Divide the influence area and free area according to the division of the region. Finally, pruning the data set, and if the data changes in the free area, there is no need to recalculate Top-k result. It can reduce the probability of recalculation when data changes frequently. The building blocks and calculation modules of the multi-dimensional TTI index are given, the grid dominance relationships, the relationship between brothers and sons are clarified from the two-dimensional to the multi-dimensional case with TTI. The pruning rules and calculation of free area and influence are related with the operation of searching grid, which avoid the stochastic search in grid using linked list. The cost of building and maintaining of the index is also analyzed with query efficiency. The experiment chooses different datasets to test, which compared with the baseline algorithm Top-k, RankCube and CIA algorithm separately. Because grid index is different from general index, the n dimensions on query space is divided into such as large area of the grid, under the condition of the same configuration parameters in other experiments, the distribution of data in each grid density on the algorithm performance is going to have an impaction of the presented algorithms. Experimental results verify the effectiveness of the proposed algorithm. Experimental results verify the effectiveness of the algorithm, the experimental data show that in the static case, the query efficiency of the algorithm in this paper up to eight times faster than the traditional Top-k algorithm, dynamic cases up to 10 times faster than the existing comparison objects. We will develop the improved Top-k queries to be an approach for big data analysis, and multi-criteria retrieval applications. © 2019, Science Press. All right reserved.

ISSN No.:0254-4164

Translation or Not:no

Date of Publication:2019-08-01

Co-author:Deng, Dan-Ping,bohan li,Zheng, Wei,zz,Liu Liang,lxf

Correspondence Author:qxz,bohan li

Personal information

Professor

Gender:Male

Alma Mater:南京航空学院

Education Level:Graduate with a professional diploma

Degree:Master's Degree in Engineering

School/Department:College of Computer Science and Technology

Click:

Open time:..

The Last Update Time:..


Copyright©2018- Nanjing University of Aeronautics and Astronautics·Informationization Department(Informationization Technology Center)

MOBILE Version