南京航空航天大学主页平台管理系统秦小麟--中文主页-- A Dynamic Top-k Query Based on the Improved Grid Multi-Dimensional Index TTI

秦小麟 ( 教授 ) 赞
的个人主页 http://faculty.nuaa.edu.cn/qxz/zh_CN/index.htm
教授

论文成果当前位置: 中文主页 >> 科学研究 >> 论文成果

A Dynamic Top-k Query Based on the Improved Grid Multi-Dimensional Index TTI

点击次数：
所属单位：计算机科学与技术学院/人工智能学院/软件学院
发表刊物：Jisuanji Xuebao
摘要：Efficient processing of Top-k queries is a crucial requirement in many dynamic environments that involve massive amounts of data. Observed in many real scenario, a Top-k query often consists of two components to reflect a user's preference: selection condition and ranking function. Users can set their own sort function, simultaneously can choose to query different interesting subsets of the data. For traditional database, scholars have studied in the Top-k algorithm. However, previous works do not adapt to the situation when object attribute values is dynamic or the data amount is great. To address the problem of dynamic Top-k query, we propose a Trunk Tree Index (TTI) structure based on grid index. To remedy the lack ability of dealing these data, we describe a novel grid index method and give the GID-Top-k algorithm. The TTI index is determined according to the dominating relation between grids, and the dominating relation is further divided into father and son relation, brother relation and so on. TTI index can acquire the position of the grid subscript when there exists insert, delete and update. Furthermore, the process of pruning in Grid Index based Dynamic Top-k can effectively computing the k dominated capability through the summary of the grid in the Trunk tree Index, and minimizing the number of k dominated capability to be counted by looking for determinant units. Divide the influence area and free area according to the division of the region. Finally, pruning the data set, and if the data changes in the free area, there is no need to recalculate Top-k result. It can reduce the probability of recalculation when data changes frequently. The building blocks and calculation modules of the multi-dimensional TTI index are given, the grid dominance relationships, the relationship between brothers and sons are clarified from the two-dimensional to the multi-dimensional case with TTI. The pruning rules and calculation of free area and influence are related with the operation of searching grid, which avoid the stochastic search in grid using linked list. The cost of building and maintaining of the index is also analyzed with query efficiency. The experiment chooses different datasets to test, which compared with the baseline algorithm Top-k, RankCube and CIA algorithm separately. Because grid index is different from general index, the n dimensions on query space is divided into such as large area of the grid, under the condition of the same configuration parameters in other experiments, the distribution of data in each grid density on the algorithm performance is going to have an impaction of the presented algorithms. Experimental results verify the effectiveness of the proposed algorithm. Experimental results verify the effectiveness of the algorithm, the experimental data show that in the static case, the query efficiency of the algorithm in this paper up to eight times faster than the traditional Top-k algorithm, dynamic cases up to 10 times faster than the existing comparison objects. We will develop the improved Top-k queries to be an approach for big data analysis, and multi-criteria retrieval applications. © 2019, Science Press. All right reserved.
ISSN号：0254-4164
是否译文：否
发表时间：2019-08-01
合写作者：Deng, Dan-Ping,李博涵,Zheng, Wei,郑玮,刘亮,李雪飞
通讯作者：秦小麟,李博涵

上一条：空间多关键词Skyline查询算法

下一条： Effective Multi-Objective Scheduling Strategy of Dataflow in Cloud