Computer Science

Efficient and Robust Top-k Algorithms for Big Data IoT

Ruifan Yang, Cornell University
Zheng Zhou, Boston College
Lewis Tseng, Boston College
Moayad Aloqaily, Al Ain University
Azzedine Boukerche, University of Ottawa

Abstract

Top-k query processing is a technique for retrieving, from a potentially very large data set, only the $k$ ($k \geq 1$) best (most relevant or important) candidates. It is essential in collaborative environments that involve big data, such as Internet of Things (IoT) networks. In particular, efficient top-k processing in large-scale distributed systems has been shown to noticeably improve their performance. This paper considers distributed approximate top-k processing algorithms designed for IoT-based networks and improves the accuracy of previously introduced algorithms. We then propose a safety-based notion of fault tolerance and improve the accuracy of a known algorithm. Our algorithms are evaluated using simulation and real-world data and outperform conventional methods.
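
As a purely illustrative aid to the definition of top-k given in the abstract, the following minimal Python sketch selects the k highest-scoring items from a single, centralized data set. It is not one of the distributed approximate algorithms studied in the paper; the function name `top_k`, the sensor names, and the scores are hypothetical and chosen only for the example.

```python
import heapq


def top_k(scores, k):
    """Return the k highest-scoring (item, score) pairs, best first.

    `scores` maps an item identifier to a numeric score. This is a
    centralized illustration of the top-k definition only, not the
    distributed approximate algorithms described in the paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # heapq.nlargest avoids fully sorting the data set when k is small.
    return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])


# Hypothetical sensor readings, used only for illustration.
readings = {"sensor-a": 17.2, "sensor-b": 42.0, "sensor-c": 8.5, "sensor-d": 29.9}
best = top_k(readings, 2)
```

Here `best` holds the two highest-scoring entries in descending order of score. In a distributed IoT setting the scores are spread across many nodes, which is why approximate and fault-tolerant variants such as those the paper studies become relevant.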