Computer Science
From CountMin to Super kJoin Sketches for Flow Spread Estimation
Document Type
Article
Abstract
Real-time data stream processing is key to many Internet applications ranging from e-commerce, social networks, to network monitoring. Sketches (compact data structures) are often used to track large, high-rate streams and estimate their statistics. However, the traditional model for measuring flow size, e.g., number of packets in each flow, is unsuitable for more sophisticated applications that require flow spreads, i.e., number of distinct elements in each flow. When there are numerous flows, most existing work modifies CountMin sketches — which were designed to measure flow size — for measuring flow spread. They inherit a similar design and a commonly-used min operation from the CountMin sketches. This paper casts doubt on such a solution path, arguing that the inherited sketch design and the min operation were both intended for a different purpose and are not the best for flow spread measurement. Replacing the min operation with new position-aware operations, we propose the kJoin sketches that achieve much better average accuracy. We mathematically derive the estimation error bound. We also apply our sketches for network-wide measurement. Trace-driven experiments on software/GPU/FPGA platforms demonstrate our work significantly improves estimation accuracy, streaming rate (recording throughput), and query throughput, compared to the state-of-the-art. IEEE
Publication Title
IEEE Transactions on Network Science and Engineering
Publication Date
5-2024
Volume
11
Issue
3
First Page
2353
Last Page
2370
ISSN
2327-4697
DOI
10.1109/TNSE.2023.3279665
Keywords
arrays, data models, frequency measurement, monitoring, network-wide measurement, real-time systems, registers, size measurement, spread measurement, traffic measurement
Repository Citation
Ma, Chaoyi; Odegbile, Olufemi O.; Melissourgos, Dimitrios; Wang, Haibo; and Chen, Shiping, "From CountMin to Super kJoin Sketches for Flow Spread Estimation" (2024). Computer Science. 5.
https://commons.clarku.edu/faculty_computer_sciences/5
APA Citation
Ma, C., Odegbile, O. O., Melissourgos, D., Wang, H., & Chen, S. (2023). From CountMin to Super kJoin Sketches for Flow Spread Estimation. IEEE Transactions on Network Science and Engineering.