Computer Science

From CountMin to Super kJoin Sketches for Flow Spread Estimation

Document Type



Real-time data stream processing is key to many Internet applications ranging from e-commerce, social networks, to network monitoring. Sketches (compact data structures) are often used to track large, high-rate streams and estimate their statistics. However, the traditional model for measuring flow size, e.g., number of packets in each flow, is unsuitable for more sophisticated applications that require flow spreads, i.e., number of distinct elements in each flow. When there are numerous flows, most existing work modifies CountMin sketches — which were designed to measure flow size — for measuring flow spread. They inherit a similar design and a commonly-used min operation from the CountMin sketches. This paper casts doubt on such a solution path, arguing that the inherited sketch design and the min operation were both intended for a different purpose and are not the best for flow spread measurement. Replacing the min operation with new position-aware operations, we propose the kJoin sketches that achieve much better average accuracy. We mathematically derive the estimation error bound. We also apply our sketches for network-wide measurement. Trace-driven experiments on software/GPU/FPGA platforms demonstrate our work significantly improves estimation accuracy, streaming rate (recording throughput), and query throughput, compared to the state-of-the-art. IEEE

Publication Title

IEEE Transactions on Network Science and Engineering

Publication Date


First Page


Last Page







arrays, data models, frequency measurement, monitoring, network-wide measurement, real-time systems, registers, size measurement, spread measurement, traffic measurement

APA Citation

Ma, C., Odegbile, O. O., Melissourgos, D., Wang, H., & Chen, S. (2023). From CountMin to Super kJoin Sketches for Flow Spread Estimation. IEEE Transactions on Network Science and Engineering.