Computer Science

Noise measurement and removal for data streaming algorithms with network applications

Chaoyi Ma, Herbert Wertheim College of Engineering
Haibo Wang, Herbert Wertheim College of Engineering
Olufemi Odegbile, Herbert Wertheim College of Engineering
Shigang Chen, Herbert Wertheim College of Engineering

Abstract

Data streaming has multiple applications on the Internet, including traffic measurement and intrusion detection. The bedrock underlying these applications is a set of data streaming algorithms that extract useful information from network packet streams, estimate the needed statistics, such as the frequencies of TCP flows, and feed them to application software. Among such algorithms, counting sketches are the most prevalent: they are very compact, but that compactness comes at the cost of errors in their estimations. The dominant error-control method, widely accepted for more than a decade, is to take the estimation with the minimum error among multiple independent estimations. This method produces a positively biased error, and the error can grow large under stringent performance and resource conditions, yet no existing work has studied this error in depth. This paper investigates the properties of the error, which is also known as noise, and argues that it can be measured and removed so as to make the estimations unbiased. We introduce two new ideas for measuring the noise: d-smallest noise and artificial data items. Based on these two ideas, we propose four noise measurement methods. Mathematical analysis and experimental results based on real network traces show that removing the measured noise reduces the estimation error to a much lower level than the state of the art achieves.
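To make the positively biased error concrete, the following is a minimal Python sketch of a count-min-style counting sketch, assuming that this well-known structure is representative of the "minimum of multiple independent estimations" rule described in the abstract. The class name `CountMinSketch`, the function `demo`, the flow keys, and all parameter values are hypothetical choices for illustration; the code does not implement d-smallest noise, artificial data items, or any of the paper's four noise measurement methods.

```python
import hashlib


class CountMinSketch:
    """Illustrative counting sketch: `depth` rows of `width` counters."""

    def __init__(self, depth=4, width=64):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One salted hash per row, so the rows give independent estimations.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Every row records the item in exactly one counter.
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        # Each counter holds the true count plus noise from colliding items;
        # taking the minimum picks the row with the least noise.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


def demo(num_flows=500, depth=4, width=64):
    """Return the per-flow noise (estimate minus true count)."""
    sketch = CountMinSketch(depth, width)
    # Hypothetical flow sizes between 1 and 7 packets.
    true_counts = {f"flow-{i}": (i % 7) + 1 for i in range(num_flows)}
    for key, count in true_counts.items():
        sketch.add(key, count)
    return [sketch.estimate(k) - c for k, c in true_counts.items()]


noise = demo()
print("smallest noise:", min(noise))
print("average noise:", sum(noise) / len(noise))
```

Because counters are only ever incremented, the noise of every flow is non-negative, so the minimum rule can shrink the error but cannot cancel it. That one-sided error is the bias the abstract proposes to measure and remove.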