Distributed Machine Learning and In-Network Anomaly Detection

Overview:

Distributed computing systems are a promising domain for applying machine learning methods. In this project, we develop an approximate online scheme and a distributed tracking protocol for a variety of anomaly detection algorithms, including Cumulative Sum (CUSUM), PCA, and Support Vector Machine (SVM) classification. In brief, the approximate scheme involves a set of local monitors that maintain parameterized sliding filters. These filters yield quantized data streams that are sent to a coordinator, which makes global decisions based on them. Using matrix perturbation theory and system sensitivity analysis, we both assess the impact of quantization on the accuracy of anomaly detection and design a method that selects filter parameters so that the detection error is bounded. In many applications, our detection scheme can reduce the amount of data sent to the coordinator by 80-90% while still detecting anomalies accurately.
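The monitor and coordinator roles described above can be illustrated with a minimal sketch. It is not the protocol from the publication: the names LocalMonitor, Coordinator, run, and delta are hypothetical, each filter is assumed to be a fixed-width window centered on the last value sent, and the coordinator's decision rule (a threshold on the sum of its view) is only a stand-in for a real detector such as PCA or CUSUM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LocalMonitor:
    """Hypothetical monitor: suppresses readings that stay inside its filter."""
    delta: float                       # assumed filter half-width
    last_sent: Optional[float] = None  # center of the current filter
    messages_sent: int = 0

    def observe(self, value: float) -> Optional[float]:
        # Send only when the reading leaves the window around the last sent value.
        if self.last_sent is None or abs(value - self.last_sent) > self.delta:
            self.last_sent = value     # re-center the filter on the new value
            self.messages_sent += 1
            return value
        return None                    # reading suppressed, nothing is sent


@dataclass
class Coordinator:
    """Hypothetical coordinator: keeps the latest value received per monitor."""
    view: Dict[int, float] = field(default_factory=dict)

    def receive(self, monitor_id: int, value: float) -> None:
        self.view[monitor_id] = value

    def is_anomalous(self, threshold: float) -> bool:
        # Placeholder global decision, standing in for a real detector.
        return sum(self.view.values()) > threshold


def run(streams: List[List[float]], delta: float) -> Tuple[int, float, Coordinator]:
    """Feed equal-length streams through the monitors, one reading per time step.

    Returns the number of messages sent, the largest gap observed between a
    true reading and the coordinator's view of it, and the coordinator.
    """
    monitors = [LocalMonitor(delta) for _ in streams]
    coord = Coordinator()
    max_err = 0.0
    for t in range(len(streams[0])):
        for i, (monitor, stream) in enumerate(zip(monitors, streams)):
            update = monitor.observe(stream[t])
            if update is not None:
                coord.receive(i, update)
            max_err = max(max_err, abs(coord.view[i] - stream[t]))
    sent = sum(m.messages_sent for m in monitors)
    return sent, max_err, coord
```

In this sketch the coordinator's view of each stream never differs from the true reading by more than delta, so a wider filter trades accuracy for fewer messages. Choosing the filter parameters so that the resulting detection error stays bounded is the part the project addresses analytically, and it is not modeled here.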

Publications:

  • In-Network PCA and Anomaly Detection, [longer version]. Ling Huang, XuanLong Nguyen, Minos Garofalakis, Anthony Joseph, Michael Jordan, and Nina Taft. In Advances in Neural Information Processing Systems (NIPS) 19, Vancouver, B.C., December 2006.

Talks:

People: