Top-k correlated subgraph query for data streams

Abstract

Given a query graph q, correlated subgraph query intends to find graph structures which are mostly correlated to the query q. This problem is fundamental for many pattern recognition applications involving structured data like graphs. Current available studies on correlation mining from graph data are all designed for static datasets. However, in real-life applications, data may arrive continuously in a streaming fashion with high speed. In this paper we investigate the problem of top-k correlated subgraph query over stream. By employing Hoeffding bound into the candidate discovery process and carefully maintaining a candidate list over stream, a novel algorithm, Hoe-PG, is proposed to incrementally identify the top-k correlated subgraphs in a sliding window over stream. Experiments show that the proposed method is several times more efficient than its peer with respect to the runtime and the memory consumption, and is able to maintain high precision and recall for stream-based graph query.

Publication
ICPR 2012 - 21st International Conference on Pattern Recognition
Shirui Pan
Shirui Pan
Professor | ARC Future Fellow

My research interests include data mining, machine learning, and graph analysis.