The identification of important nodes is a ubiquitous problem in the analysis of social networks. Centrality indices (such as degree centrality, closeness centrality, betweenness centrality, PageRank, and others) are used across many domains to accomplish this task. However, the computation of such indices is expensive on large graphs. Moreover, evolving graphs are becoming increasingly important in many applications. It is therefore desirable to develop on-line algorithms that can approximate centrality measures using memory sublinear in the size of the graph. We discuss the challenges facing the semi-streaming computation of many centrality indices. In particular, we apply recent advances in the streaming and sketching literature to provide a preliminary streaming approximation algorithm for degree centrality utilizing CountSketch and a multi-pass semi-streaming approximation algorithm for closeness centrality leveraging a spanner obtained through iteratively sketching the vertex-edge adjacency matrix. We also discuss possible ways forward for approximating betweenness centrality, as well as spectral measures of centrality. We provide a preliminary result using sketched low-rank approximations to approximate the output of the HITS algorithm.
|