Instructors
- Aristides Gionis, Aalto University
- Charalampos E. Tsourakakis, Harvard University
Motivation
Finding dense subgraphs is a fundamental graph-theoretic problem, that lies in the heart of numerous graph-mining applications, ranging from finding communities in social networks, to detecting regulatory motifs in DNA, and to identifying real-time stories in news. The problem of finding dense subgraphs has been studied extensively in theoretical computer science, and recently, due to the relevance of the problem in real-world applications, it has attracted considerable attention in the data-mining community. In this tutorial we aim to provide a comprehensive overview of (i) major algorithmic techniques for finding dense subgraphs in large graphs and (ii) graph mining applications that rely on dense subgraph extraction.
We will present fundamental concepts and algorithms that date back to 80’s, as well as the latest advances in the area, from theoretical and from practical point-of-view. We will motivate the problem of finding dense subgraphs by discussing how it can be used in real-world applications. We will discuss different density definitions and the complexity of the corresponding optimization problems. We will also present efficient algorithms for different density measures and under different computational models. Specifically, we will focus on scalable streaming, distributed and MapReduce algorithms. Finally we will discuss problem variants, extensions, and will provide pointers for future research directions.
Links to relevant papers
Measures of density in graphs
- V. E. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. Managing and Mining Graph Data, 2010
- A. V. Goldberg. Finding a maximum density subgraph. UCB technical report, 1984
Efficient algorithms for static graphs
- A. V. Goldberg. Finding a maximum density subgraph. UCB technical report, 1984
- M. Charikar. Greedy approximation algorithms for finding dense components in a graph. APPROX, 2000
- S. Khuller and B. Saha. On finding dense subgraphs. Automata, Languages and Programming (ICALP), 2009
- R. Andersen and K. Chellapilla. Finding dense subgraphs with size bounds. Workshop on Algorithms and Models for the Web-graph (WAW), 2009
- C. Tsourakakis, F. Bonchi, A. Gionis, F. Gullo, and M. Tsiarli. Denser than the densest subgraph: Extracting optimal quasi-cliques with quality guarantees. Knowledge Discovery and Data Mining (KDD), 2013
- C. Tsourakakis. The k-clique densest subgraph problem. International World Wide Web Conference, 2015
- N. Tatti and A. Gionis. Density-friendly graph decomposition. International World Wide Web Conference, 2015
- B. Bahmani, R. Kumar, and S. Vassilvitskii. Densest subgraph in streaming and map-reduce. Proceedings of the VLDB Endowment, 2012
Efficient algorithms for dynamic graphs
- A. Epasto, S. Lattanzi, M. Sozio. Efficient densest subgraph computation in evolving graphs. International World Wide Web Conference, 2015
- A. McGregor, D. Tench, S. Vorotnikova, H. T. Vu. Densest subgraph in dynamic graph streams. CoRR abs/1506.04417, 2015
- M. Bhattacharya, Sayan Henzinger, D. Nanongkai, and C. Tsourakakis. Space- and time-efficient algorithms for maintaining dense subgraphs on one-pass dynamic streams. Symposium on Theory of Computing (STOC), 2015
- P. Rozenshtein, N. Tatti, and A. Gionis. Discovering dynamic communities in interaction networks. ECML PKDD, 2014
Problem variants
- H. Tong and C. Faloutsos. Center-piece subgraphs: problem definition and fast solutions. Knowledge Discovery and Data Mining (KDD), 2006.
- M. Sozio and A. Gionis. The community-search problem and how to plan a successful cocktail party. Knowledge Discovery and Data Mining (KDD), 2010
- P. Rozenshtein, A. Anagnostopoulos, A. Gionis, and N. Tatti. Event detection in activity networks. Knowledge Discovery and Data Mining (KDD), 2014
- E. Valari, M. Kontaki, A. Papadopoulos. Discovery of top-k dense subgraphs in dynamic graph collections. Scientific and Statistical Database Management, 2012
- O. D. Balalau, F. Bonchi, T. Chan, F. Gullo, and M. Sozio. Finding subgraphs with maximum total density and limited overlap. Web Search and Data Mining (WSDM), 2015
Applications
- E. Fratkin, B. T. Naughton, D. Brutlag, S. Batzoglou. Motifcut: regulatory motifs finding with maximum density subgraphs. Bioinformatics, 22(14):e150–e157, 2006
- A. Angel, N. Sarkas, N. Koudas, and D. Srivastava. Dense subgraph maintenance under streaming edge weight updates for real-time story identification. Proceedings of the VLDB Endowment, 2012
- A. Gionis, F. Junqueira, V. Leroy, M. Serafini, I. Weber. Piggybacking on social networks. Proceedings of the VLDB Endowment, 2013