This paper studies k-plexes, a well known pseudo-clique model for network communities. In a k-plex, each node can miss at most k-1 links. Our goal is to detect large communities in today's real-world graphs which can have hundreds of millions of edges. While many have tried, this task has been elusive so far due to its computationally challenging nature: k-plexes and other pseudo-cliques are harder to find and more numerous than cliques, a well known hard problem. We present D2K, which is the first algorithm able to find large k-plexes of very large graphs in just a few minutes. The good performance of our algorithm follows from a combination of graph-theoretical concepts, careful algorithm engineering and a high-performance implementation. In particular, we exploit the low degeneracy of real-world graphs, and the fact that large enough k-plexes have diameter 2. We validate a sequential and a parallel/distributed implementation of D2K on real graphs with up to half a billion edges.

D2K: Scalable Community Detection in Massive Networks via Small-Diameter K-Plexes / Alessio Conte; Tiziano De Matteis; Daniele De Sensi; Roberto Grossi; Andrea Marino; Luca Versari. - STAMPA. - (2018), pp. 1272-1281. (Intervento presentato al convegno Proceedings of the 24th ACM SIGKDD International Conference onKnowledge Discovery & Data Mining, KDD 2018) [10.1145/3219819.3220093].

D2K: Scalable Community Detection in Massive Networks via Small-Diameter K-Plexes

Roberto Grossi;Andrea Marino;
2018

Abstract

This paper studies k-plexes, a well known pseudo-clique model for network communities. In a k-plex, each node can miss at most k-1 links. Our goal is to detect large communities in today's real-world graphs which can have hundreds of millions of edges. While many have tried, this task has been elusive so far due to its computationally challenging nature: k-plexes and other pseudo-cliques are harder to find and more numerous than cliques, a well known hard problem. We present D2K, which is the first algorithm able to find large k-plexes of very large graphs in just a few minutes. The good performance of our algorithm follows from a combination of graph-theoretical concepts, careful algorithm engineering and a high-performance implementation. In particular, we exploit the low degeneracy of real-world graphs, and the fact that large enough k-plexes have diameter 2. We validate a sequential and a parallel/distributed implementation of D2K on real graphs with up to half a billion edges.
2018
Proceedings of the 24th ACM SIGKDD International Conference onKnowledge Discovery & Data Mining, KDD 2018, London, UK, August19-23, 2018
Proceedings of the 24th ACM SIGKDD International Conference onKnowledge Discovery & Data Mining, KDD 2018
Alessio Conte; Tiziano De Matteis; Daniele De Sensi; Roberto Grossi; Andrea Marino; Luca Versari
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/1149213
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 54
  • ???jsp.display-item.citation.isi??? 40
social impact