Counting the number of subgraphs, or patterns, of a certain kind is at the heart of data mining, and st-paths are one of the most basic graph patterns to express connectivity. The problem of counting the number of st-paths in a graph, both directed and undirected, has been studied since the 70s, and is one of the original #P-complete problems introduced by Valiant [25]. However, counting can be a heavy task and known algorithms already struggle on graphs with hundreds of nodes. For this reason we propose a novel approach: we assess whether the number of st-paths of an undirected graph is at least a given number z. Instead of finding paths one-by-one (i.e., listing), our algorithm is based on decomposing and collapsing computational tasks arranged in a tree-like structure to enhance the effectiveness of each step in growing the number of paths found. Extensive experimental results on real-world datasets show the algorithm scaling to graphs with millions of nodes and edges, with z in the trillions. Its performance is orders of magnitude better than state-of-the-art listing algorithms adapted to this task.

An Efficient Algorithm for Assessing the Number of st-Paths in Large Graphs / Giulia Punzi, Alessio Conte, Roberto Grossi, Andrea Marino. - STAMPA. - (2023), pp. 289-297. (Intervento presentato al convegno SIAM International Conference on Data Mining).

An Efficient Algorithm for Assessing the Number of st-Paths in Large Graphs

Andrea Marino
2023

Abstract

Counting the number of subgraphs, or patterns, of a certain kind is at the heart of data mining, and st-paths are one of the most basic graph patterns to express connectivity. The problem of counting the number of st-paths in a graph, both directed and undirected, has been studied since the 70s, and is one of the original #P-complete problems introduced by Valiant [25]. However, counting can be a heavy task and known algorithms already struggle on graphs with hundreds of nodes. For this reason we propose a novel approach: we assess whether the number of st-paths of an undirected graph is at least a given number z. Instead of finding paths one-by-one (i.e., listing), our algorithm is based on decomposing and collapsing computational tasks arranged in a tree-like structure to enhance the effectiveness of each step in growing the number of paths found. Extensive experimental results on real-world datasets show the algorithm scaling to graphs with millions of nodes and edges, with z in the trillions. Its performance is orders of magnitude better than state-of-the-art listing algorithms adapted to this task.
2023
Proceedings of the 2023 SIAM International Conference on Data Mining
SIAM International Conference on Data Mining
Giulia Punzi, Alessio Conte, Roberto Grossi, Andrea Marino
File in questo prodotto:
File Dimensione Formato  
paper.pdf

accesso aperto

Tipologia: Pdf editoriale (Version of record)
Licenza: Open Access
Dimensione 862.18 kB
Formato Adobe PDF
862.18 kB Adobe PDF

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/1335431
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact