Extracting knowledge from text news: A systematic evaluation of network-based topic detection / Carla Galluccio. - (2023).
Extracting knowledge from text news: A systematic evaluation of network-based topic detection
Carla Galluccio
2023
Abstract
In textual analysis, network-based procedures for topic detection are gaining attention, including as an alternative to classical topic models. These procedures rest on the idea that documents can be represented as word co-occurrence networks, in which topics are defined as groups of strongly connected words. Although many works have used network-based procedures for topic detection, there is no systematic analysis of how different design choices, such as how the word co-occurrence matrix is built and which community detection algorithm is selected, affect the detected topics. Another unexplored question concerns the relationship between network-based topic detection and classical topic models, such as Latent Dirichlet Allocation (LDA). This thesis addresses these questions by developing a deeper understanding of optimal design choices for network-based topic detection, showing how and to what extent the choices made during the design phase affect the results, and at the same time comparing these procedures with classical topic models.

File | Size | Format |
---|---|---|
Tesi_Galluccio.pdf (open access; type: publisher's PDF, version of record; license: Open Access) | 13.01 MB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.