Analysis of community resilience during natural disasters using data mining on massive social networks exchanges / Rachele Franceschini. - (2023).
Analysis of community resilience during natural disasters using data mining on massive social networks exchanges
Rachele Franceschini
2023
Abstract
Mass media are a new and important source of information for any natural disaster, mass emergency, pandemic, economic or political event, or extreme weather event affecting one or more communities in a country, as they provide relatively high temporal and spatial resolution. The main goal of this research is to show how useful social media can be for detecting events in places without physical sensors that could immediately identify a natural hazard. For the entire Italian territory, several analytical techniques were used to determine the spatial and temporal distribution of Google News articles about floods and, in particular, landslide events. A landslide and flood inventory derived from social media was used as a proxy to correlate rainfall data with landslide impacts, showing how social media, combined with other sources, can help government authorities gain a better understanding of the landslide hazard of a territory. This analysis also made it possible to outline resilience at the regional scale, based on the number of articles published with respect to each natural event. In the second and third parts of this work, a new data-mining technique for Twitter and a deep learning technique were applied. Twitter has been used as a data source for many natural events, but it has rarely been applied to the automatic extraction of landslide events. To fill this gap, Twitter was mined for Italian-language data about landslide events. Over 13,000 items were extracted from Twitter using five keywords referring to landslide news. The dataset was classified manually, providing a solid basis for applying deep learning. The main aim is to obtain an automatic classification of information about natural hazards. The transformer architecture was chosen for text classification, using the XLM-RoBERTa model.
In this work the “Bert For Information on Landslide Events” (BEFILE) model was developed and tested using three different data pre-processing approaches. BEFILE applied without pre-processing showed the best performance, with an accuracy of 96% and an AUC of 0.95.
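As a rough illustration of the keyword-based extraction step described in the abstract, a minimal filter over tweet texts might look like the sketch below. The actual five keywords used in the thesis are not listed on this page; the Italian terms in the code are hypothetical examples, and the helper name is likewise an assumption.

```python
# Minimal sketch of keyword-based candidate selection for landslide tweets.
# NOTE: the keyword list below is illustrative only, NOT the thesis's actual
# five search terms ("frana" is Italian for "landslide").
LANDSLIDE_KEYWORDS = ["frana", "frane", "smottamento", "crollo", "dissesto"]

def is_landslide_candidate(text: str) -> bool:
    """Return True if the text mentions any landslide-related keyword."""
    lowered = text.lower()
    return any(kw in lowered for kw in LANDSLIDE_KEYWORDS)

# Example usage on two made-up Italian tweets:
tweets = [
    "Grossa frana sulla strada provinciale, traffico bloccato",
    "Oggi giornata di sole a Firenze",
]
candidates = [t for t in tweets if is_landslide_candidate(t)]
```

Candidates selected this way would then be labelled manually and fed to the transformer-based classifier, as described above.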
File: Thesis_PhD_Rachele_Franceschini_XXXV_cycle.pdf (open access)
Description: PhD thesis of Rachele Franceschini
Type: Editorial PDF (version of record)
License: Open Access
Size: 16.75 MB
Format: Adobe PDF
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.