Fast and accurate flood forecasting models are fundamental for managing flood risk and mitigating the negative impacts that floods can have on society and the environment. For the pan-European area, the European Flood Awareness System (EFAS) is currently the official flood forecasting and early warning system. Its forecasts derive from a process-based rainfall-runoff model (LISFLOOD), which requires large amounts of high-quality hydro-meteorological data. Such data are usually not uniformly available, which affects the quality of the outputs. Running process-based models at high spatial resolution over large spatial scales also requires high-performance computer clusters to deliver timely forecasts. Recently, machine learning-based forecasting models such as long short-term memory (LSTM) networks have gained popularity as surrogates for traditional process-based models. Compared to traditional process-based models, they offer lower computational resource requirements and greater tolerance for variations in input data quality. Additionally, machine learning-based compression models, such as convolutional autoencoders (CAEs), can further reduce computational costs by compressing the data. Moreover, large-scale models frequently exhibit lower accuracy in forecasting river discharge in smaller watersheds, where data may be less available and rivers are smaller but still significant and cannot be overlooked. These watersheds respond quickly to rainfall events and therefore require fast river discharge forecasts to guarantee enough lead time to act in case of an imminent flood. Data assimilation (DA), a technique that integrates data from multiple sources to optimize forecasting outcomes, can be employed to improve the accuracy of river discharge forecasting.
In this work, we propose a latent three-dimensional variational data assimilation (3D-Var) method combined with machine learning models to deliver fast and accurate river discharge forecasting. We tested our method on real-world datasets (EFAS and Lamah-CE) and achieved an average 53.6% improvement in forecasting accuracy, measured by mean squared error (MSE), compared to LSTM forecasting, while delivering one-day lead-time river discharge forecasts in approximately one minute for an area of around 30,000 km².
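To make the idea of a latent 3D-Var update concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the LSTM forecast and the observation have both already been encoded into the same latent space (so the observation operator is the identity) and that the background and observation error covariances are diagonal. Under those assumptions, the standard 3D-Var cost J(z) = 1/2 (z - z_b)^T B^-1 (z - z_b) + 1/2 (y - z)^T R^-1 (y - z) has a closed-form minimizer per latent component. All function and variable names are hypothetical, and the MSE-based improvement formula is one common definition, which the abstract does not spell out.

```python
def latent_3dvar_update(z_b, y, b_var, r_var):
    """Return the 3D-Var analysis for diagonal B and R with identity H.

    z_b   : background latent vector (e.g. an encoded forecast)
    y     : observation, assumed already encoded into the latent space
    b_var : background error variances (diagonal of B)
    r_var : observation error variances (diagonal of R)
    """
    z_a = []
    for zb_i, y_i, b_i, r_i in zip(z_b, y, b_var, r_var):
        gain = b_i / (b_i + r_i)  # scalar Kalman-type gain per component
        z_a.append(zb_i + gain * (y_i - zb_i))
    return z_a


def cost_3dvar(z, z_b, y, b_var, r_var):
    """Evaluate the 3D-Var cost J(z) under the same assumptions."""
    background = sum((zi - zbi) ** 2 / bi for zi, zbi, bi in zip(z, z_b, b_var))
    observation = sum((yi - zi) ** 2 / ri for zi, yi, ri in zip(z, y, r_var))
    return 0.5 * (background + observation)


def mse(pred, truth):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)


def mse_improvement_percent(mse_baseline, mse_new):
    """Relative MSE reduction in percent (assumed definition)."""
    return 100.0 * (mse_baseline - mse_new) / mse_baseline


# Toy example with made-up numbers (not data from the paper).
z_b = [0.0, 2.0]
y = [2.0, 2.0]
b_var = [1.0, 1.0]
r_var = [1.0, 3.0]
z_a = latent_3dvar_update(z_b, y, b_var, r_var)
print(z_a)  # [1.0, 2.0]
```

In a full pipeline of the kind the abstract describes, the analysis vector would then be passed through the autoencoder's decoder to recover a discharge field. Working in the compressed latent space keeps the vectors and covariances small, which is one plausible reason such an update can be cheap.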
Latent Three-Dimensional Variational Data Assimilation with Convolutional Autoencoder and LSTM for Flood Forecasting / Wang, Kun; Bertoli, Gabriele; Schröter, Kai; Caporali, Enrica; Piggott, Matthew D.; Wang, Yanghua; Arcucci, Rossella. - Electronic. - LNCS 15910 (2025), pp. 43-56. (Workshops on Computational Science, co-organized with the 25th International Conference on Computational Science, ICCS 2025, Singapore, 2025) [10.1007/978-3-031-97567-7_4].
Latent Three-Dimensional Variational Data Assimilation with Convolutional Autoencoder and LSTM for Flood Forecasting
Bertoli, Gabriele (Writing – Original Draft Preparation); Caporali, Enrica (Writing – Review & Editing)
2025
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| 978-3-031-97567-7_4.pdf | Open access | Publisher's PDF (Version of record) | Creative Commons | 7.21 MB | Adobe PDF |