Seeing the Heat: Vision Transformers for Spatiotemporal Temperature Prediction / Russo, Paolo; Di Ciaccio, Fabiana. - ELETTRONICO. - 16169:(2026), pp. 365-376. (Workshops and competitions hosted by the 23rd International Conference on Image Analysis and Processing, ICIAP 2025, Rome, Italy, 2025) [10.1007/978-3-032-11317-7_31].
Seeing the Heat: Vision Transformers for Spatiotemporal Temperature Prediction
Di Ciaccio, Fabiana
2026
Abstract
Forecasting land surface temperature (LST) is a key task in environmental monitoring. While Transformer-based architectures have recently shown promise in modeling temperature time series, it remains unclear whether representing geophysical signals as raw sequences or as visual encodings offers better predictive accuracy. In this work, we investigate this question by comparing two autoregressive architectures: a standard sequence-based Transformer operating on raw LST values, and a video-based model that leverages visual encodings of temperature fields processed by a pre-trained encoder. To this end, we convert hourly LST grids from the Copernicus Land Monitoring Service and the ERA5 datasets into color-coded heatmaps using OpenCV, and use them to train a TimeSformer-based model that performs spatiotemporal attention across the resulting video-like sequences. The visual backbone extracts structured features from both space and time, which are then passed to an autoregressive decoder to forecast the next 72 h of temperature evolution. Our framework is evaluated on a multi-year dataset covering the region of Florence (Italy), and compared against a previously validated Transformer model trained directly on numerical signals. Experimental results show that the vision-based model achieves competitive performance with respect to the numeric baseline. The study highlights the potential of vision transformers for environmental forecasting tasks, bridging computer vision and climate modeling.



