Road scene perception under camera viewpoint shifts

Pjetri, Aurel

Road accidents are one of the leading causes of death in the world, which has prompted all major legislators to adopt safety initiatives and mandate Advanced Driver Assistance Systems (ADAS) in new vehicles. While the success of ADAS is largely due to impressive advances in computer vision, many challenges remain that limit their potential impact on road safety. In real-world applications, such as smart dashcams, viewpoint shifts pose a significant challenge. Vehicles of different sizes, combined with cameras installed in various positions and orientations, require computer vision models to handle highly diverse viewpoints. In this PhD thesis, we systematically study the effects of viewpoint shifts on different perception tasks, including depth estimation and bird's eye view representation. We first collect a real-world depth dataset from dashcams installed in different positions on the windshield. Instead of using expensive lidar sensors, we devise a new ground-truth strategy based on homographies and object detection. The collected data enables us to quantitatively measure the effects of the different viewpoints on self-supervised monocular depth estimators and to identify deformed scale perception as the main cause of performance degradation. Benchmarks on large, complex foundation models highlight their generalisation capabilities, but also their high computational requirements, so we apply knowledge distillation to transfer this ability to smaller models. In particular, with the aim of disentangling viewpoint generalisation from data acquisition, we propose a distillation method based on simulated rotations. Experimental and qualitative results highlight the effectiveness of our distillation. Moreover, we develop a new synthetic dataset for the study of viewpoint shifts on more complex tasks, such as bird's eye view semantic segmentation. Our experiments show the large impact of specific camera orientations on the model and highlight the positive effect of including different viewpoints during training. Finally, we tackle the problem of accident anticipation with a more holistic approach that does not directly consider viewpoint shift effects. Instead, we devise a self-supervised loss function that enables training on a large private dataset with different vehicles and camera positions, with satisfactory results.

2026
