Lightweight and Effective Convolutional Neural Networks for Vehicle Viewpoint Estimation From Monocular Images / Magistri, Simone; Boschi, Marco; Sambo, Francesco; de Andrade, Douglas Coimbra; Simoncini, Matteo; Kubin, Luca; Taccari, Leonardo; de Luigi, Luca; Salti, Samuele. - In: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS. - ISSN 1524-9050. - ELECTRONIC. - 24:(2023), pp. 191-200. [10.1109/tits.2022.3216359]

Lightweight and Effective Convolutional Neural Networks for Vehicle Viewpoint Estimation From Monocular Images

Magistri, Simone; Kubin, Luca
2023

Abstract

Vehicle viewpoint estimation from monocular images is a crucial component of autonomous vehicles and fleet management applications. In this paper, we make several contributions that advance the state of the art on this problem. We show the effectiveness of applying a smoothing filter to the output neurons of a Convolutional Neural Network (CNN) when estimating vehicle viewpoint. We point out the overlooked fact that, under the same viewpoint, the appearance of a vehicle is strongly influenced by its position in the image plane, which renders viewpoint estimation from appearance an ill-posed problem. We show that inserting a CoordConv layer into the model to provide the coordinates of the vehicle resolves this ambiguity and greatly increases performance. Finally, we introduce a new data augmentation technique that improves viewpoint estimation on vehicles that are closer to the camera or partially occluded. Together, these improvements let a lightweight CNN reach state-of-the-art results while keeping inference time low. An extensive evaluation on a viewpoint estimation benchmark (Pascal3D+) and on actual vehicle camera data (nuScenes) shows that our method significantly outperforms the state of the art in vehicle viewpoint estimation, in terms of both accuracy and memory footprint.
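The CoordConv idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the network receives a cropped vehicle image together with the crop's bounding box in the full frame, and that the position is supplied as two extra input channels holding normalized full-image coordinates. The function name `add_coord_channels`, the [-1, 1] normalization, and the use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def add_coord_channels(crop, box, image_size):
    """Append x and y coordinate channels to a vehicle crop (hypothetical helper).

    crop: float array of shape (C, H, W), the cropped vehicle.
    box: (x_min, y_min, x_max, y_max) of the crop in full-image pixels.
    image_size: (image_width, image_height) of the full image.
    Returns an array of shape (C + 2, H, W).
    """
    _, h, w = crop.shape
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # Positions covered by the crop, rescaled to [-1, 1] over the full image.
    xs = np.linspace(x_min, x_max, w) / img_w * 2.0 - 1.0
    ys = np.linspace(y_min, y_max, h) / img_h * 2.0 - 1.0
    # One channel varies along the width, the other along the height.
    x_chan = np.broadcast_to(xs[None, :], (h, w))
    y_chan = np.broadcast_to(ys[:, None], (h, w))
    out = np.concatenate([crop, x_chan[None], y_chan[None]], axis=0)
    return out.astype(crop.dtype)


# The same crop placed in two different image regions gets different
# coordinate channels, so a later convolution can tell the positions apart.
crop = np.zeros((3, 4, 6), dtype=np.float32)
top_left = add_coord_channels(crop, (0, 0, 100, 50), (200, 100))
bottom_right = add_coord_channels(crop, (100, 50, 200, 100), (200, 100))
```

In this sketch the appearance channels of the two outputs are identical and only the two appended channels differ, which is the extra signal a downstream convolution could use to separate vehicles that look alike but sit at different image positions.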
Year: 2023
Volume: 24
First page: 191
Last page: 200
Files in this item:
File: Lightweight_and_Effective_Convolutional_Neural_Networks_for_Vehicle_Viewpoint_Estimation_From_Monocular_Images.pdf
Access: open access
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 3.92 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1460555