Lightweight and Effective Convolutional Neural Networks for Vehicle Viewpoint Estimation From Monocular Images

Magistri, Simone; Boschi, Marco; Sambo, Francesco; De Andrade, Douglas Coimbra; Simoncini, Matteo; Kubin, Luca; Taccari, Leonardo; De Luigi, Luca; Salti, Samuele

doi:10.1109/tits.2022.3216359

Vehicle viewpoint estimation from monocular images is a crucial component of autonomous vehicles and fleet management applications. In this paper, we make several contributions that advance the state of the art on this problem. We show the effectiveness of applying a smoothing filter to the output neurons of a Convolutional Neural Network (CNN) when estimating vehicle viewpoint. We point out the overlooked fact that, under the same viewpoint, the appearance of a vehicle is strongly influenced by its position in the image plane, which renders viewpoint estimation from appearance an ill-posed problem. We show that inserting a CoordConv layer into the model to provide the coordinates of the vehicle resolves this ambiguity and greatly increases performance. Finally, we introduce a new data augmentation technique that improves viewpoint estimation on vehicles that are closer to the camera or partially occluded. Together, these improvements let a lightweight CNN reach state-of-the-art results while keeping inference time low. An extensive evaluation on a viewpoint estimation benchmark (Pascal3D+) and on actual vehicle camera data (nuScenes) shows that our method significantly outperforms the state of the art in vehicle viewpoint estimation, in terms of both accuracy and memory footprint.
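As a rough illustration of two ideas named in the abstract, the sketch below builds CoordConv-style coordinate channels for a vehicle crop and applies a circular smoothing filter to a vector of viewpoint-bin scores. The abstract does not give the paper's actual design, so everything here is an assumption made for illustration: the function names `coord_channels` and `circular_smooth`, the normalization of coordinates to [-1, 1] relative to the full image, the treatment of the outputs as azimuth bins that wrap around, and the example kernel.

```python
from typing import List, Sequence, Tuple


def coord_channels(
    box: Tuple[float, float, float, float],
    image_size: Tuple[int, int],
    out_size: Tuple[int, int],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Build x and y coordinate maps for a vehicle crop (hypothetical helper).

    box is (x0, y0, x1, y1) in full-image pixels, image_size is
    (width, height) of the full image, out_size is (width, height) of the
    resized crop. Each value is the position of that crop cell in the full
    image, rescaled to [-1, 1], so the maps encode where the vehicle sits
    in the image plane.
    """
    x0, y0, x1, y1 = box
    img_w, img_h = image_size
    out_w, out_h = out_size

    def centers(lo: float, hi: float, n: int, full: int) -> List[float]:
        # Centers of n equal cells spanning [lo, hi], mapped to [-1, 1].
        step = (hi - lo) / n
        return [2.0 * (lo + (i + 0.5) * step) / full - 1.0 for i in range(n)]

    cols = centers(x0, x1, out_w, img_w)
    rows = centers(y0, y1, out_h, img_h)
    x_map = [list(cols) for _ in range(out_h)]   # constant along each column
    y_map = [[v] * out_w for v in rows]          # constant along each row
    return x_map, y_map


def circular_smooth(scores: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Filter bin scores with wrap-around, assuming the bins cover 360 degrees."""
    n = len(scores)
    half = len(kernel) // 2
    return [
        sum(kernel[j] * scores[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


if __name__ == "__main__":
    # A crop in the left half of a 200x100 image, resized to 2x2 cells.
    x_map, y_map = coord_channels((0, 0, 100, 100), (200, 100), (2, 2))
    print(x_map, y_map)
    # A one-hot score vector spread over neighboring bins by a small kernel.
    print(circular_smooth([1.0, 0.0, 0.0, 0.0], [0.25, 0.5, 0.25]))
```

In a CNN these two maps would typically be concatenated to the input or to a feature map as extra channels before a convolution; where exactly the paper inserts them is not stated in the abstract.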

Lightweight and Effective Convolutional Neural Networks for Vehicle Viewpoint Estimation From Monocular Images / Magistri, S., Boschi, M., Sambo, F., de Andrade, D.C., Simoncini, M., Kubin, L., Taccari, L., de Luigi, L., Salti, S. - In: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS. - ISSN 1524-9050. - ELECTRONIC. - 24:(2023), pp. 1.191-1.200. [10.1109/tits.2022.3216359]