Explaining autonomous driving by learning end-to-end visual attention

Cultrera, L.; Seidenari, L.; Becattini, F.; Pala, P.; Del Bimbo, A.

doi:10.1109/CVPRW50498.2020.00178

Current deep learning based approaches to autonomous driving yield impressive results and have led to in-production deployment in certain controlled scenarios. One of the most popular and fascinating approaches relies on learning vehicle controls directly from data perceived by sensors. This end-to-end learning paradigm can be applied both in classical supervised settings and with reinforcement learning. Nonetheless, the main drawback of this approach, as in other learning problems, is the lack of explainability. Indeed, a deep network acts as a black box, outputting predictions that depend on previously seen driving patterns without giving any feedback on why such decisions were taken. While explainable outputs from a learned agent are not critical for obtaining optimal performance, in such a safety-critical field it is of paramount importance to understand how the network behaves. This is particularly relevant for interpreting failures of such systems. In this work we propose to train an imitation learning based agent equipped with an attention model. The attention model allows us to understand what part of the image has been deemed most important. Interestingly, the use of attention also leads to superior performance in a standard benchmark using the CARLA driving simulator.

Explaining autonomous driving by learning end-to-end visual attention / Cultrera L.; Seidenari L.; Becattini F.; Pala P.; Del Bimbo A. - Electronic. - 2020-:(2020), pp. 1389-1398. (Paper presented at the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020, held in the USA in 2020) [10.1109/CVPRW50498.2020.00178].