Cross-Model Temporal Cooperation Via Saliency Maps for Efficient Recognition and Classification of Relevant Traffic Lights / Trinci, Tomaso; Bianconcini, Tommaso; Sarti, Leonardo; Taccari, Leonardo; Sambo, Francesco. - Electronic. - (2023), pp. 2758-2763. (Paper presented at the 26th IEEE International Conference on Intelligent Transportation Systems, ITSC 2023, held in Spain in 2023) [10.1109/itsc57777.2023.10421938].
Cross-Model Temporal Cooperation Via Saliency Maps for Efficient Recognition and Classification of Relevant Traffic Lights
Trinci, Tomaso; Bianconcini, Tommaso; Sarti, Leonardo
2023
Abstract
Recognizing the state of relevant traffic lights when a vehicle is approaching an intersection is a challenging task for advanced driver-assistance systems (ADAS) that requires the continuous processing of frames recorded by onboard cameras. This type of task can be energy-intensive and may result in significant battery consumption and overheating. To address this problem, we propose a novel architecture, called Cooperative Frame Classifier (CFC), that efficiently exploits temporal redundancies between consecutive frames to maintain high performance while greatly reducing the average number of operations needed to process the video frames. The architecture consists of two convolutional neural network models with different parameter sizes and input resolutions. Each frame is processed by only one of the models. The model with lower input resolution and parameter size leverages the saliency maps generated by the larger model. Experiments show that our approach is not only more accurate than existing ones based on object detection, but it is also an order of magnitude more efficient, and therefore it is better suited for practical applications.
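
The cooperation described in the abstract (each frame goes to exactly one of two models, and the smaller model reuses the saliency map most recently produced by the larger one) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how frames are assigned to the two models, so the fixed `period` schedule is an assumption, as are the class name `CooperativeRouter`, the per-frame costs, and the toy stand-in models, which operate on plain lists rather than images.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]     # stand-in for a camera frame
Saliency = Sequence[float]  # stand-in for a saliency map


@dataclass
class CooperativeRouter:
    """Hypothetical router: each frame is processed by exactly one model."""
    large_model: Callable[[Frame], Tuple[str, Saliency]]
    small_model: Callable[[Frame, Saliency], str]
    large_cost: float  # assumed operations per frame for the large model
    small_cost: float  # assumed operations per frame for the small model
    period: int = 5    # assumption: the large model runs every `period` frames

    def run(self, frames: Sequence[Frame]) -> Tuple[List[str], float]:
        labels: List[str] = []
        total_cost = 0.0
        saliency = None
        for i, frame in enumerate(frames):
            if saliency is None or i % self.period == 0:
                # Large model: classifies and refreshes the saliency map.
                label, saliency = self.large_model(frame)
                total_cost += self.large_cost
            else:
                # Small model: classifies using the cached saliency map.
                label = self.small_model(frame, saliency)
                total_cost += self.small_cost
            labels.append(label)
        average_cost = total_cost / len(labels) if labels else 0.0
        return labels, average_cost


def toy_large(frame: Frame) -> Tuple[str, Saliency]:
    # Toy stand-in: the "saliency map" marks the brightest position.
    peak = max(range(len(frame)), key=lambda j: frame[j])
    saliency = [1.0 if j == peak else 0.0 for j in range(len(frame))]
    return ("green" if frame[peak] > 0.5 else "red"), saliency


def toy_small(frame: Frame, saliency: Saliency) -> str:
    # Toy stand-in: only looks where the cached saliency map points.
    focus = sum(f * s for f, s in zip(frame, saliency))
    return "green" if focus > 0.5 else "red"


frames = [[0.1, 0.9, 0.2] for _ in range(10)]
router = CooperativeRouter(toy_large, toy_small, large_cost=100.0, small_cost=10.0)
labels, average_cost = router.run(frames)
```

With these made-up costs, the large model handles 2 of the 10 frames and the small model the other 8, so the average cost per frame is 28 instead of 100. The numbers only illustrate why routing most frames to the cheaper model lowers the average operation count; they say nothing about the figures reported in the paper.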