Efficient Architectures and Incremental Methodologies for Deep Learning / Simone Magistri; Andrew David Bagdanov. - (2024).

Efficient Architectures and Incremental Methodologies for Deep Learning

Simone Magistri
Andrew David Bagdanov (Supervision)
2024

Abstract

Deep Neural Networks (DNNs) are fundamental to Artificial Intelligence (AI) applications, demonstrating remarkable performance across a diverse range of tasks in industries such as transportation, robotics, healthcare, and multimedia forensics. However, as these models have grown in size and complexity, the resources required to train them have increased significantly, with a dramatic impact on both cost and environmental sustainability. Moreover, DNNs struggle to adapt to the dynamic, ever-changing nature of our world and often require updates to handle new situations or tasks. Traditionally, updating a model for a novel situation means re-training it from scratch on both previous and current task data, a process known as joint-incremental training, which further exacerbates development costs and environmental impact. Relying solely on new task data for updates, on the other hand, leads to catastrophic forgetting of previously acquired knowledge. The main goal of incremental learning is therefore to update these models effectively and efficiently while mitigating catastrophic forgetting. Additionally, the high number of operations and parameters makes deploying DNNs on edge devices challenging, so they are often hosted on cloud servers for inference. Cloud-based inference limits responsiveness and raises privacy concerns in real-world applications such as intelligent vehicles. This thesis tackles both the challenge of incremental learning and that of designing efficient DNNs for edge computing. For incremental learning, we propose a method named "Elastic Feature Consolidation" that meaningfully narrows the performance gap between existing incremental learning approaches and joint-incremental training while using only current task data to prevent forgetting. We underline the importance of incremental learning in Multimedia Forensics by establishing a benchmark for incremental social network identification and assessing a broad range of incremental techniques, laying groundwork for future research in this field. In terms of efficient architecture design, we develop effective, lightweight vehicle viewpoint estimation DNN models optimized for edge applications in intelligent vehicles.
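As a concrete illustration of the ideas above, the following minimal PyTorch sketch shows one exemplar-free incremental update step in the spirit of Elastic Feature Consolidation. It is not the thesis implementation: the importance matrix E (standing in for a feature-importance matrix estimated on the previous task's data), the weight lam, and all function and variable names are illustrative assumptions. The idea it demonstrates is that feature drift along directions that mattered to previous tasks is penalized more heavily, so the backbone can learn the new task without catastrophically forgetting the old ones.

import copy
import torch
import torch.nn.functional as F

def train_task(backbone, head, loader, E, lam=1.0, lr=1e-3, device="cpu"):
    # Hypothetical sketch: train one incremental task with new-task
    # cross-entropy plus a quadratic feature-drift penalty weighted by
    # an importance matrix E (assumed (D, D), positive semi-definite).
    old_backbone = copy.deepcopy(backbone).eval()   # frozen snapshot of the previous model
    for p in old_backbone.parameters():
        p.requires_grad_(False)

    params = list(backbone.parameters()) + list(head.parameters())
    opt = torch.optim.SGD(params, lr=lr)
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        feats = backbone(x)                         # current features, shape (B, D)
        with torch.no_grad():
            old_feats = old_backbone(x)             # features the old tasks relied on
        drift = feats - old_feats
        # drift^T E drift: drifting along "important" feature directions costs more.
        reg = torch.einsum("bi,ij,bj->b", drift, E, drift).mean()
        loss = F.cross_entropy(head(feats), y) + lam * reg
        opt.zero_grad()
        loss.backward()
        opt.step()

By contrast, joint-incremental training would retrain on the union of all tasks' data, while plain fine-tuning corresponds to lam = 0, which is precisely the setting in which catastrophic forgetting occurs.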
Files in this record:
File: PhDThesis (12).pdf
Access: open access
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 6.77 MB
Format: Adobe PDF
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1363692