
Incremental Learning of Stationary Representations / Matteo Bruni. - (2021).

Incremental Learning of Stationary Representations

Matteo Bruni
2021

Abstract

Humans and animals continuously acquire new knowledge throughout their lives as they make new experiences. They learn new concepts without forgetting what they have already learned, they typically need only a few training examples (e.g., a child can recognize a giraffe after seeing a single picture), and they are able to discern what is known from what is unknown (e.g., unfamiliar faces). In contrast, current supervised learning systems work under the assumption that all data is known and available during learning, that training is performed offline, and that a test dataset is available. What is missing in current research is a way to bring these human learning capabilities into an artificial learning system in which learning is performed incrementally from a data stream of potentially infinite length (i.e., lifelong learning). This is a challenging task that is not sufficiently studied in the literature. Motivated by this, in this thesis we investigate different aspects of Deep Neural Network (DNN) models in order to obtain stationary representations. Like fixed representations, stationary representations remain compatible between learning steps and are therefore well suited for incremental learning. Specifically, in the first part of the thesis, we propose a memory-based approach that collects and preserves all the past visual information observed so far, building a comprehensive and cumulative representation. We exploit a pre-trained fixed representation for the task of learning the appearance of face identities from unconstrained video streams, leveraging temporal coherence as a form of self-supervision. In this task, the representation allows us to learn from a few images and to detect unknown subjects, similarly to how humans learn. As the proposed approach makes use of a pre-trained fixed representation, learning is somewhat limited: the features stored in the memory bank remain fixed (i.e., they do not undergo learning) and only the memory bank is learned. To address this issue, in the second part of the thesis we propose a representation learning approach that can learn both the features and the memory without requiring their joint learning, which is computationally prohibitive. The intuition is that every time the internal feature representation changes, the memory bank must be relearned from scratch. The proposed method mitigates the need for feature relearning by keeping features compatible between learning steps, thanks to feature stationarity. We show that stationarity of the internal representation can be achieved with a fixed classifier, by setting the classifier weights according to values taken from the coordinate vertices of regular polytopes in high-dimensional space. In the last part of the thesis, we apply the stationary representation method described above to the task of class-incremental learning. We show that the method is as effective as standard approaches while exhibiting stationarity properties of the internal feature representation that are otherwise absent. The approach exploits future unseen classes as negative examples and learns features that do not change their geometric configuration as novel classes are incorporated into the learning model. We show that a large number of classes can be learned with no loss of accuracy, allowing the method to meet the underlying assumptions of incremental lifelong learning.
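As an illustration of the fixed-classifier idea mentioned in the abstract, the following is a minimal PyTorch sketch, not code from the thesis: the names FixedPolytopeClassifier and orthoplex_vertices are illustrative assumptions. It fixes the final-layer weights at the vertices of one particular regular polytope, the d-orthoplex (cross-polytope), whose vertices are the positive and negative unit basis vectors; the thesis considers regular polytopes in high-dimensional space more generally.

```python
import torch
import torch.nn as nn


def orthoplex_vertices(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Vertices of the d-orthoplex (cross-polytope): the 2*d points +/- e_i."""
    assert num_classes <= 2 * feat_dim, "the d-orthoplex has only 2*d vertices"
    eye = torch.eye(feat_dim)
    return torch.cat([eye, -eye], dim=0)[:num_classes]  # (num_classes, feat_dim)


class FixedPolytopeClassifier(nn.Module):
    """Final linear layer whose weights are frozen at polytope vertices.

    The weights are registered as a buffer, so they receive no gradient
    updates: the class prototypes never move, and the feature extractor
    must adapt to them instead. This is what keeps the geometric
    configuration of the features stationary across learning steps.
    """

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.register_buffer("weight", orthoplex_vertices(num_classes, feat_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Logits are dot products between features and the fixed vertices.
        return features @ self.weight.t()


# Example: a 100-class head on 512-dimensional features.
classifier = FixedPolytopeClassifier(feat_dim=512, num_classes=100)
logits = classifier(torch.randn(8, 512))  # -> shape (8, 100)
```

Because the unused vertices are reserved from the start, future classes can be assigned to them as they appear, which is how a fixed polytope classifier can accommodate incremental learning without moving existing prototypes.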
Year: 2021
Supervisors: Alberto Del Bimbo, Federico Pernici
Country: Italy
Author: Matteo Bruni
Files in this record:
File: PhD_Thesis___Incremental_Learning_of_Stationary_Representations.pdf
Description: Doctoral thesis
Type: Doctoral thesis
License: Open Access
Size: 11.69 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1237986