Deep Domain Adaptation for Pedestrian Detection in Thermal Imagery / Kieu My. - (2021).

Deep Domain Adaptation for Pedestrian Detection in Thermal Imagery

Kieu My
2021

Abstract

Pedestrian detection is a core problem in computer vision because it is central to a range of applications such as robotics, video surveillance, and advanced driver assistance systems. Despite this broad application and interest, it remains a challenging problem, in part because of the vast range of conditions under which it must be robust. In particular, pedestrian detectors must be reliable at nighttime and in adverse weather conditions, which is one reason why thermal and multispectral approaches have become popular in recent years. Moreover, thermal imagery is more privacy-preserving than visible-spectrum surveillance imagery.

However, pedestrian detection in the thermal domain remains a non-trivial task with much room for improvement. Thermal detection helps ameliorate some of the disadvantages of RGB detectors, such as illumination variation and the various complications of detection at nighttime, but detection using only thermal imagery still faces numerous challenges, including an overall lack of information in thermal images. Thermal images are typically low-resolution, which makes small pedestrians harder to detect. Finally, there is a general lack of thermal imagery for training state-of-the-art thermal detectors; the best pedestrian detectors available today work in the visible spectrum. In this thesis, we present three new types of domain adaptation approaches for pedestrian detection in thermal imagery and demonstrate how they mitigate the above challenges (privacy preservation, illumination variation, the scarcity of thermal training data, and the limited feature information in thermal images) and advance the state of the art.

Our first contribution consists of two \emph{bottom-up domain adaptation} approaches. We first show that simple bottom-up domain adaptation strategies with a pre-trained \emph{adapter} segment can better preserve features from source domains when transferring pre-trained models to the thermal domain. In a similar vein, we then show that bottom-up, \emph{layer-wise} adaptation consistently results in more effective domain transfer. Experimental results demonstrate the efficiency, flexibility, and potential of both bottom-up domain adaptation approaches.

Our second contribution, which addresses some limitations of domain adaptation to thermal imagery, is an approach based on task-conditioned networks that simultaneously solve two related tasks. A detection network is augmented with an auxiliary classification pipeline, which is tasked with classifying whether an input image was acquired during the day or at nighttime. The feature representation learned to solve this auxiliary classification task is then used to \emph{condition} convolutional layers in the main detector network. The experimental results indicate that task conditioning is an effective way to balance the trade-off between the effectiveness of thermal imagery at night and its weaknesses during the day.

Finally, our third contribution addresses the acute lack of training data for thermal-domain pedestrian detection. We propose an approach that uses GANs to generate synthetic thermal imagery as a form of generative data augmentation. Our experimental results demonstrate that synthetically generated thermal imagery can significantly reduce the need for massive amounts of annotated thermal pedestrian data.

Pedestrian detection in thermal imagery remains challenging. However, in this thesis, we have shown that our bottom-up and layer-wise domain adaptation methods, and especially the proposed task-conditioned network, can lead to robust pedestrian detection using thermal-only representations at detection time. This shows the potential of our proposed methods not only for domain adaptation of pedestrian detectors but also for other tasks. Moreover, our results with generated synthetic thermal images illustrate the potential of generative data augmentation for domain adaptation to thermal imagery.
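
The task-conditioning idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the auxiliary day/night representation conditions the detector's convolutional layers, so the channel-wise scale-and-shift used here is only an assumed mechanism, and every name, shape, and weight below is hypothetical rather than taken from the thesis.

```python
# Illustrative sketch only: channel-wise scale-and-shift conditioning is an
# ASSUMED mechanism; all names and shapes are hypothetical, not from the thesis.
import math


def linear(vec, weights, bias):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def day_night_probability(aux_features, w, b):
    """Auxiliary head: logistic score that the image was acquired by day."""
    logit = sum(wi * fi for wi, fi in zip(w, aux_features)) + b
    return 1.0 / (1.0 + math.exp(-logit))


def condition_feature_map(feature_map, aux_features,
                          gamma_w, gamma_b, beta_w, beta_b):
    """Scale and shift each channel of a detector feature map.

    feature_map:  list of channels, each a 2-D list (height x width).
    aux_features: embedding produced by the auxiliary day/night branch.
    """
    gammas = linear(aux_features, gamma_w, gamma_b)  # one scale per channel
    betas = linear(aux_features, beta_w, beta_b)     # one shift per channel
    return [
        [[g * x + b for x in row] for row in channel]
        for channel, g, b in zip(feature_map, gammas, betas)
    ]


# Toy example: two channels of size 1 x 2 and a 2-D auxiliary embedding.
features = [[[1.0, 2.0]], [[3.0, 4.0]]]
embedding = [1.0, 0.0]
conditioned = condition_feature_map(
    features, embedding,
    gamma_w=[[2.0, 0.0], [0.0, 5.0]], gamma_b=[0.0, 1.0],
    beta_w=[[1.0, 0.0], [0.0, 0.0]], beta_b=[0.0, -1.0],
)
```

In this sketch the same auxiliary embedding feeds both the day/night classifier and the conditioning parameters, so the detector's features are modulated by whatever the auxiliary branch has learned about illumination conditions.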
2021
Andrew D. Bagdanov
VIETNAM
Goal 9: Industry, Innovation, and Infrastructure
Kieu My
Files in this item:
File: PhD_thesis_KieuMy.pdf
Access: open access
Description: PhD thesis covering the domain adaptation approaches, detailed experiments, and comparisons with the state of the art
Type: Doctoral thesis
License: Open Access
Size: 30.75 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1238097