A Robust Initialization of Residual Blocks for Effective ResNet Training Without Batch Normalization / Civitelli, Enrico; Sortino, Alessio; Lapucci, Matteo; Bagattini, Francesco; Galvan, Giulio. - In: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS. - ISSN 2162-237X. - ELECTRONIC. - PP:(2023), pp. 1-6. [10.1109/TNNLS.2023.3325541]

A Robust Initialization of Residual Blocks for Effective ResNet Training Without Batch Normalization

Civitelli, Enrico (Methodology); Sortino, Alessio (Software); Lapucci, Matteo (Formal Analysis); Bagattini, Francesco (Member of the Collaboration Group); Galvan, Giulio (Conceptualization)
2023

Abstract

Batch normalization is an essential component of all state-of-the-art neural network architectures. However, since it introduces many practical issues, much recent research has been devoted to designing normalization-free architectures. In this brief, we show that weight initialization is key to training ResNet-like normalization-free networks. In particular, we propose a slight modification to the operation that sums a block's output with the skip-connection branch, so that the whole network is correctly initialized. We show that this modified architecture achieves competitive results on CIFAR-10, CIFAR-100 and ImageNet without further regularization or algorithmic modifications.
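
To illustrate why the way a block output is summed with the skip connection matters at initialization, the following is a minimal toy sketch and not the authors' method. It assumes each residual branch behaves at initialization like independent noise with the same variance as its input, and it compares the plain sum with a hypothetical variance-preserving sum scaled by 1/sqrt(2). The function names `combine_plain`, `combine_scaled` and `variance_after_blocks` are made up for this example; the exact modification proposed in the paper may differ.

```python
import math
import random


def variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def combine_plain(skip, branch):
    """Standard residual summation: skip + branch."""
    return [s + b for s, b in zip(skip, branch)]


def combine_scaled(skip, branch):
    """Hypothetical variance-preserving summation: (skip + branch) / sqrt(2).

    Assumption: both inputs are independent and have equal variance.
    This is an illustrative choice, not necessarily the paper's formula.
    """
    scale = 1.0 / math.sqrt(2.0)
    return [scale * (s + b) for s, b in zip(skip, branch)]


def variance_after_blocks(combine, num_blocks=10, width=20000, seed=0):
    """Propagate unit-variance noise through toy residual blocks.

    Each toy branch is modelled as fresh Gaussian noise whose standard
    deviation matches that of the current signal (an assumption standing
    in for a variance-preserving branch at initialization).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(num_blocks):
        std = math.sqrt(variance(x))
        branch = [rng.gauss(0.0, std) for _ in range(width)]
        x = combine(x, branch)
    return variance(x)


if __name__ == "__main__":
    print("plain sum :", variance_after_blocks(combine_plain))
    print("scaled sum:", variance_after_blocks(combine_scaled))
```

Under these assumptions, the plain sum roughly doubles the signal variance at every block, so it grows exponentially with depth, while the scaled sum keeps the variance close to its initial value. Batch normalization usually hides this effect, which is why the summation needs attention once normalization layers are removed.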
Files in this item:
File: A_Robust_Initialization_of_Residual_Blocks_for_Effective_ResNet_Training_Without_Batch_Normalization.pdf
Access: open access
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 3.65 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1342851