Novel approaches to improve the efficiency of Machine Learning models / Enrico Civitelli, Fabio Schoen, Marco Sciandrone. - (2023).
Novel approaches to improve the efficiency of Machine Learning models
Enrico Civitelli
Supervisors: Fabio Schoen, Marco Sciandrone
2023
Abstract
In this thesis, we discuss the problem of efficiency in Machine Learning from a general point of view. Specifically, we relate efficiency to the resources and time used in training or testing, and we propose four possible ways to increase it. Broadly speaking, efficiency can be improved through two main strategies: reducing the model's size or selecting a subset of good input features.

The first part of this dissertation concerns the problem of best subset selection in logistic regression. In some contexts, acquiring the input features can be expensive or difficult. In these situations, we need methods designed to achieve good predictive performance using only a subset of the features, since merely reducing the model's complexity would not help. We propose a feature selection algorithm based on a piecewise linear approximation of the logistic function, combined with an optimization problem solved by means of a two-block decomposition strategy.

The second part of this thesis addresses the problem of pruning the nodes of fully connected layers and the filters of convolutional layers in neural network architectures. Deep learning techniques are now applied in many different fields, so algorithms designed to reduce a model's complexity are extremely useful for widening the applicability of these models. We develop two algorithms based on different approaches: node pruning is built on top of the recently proposed training of neural networks in the spectral domain, while channel pruning is a preliminary analysis based on a novel bilevel approach that takes inspiration from neural architecture search.

The third part of this thesis concerns the problem of removing Batch Normalization from ResNet-like models. We show that weight initialization is key to training ResNet-like normalization-free networks, and we propose an effective initialization strategy for a slightly modified residual block. For all these parts, we present both theoretical and empirical results to support the soundness of the proposed approaches.
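The abstract only names the ingredients of the first part, so the following is a minimal sketch of the underlying idea rather than the thesis's actual formulation: because the logistic loss log(1 + e^(-z)) is convex, the pointwise maximum of a few of its tangent lines yields a piecewise linear under-approximation that exact (e.g. mixed-integer) subset selection solvers can express with linear constraints. The breakpoints and the tangent construction below are illustrative assumptions.

```python
import numpy as np

def logistic_loss(z):
    # log(1 + exp(-z)), computed in a numerically stable way
    return np.logaddexp(0.0, -z)

def piecewise_linear_loss(z, breakpoints=np.linspace(-5.0, 5.0, 7)):
    # The logistic loss is convex, so the pointwise maximum of its
    # tangent lines at a few breakpoints is a piecewise linear
    # under-approximation expressible with linear constraints.
    slopes = -1.0 / (1.0 + np.exp(breakpoints))      # loss derivative
    intercepts = logistic_loss(breakpoints) - slopes * breakpoints
    z = np.atleast_1d(z)
    return np.max(slopes[None, :] * z[:, None] + intercepts[None, :], axis=1)

if __name__ == "__main__":
    z = np.linspace(-6.0, 6.0, 5)
    # True loss vs. piecewise linear surrogate, side by side
    print(np.column_stack([logistic_loss(z), piecewise_linear_loss(z)]))
```

Adding more breakpoints tightens the approximation at the cost of more linear pieces in the resulting optimization problem.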
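For the node-pruning part, spectral-domain training associates learnable spectral coefficients (eigenvalues) with individual nodes, which can serve as importance scores. The sketch below shows only the generic final step under that assumption: given any per-node score vector, drop the least important hidden nodes of a two-layer block. The `scores` argument and the keep-ratio heuristic are hypothetical stand-ins for the thesis's spectral criterion.

```python
import torch
import torch.nn as nn

def prune_nodes(layer_in, layer_out, scores, keep_ratio=0.5):
    """Drop the hidden nodes with the smallest importance scores.

    `scores` is any per-node importance vector (a hypothetical
    stand-in for the eigenvalue-based spectral criterion).
    """
    k = max(1, int(keep_ratio * scores.numel()))
    keep = torch.topk(scores, k).indices.sort().values
    new_in = nn.Linear(layer_in.in_features, k)
    new_out = nn.Linear(k, layer_out.out_features)
    with torch.no_grad():
        # Keep only the rows/columns of the surviving hidden nodes
        new_in.weight.copy_(layer_in.weight[keep])
        new_in.bias.copy_(layer_in.bias[keep])
        new_out.weight.copy_(layer_out.weight[:, keep])
        new_out.bias.copy_(layer_out.bias)
    return new_in, new_out

# Example: keep the 32 most important of 128 hidden nodes.
f1, f2 = nn.Linear(784, 128), nn.Linear(128, 10)
scores = torch.rand(128)  # stand-in for spectral importance scores
p1, p2 = prune_nodes(f1, f2, scores, keep_ratio=0.25)
```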
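For the third part, the abstract does not specify the initialization scheme or the modified residual block, so the following is only a sketch of the general recipe behind normalization-free ResNets, in the spirit of Fixup-style schemes: scale the residual branch at initialization so that each block starts close to the identity, removing the need for Batch Normalization. The scaling exponent and the block layout are assumptions, not the thesis's exact strategy.

```python
import torch
import torch.nn as nn

class NFResidualBlock(nn.Module):
    """Residual block without Batch Normalization (illustrative sketch)."""

    def __init__(self, channels, num_blocks):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.relu = nn.ReLU(inplace=True)
        # Downscale the residual branch with network depth so its output
        # variance stays bounded at initialization (assumed Fixup-like
        # exponent, not the thesis's exact rule) ...
        nn.init.kaiming_normal_(self.conv1.weight, nonlinearity="relu")
        with torch.no_grad():
            self.conv1.weight.mul_(num_blocks ** -0.5)
        # ... and zero the last convolution so every block is exactly
        # the identity map at initialization.
        nn.init.zeros_(self.conv2.weight)

    def forward(self, x):
        return x + self.conv2(self.relu(self.conv1(x)))

# Example: a forward pass through one block of an 8-block network.
block = NFResidualBlock(channels=64, num_blocks=8)
y = block(torch.randn(2, 64, 32, 32))
```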
File | Description | Type | Access | License | Size | Format
---|---|---|---|---|---|---
Efficient_Machine_Learning_Models.pdf | Novel approaches to improve the efficiency of Machine Learning models | PhD thesis | Open access | Creative Commons | 4.32 MB | Adobe PDF
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.