An Efficient Optimization Approach for Best Subset Selection in Linear Regression, with Application to Model Selection and Fitting in Autoregressive Time-Series / Leonardo Di Gangi, Matteo Lapucci, Fabio Schoen, Alessio Sortino. - In: COMPUTATIONAL OPTIMIZATION AND APPLICATIONS. - ISSN 1573-2894. - PRINT. - 74:(2019), pp. 919-948. [10.1007/s10589-019-00134-5]
An Efficient Optimization Approach for Best Subset Selection in Linear Regression, with Application to Model Selection and Fitting in Autoregressive Time-Series
Leonardo Di Gangi; Matteo Lapucci; Fabio Schoen; Alessio Sortino
2019
Abstract
In this paper we consider two important optimization problems: selecting the best sparse linear regression model, and optimally identifying the parameters of autoregressive (AR) models from time series data. These problems, although different, are related and are usually solved through a sequence of separate steps that alternate between choosing a subset of features and finding a best-fit regression. We propose to model both problems as mixed-integer nonlinear optimization problems and present numerical procedures, based on state-of-the-art optimization tools, to solve them. The proposed approach has the advantage of treating model selection and parameter estimation as a single optimization problem. Numerical experiments on widely available datasets and on synthetic ones confirm the effectiveness of the approach, both in terms of the quality of the resulting models and in terms of CPU time.

File | Size | Format |
---|---|---|
main.pdf (open access; type: Preprint, submitted version; license: Open Access) | 1.19 MB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.