An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series / Di Gangi, L; Lapucci, M; Schoen, F; Sortino, A. - In: COMPUTATIONAL OPTIMIZATION AND APPLICATIONS. - ISSN 0926-6003. - PRINT. - 74:(2019), pp. 919-948. [10.1007/s10589-019-00134-5]
An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
Di Gangi, L; Lapucci, M; Schoen, F; Sortino, A
2019
Abstract
This paper considers two related optimization problems: selecting the best sparse linear regression model, and optimally identifying the parameters of autoregressive models from time series data. These problems are usually solved through a sequence of separate steps that alternate between choosing a subset of features and finding a best-fit regression. We propose to model both problems as mixed-integer nonlinear optimization problems and to solve them with numerical procedures based on state-of-the-art optimization tools. The advantage of this approach is that model selection and parameter estimation are treated as a single optimization problem. Numerical experiments on widely available datasets and on synthetic ones confirm the high quality of the approach, in terms of both the quality of the resulting models and CPU time.
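To illustrate how feature selection and parameter estimation can be merged into one problem, the following is a generic textbook sketch of cardinality-constrained least squares, not necessarily the formulation used in the paper. The symbols are assumptions introduced here: $y \in \mathbb{R}^n$ is the response, $X \in \mathbb{R}^{n \times p}$ the feature matrix, $\beta$ the coefficients, $z_j$ a binary indicator that feature $j$ is selected, $k$ a sparsity budget, and $M$ an assumed bound on the coefficient magnitudes.

$$
\begin{aligned}
\min_{\beta \in \mathbb{R}^p,\; z \in \{0,1\}^p} \quad & \lVert y - X\beta \rVert_2^2 \\
\text{s.t.} \quad & -M z_j \le \beta_j \le M z_j, \quad j = 1, \dots, p, \\
& \sum_{j=1}^{p} z_j \le k.
\end{aligned}
$$

Setting $z_j = 0$ forces $\beta_j = 0$, so a single solve chooses the subset and fits the coefficients at once. For an autoregressive model, the columns of $X$ would hold lagged values of the series under the same scheme. The paper's objective may differ from the plain residual sum of squares shown here.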
File | Size | Format | Access
---|---|---|---
s10589-019-00134-5.pdf | 1.04 MB | Adobe PDF | Closed access

Description: print. Type: publisher's PDF (version of record). License: all rights reserved.
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.