In this paper we consider two relevant optimization problems: the problem of selecting the best sparse linear regression model and the problem of optimally identifying the parameters of auto-regressive models based on time series data. Usually these problems, which although different are indeed related, are solved through a sequence of separate steps, alternating between choosing a subset of features and then finding a best fit regression. In this paper we propose to model both problems as mixed integer non linear optimization ones and propose numerical procedures based on state of the art optimization tools in order to solve both of them. The proposed approach has the advantage of considering both model selection as well as parameter estimation as a single optimization problem. Numerical experiments performed on widely available datasets as well as on synthetic ones confirm the high quality of our approach, both in terms of the quality of the resulting models and in terms of CPU time.

An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series / Di Gangi, L; Lapucci, M; Schoen, F; Sortino, A. - In: COMPUTATIONAL OPTIMIZATION AND APPLICATIONS. - ISSN 0926-6003. - STAMPA. - 74:(2019), pp. 919-948. [10.1007/s10589-019-00134-5]

An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series

Di Gangi, L;Lapucci, M;Schoen, F
;
Sortino, A
2019

Abstract

In this paper we consider two relevant optimization problems: the problem of selecting the best sparse linear regression model and the problem of optimally identifying the parameters of auto-regressive models based on time series data. Usually these problems, which although different are indeed related, are solved through a sequence of separate steps, alternating between choosing a subset of features and then finding a best fit regression. In this paper we propose to model both problems as mixed integer non linear optimization ones and propose numerical procedures based on state of the art optimization tools in order to solve both of them. The proposed approach has the advantage of considering both model selection as well as parameter estimation as a single optimization problem. Numerical experiments performed on widely available datasets as well as on synthetic ones confirm the high quality of our approach, both in terms of the quality of the resulting models and in terms of CPU time.
2019
74
919
948
Di Gangi, L; Lapucci, M; Schoen, F; Sortino, A
File in questo prodotto:
File Dimensione Formato  
s10589-019-00134-5.pdf

Accesso chiuso

Descrizione: stmpa
Tipologia: Pdf editoriale (Version of record)
Licenza: Tutti i diritti riservati
Dimensione 1.04 MB
Formato Adobe PDF
1.04 MB Adobe PDF   Richiedi una copia

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/1327931
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? 8
social impact