An evolutionary estimation procedure for Generalized Semilinear Regression Trees / Giulia Vannucci; Anna Gottard. - In: COMPUTATIONAL STATISTICS. - ISSN 0943-4062. - Electronic. - (2022), pp. 1-23. [10.1007/s00180-022-01302-8]

An evolutionary estimation procedure for Generalized Semilinear Regression Trees

Giulia Vannucci (Member of the Collaboration Group); Anna Gottard (Member of the Collaboration Group)
2022

Abstract

In many applications, the presence of interactions or even mild non-linearities can affect inference and predictions. For that reason, we suggest the use of a class of models lying between statistics and machine learning, and we propose a learning procedure for them. The models combine a linear part with a tree component that is selected via an evolutionary algorithm, and they can be adopted for any kind of response, such as continuous, categorical, or ordinal responses and survival times. They are inherently interpretable but more flexible than standard regression models, as they easily capture non-linear and interaction effects. The proposed genetic-like learning algorithm avoids a greedy search for the tree component. In a simulation study, we show that the proposed approach performs comparably to other machine learning algorithms, with a substantial gain in interpretability and transparency, and we illustrate the method on a real data set.
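
As a rough illustration of the idea in the abstract (a linear part plus a tree component chosen by a non-greedy, evolutionary search), the following is a minimal sketch and not the authors' algorithm. It assumes a continuous response, a single linear covariate x, and a tree reduced to one split on a second variable z, so the model is y = b0 + b1*x + g*1[z > cut]. All names (solve, fit, evolve_cut), the simulated data, and the mutation and selection settings are hypothetical choices made for this example.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None  # singular system, e.g. a split with an empty leaf
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(x, z, y, cut):
    """Least-squares fit of y = b0 + b1*x + g*1[z > cut]; returns (rss, coefs)."""
    rows = [[1.0, xi, 1.0 if zi > cut else 0.0] for xi, zi in zip(x, z)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coefs = solve(xtx, xty)
    if coefs is None:
        return float("inf"), None
    rss = sum((yi - sum(c * v for c, v in zip(coefs, r))) ** 2
              for r, yi in zip(rows, y))
    return rss, coefs

def evolve_cut(x, z, y, pop_size=20, generations=30, seed=0):
    """Toy evolutionary search over the split point (assumed settings)."""
    rng = random.Random(seed)
    lo, hi = min(z), max(z)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the half of the population with the lowest RSS.
        pop.sort(key=lambda c: fit(x, z, y, c)[0])
        parents = pop[: pop_size // 2]
        # Mutation: perturb randomly chosen parents, clamped to the range of z.
        children = [min(hi, max(lo, rng.choice(parents)
                                + rng.gauss(0, 0.1 * (hi - lo))))
                    for _ in parents]
        pop = parents + children
    best = min(pop, key=lambda c: fit(x, z, y, c)[0])
    return best, fit(x, z, y, best)[1]

# Simulated data: linear effect of x plus a jump of 3 when z exceeds 0.6.
data_rng = random.Random(1)
x = [data_rng.uniform(-1, 1) for _ in range(200)]
z = [data_rng.uniform(0, 1) for _ in range(200)]
y = [1.0 + 2.0 * xi + (3.0 if zi > 0.6 else 0.0) + data_rng.gauss(0, 0.1)
     for xi, zi in zip(x, z)]

best_cut, coefs = evolve_cut(x, z, y)
print("split point:", best_cut)
print("b0, b1, g:", coefs)
```

The fitted model stays readable as a regression table with one extra, rule-based term, which is the kind of interpretability the abstract refers to. A fuller version would evolve whole trees (split variables, depths, and thresholds) and replace least squares with the likelihood appropriate to the response type.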
Year: 2022
Pages: 1-23
Authors: Giulia Vannucci; Anna Gottard
Files in this product:
File: s00180-022-01302-8.pdf
Access: Closed access
Type: Final refereed version (Postprint, Accepted manuscript)
License: All rights reserved
Size: 1.75 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1288864