An evolutionary estimation procedure for Generalized Semilinear Regression Trees / Giulia Vannucci; Anna Gottard. - In: COMPUTATIONAL STATISTICS. - ISSN 0943-4062. - ELECTRONIC. - (2022), pp. 1-23. [10.1007/s00180-022-01302-8]
An evolutionary estimation procedure for Generalized Semilinear Regression Trees
Giulia Vannucci (Member of the Collaboration Group); Anna Gottard (Member of the Collaboration Group)
2022
Abstract
In many applications, the presence of interactions or even mild non-linearities can affect inference and predictions. For that reason, we suggest the use of a class of models lying between statistics and machine learning, and we propose a learning procedure. The models combine a linear part and a tree component that is selected via an evolutionary algorithm, and they can be adopted for any kind of response, such as continuous, categorical, and ordinal responses and survival times. They are inherently interpretable but more flexible than standard regression models, as they easily capture non-linear and interaction effects. The proposed genetic-like learning algorithm avoids a greedy search for the tree component. In a simulation study, we show that the proposed approach performs comparably to other machine learning algorithms, with a substantial gain in interpretability and transparency, and we illustrate the method on a real data set.

File | Size | Format |
---|---|---|
s00180-022-01302-8.pdf. Closed access. Type: Final refereed version (Postprint, Accepted manuscript). License: All rights reserved. | 1.75 MB | Adobe PDF |
Documents in FLORE are protected by copyright, and all rights are reserved unless otherwise indicated.