Tree-based regression models are a class of statistical models for predicting continuous response variables when the shape of the regression function is unknown. They naturally take into account both non-linearities and interactions. However, they struggle with linear and quasi-linear effects and assume iid data. This article proposes two new algorithms for jointly estimating an interpretable predictive mixed-effect model with two components: a linear part, capturing the main effects, and a non-parametric component consisting of three trees for capturing non-linearities and interactions among individual-level predictors, among cluster-level predictors or cross-level. The first proposed algorithm focuses on prediction. The second one is an extension which implements a post-selection inference strategy to provide valid inference. The performance of the two algorithms is validated via Monte Carlo studies. An application on INVALSI data illustrates the potentiality of the proposed approach.
Mixed-effect models with trees / Anna Gottard, Giulia Vannucci, Leonardo Grilli, Carla Rampichini. - In: ADVANCES IN DATA ANALYSIS AND CLASSIFICATION. - ISSN 1862-5355. - ELETTRONICO. - 17:(2023), pp. 431-461. [10.1007/s11634-022-00509-3]
Mixed-effect models with trees
Anna Gottard
;Giulia Vannucci;Leonardo Grilli;Carla Rampichini
2023
Abstract
Tree-based regression models are a class of statistical models for predicting continuous response variables when the shape of the regression function is unknown. They naturally take into account both non-linearities and interactions. However, they struggle with linear and quasi-linear effects and assume iid data. This article proposes two new algorithms for jointly estimating an interpretable predictive mixed-effect model with two components: a linear part, capturing the main effects, and a non-parametric component consisting of three trees for capturing non-linearities and interactions among individual-level predictors, among cluster-level predictors or cross-level. The first proposed algorithm focuses on prediction. The second one is an extension which implements a post-selection inference strategy to provide valid inference. The performance of the two algorithms is validated via Monte Carlo studies. An application on INVALSI data illustrates the potentiality of the proposed approach.File | Dimensione | Formato | |
---|---|---|---|
s11634-022-00509-3.pdf
accesso aperto
Tipologia:
Pdf editoriale (Version of record)
Licenza:
Open Access
Dimensione
725.4 kB
Formato
Adobe PDF
|
725.4 kB | Adobe PDF |
I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.