Interpretable semilinear regression trees / Giulia Vannucci. - (2019).

Interpretable semilinear regression trees

Giulia Vannucci
2019

Abstract

Tree-based methods are a class of predictive models widely used in many scientific areas. Regression trees partition the variable space into a set of hyperrectangles and fit a model within each of them. They are conceptually simple, apparently easy to interpret, and capable of dealing with nonlinearities and interactions. Random forests are ensembles of regression trees, each constructed on a subsample of statistical units and on a randomly selected subset of explanatory variables. The prediction is a combination of the predictions of these trees. Despite the loss in interpretability, random forests have achieved great success thanks to their high predictive performance. The aim of this thesis is to propose a class of models that combine a linear component and a tree and are able to discover the relevant variables directly influencing a response. The proposal is a semilinear model that can handle linear and nonlinear dependencies and maintains good predictive performance while ensuring a simple and intuitive interpretation in a generative model sense. Moreover, two algorithms for estimation are proposed: a two-stage procedure based on a backfitting algorithm and one based on evolutionary algorithms.
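
As an illustration only, the idea of alternating between a linear component and a tree by backfitting can be sketched as follows. This is a minimal sketch under strong assumptions, not the algorithm of the thesis: there is one linear predictor and one tree predictor, the tree is reduced to a single-split stump, the data are simulated, and all names (`fit_stump`, `backfit`, `x_lin`, `x_tree`) are hypothetical.

```python
import random


def fit_stump(x, r):
    """Depth-1 regression tree: the single split on x that minimises
    the squared error of r, predicting the mean of r on each side."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        if x[left[-1]] == x[right[0]]:
            continue  # cannot split between tied values
        m_left = sum(r[i] for i in left) / len(left)
        m_right = sum(r[i] for i in right) / len(right)
        sse = (sum((r[i] - m_left) ** 2 for i in left)
               + sum((r[i] - m_right) ** 2 for i in right))
        if best is None or sse < best[0]:
            threshold = (x[left[-1]] + x[right[0]]) / 2
            best = (sse, threshold, m_left, m_right)
    if best is None:  # all x values equal: fall back to a constant
        mean_r = sum(r) / len(r)
        return lambda v: mean_r
    _, threshold, m_left, m_right = best
    return lambda v: m_left if v <= threshold else m_right


def backfit(x_lin, x_tree, y, n_iter=20):
    """Alternate between a least-squares slope on x_lin and a stump on
    x_tree, each fitted to the partial residuals left by the other."""
    n = len(y)
    alpha = sum(y) / n                      # intercept: mean response
    x_bar = sum(x_lin) / n
    xc = [v - x_bar for v in x_lin]         # centred linear predictor
    sxx = sum(v * v for v in xc)
    beta = 0.0
    tree = lambda v: 0.0
    for _ in range(n_iter):
        # Linear step: regress y minus the tree part on x_lin.
        r = [y[i] - alpha - tree(x_tree[i]) for i in range(n)]
        beta = sum(xc[i] * r[i] for i in range(n)) / sxx
        # Tree step: fit the stump to y minus the linear part.
        r = [y[i] - alpha - beta * xc[i] for i in range(n)]
        stump = fit_stump(x_tree, r)
        # Centre the tree component so the intercept stays identifiable.
        shift = sum(stump(v) for v in x_tree) / n
        tree = lambda v, s=stump, m=shift: s(v) - m
    return alpha, beta, x_bar, tree


# Simulated data (an assumption for the demo): a linear effect of x1
# with slope 2 plus a jump of 3 in x2 at 0.5, with small Gaussian noise.
random.seed(0)
x1 = [random.uniform(-1, 1) for _ in range(200)]
x2 = [random.uniform(0, 1) for _ in range(200)]
y = [2 * a + (3 if b > 0.5 else 0) + random.gauss(0, 0.1)
     for a, b in zip(x1, x2)]

alpha, beta, x_bar, tree = backfit(x1, x2, y)
print("estimated slope:", beta)
print("estimated jump:", tree(0.9) - tree(0.1))
```

In this toy setting the slope plays the role of the interpretable linear component and the stump captures the nonlinear part; a fitted value for a new unit would be `alpha + beta * (x1_new - x_bar) + tree(x2_new)`.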
2019
Anna Gottard
Italy
Giulia Vannucci
Files in this record:
File: Thesis_Phd_GiuliaVannucci.pdf
Access: open access
Type: Doctoral thesis
License: Open Access
Size: 2.39 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1150170