Many environments within the human body host a collection of micro-organisms called microbiota. Recent findings have linked the composition of the microbiota to the development of different human diseases, including cancer. Motivated by a recent colorectal cancer (CRC) study, we investigate the effect of clinical factors and diet-related covariates on microbiota composition; for the patients enrolled in this study, microbiota abundance counts are collected from three different districts, namely, tumor, fecal and salivary samples. Building upon the Dirichlet-multinomial regression framework, we develop a high-dimensional Bayesian hierarchical model that exploits subject-specific regression coefficients to simultaneously borrow strength across districts and include complex interactions between diet and clinical factors if supported by the data. The proposed method identifies relevant associations through model selection priors and thresholding mechanisms. Posterior inference is performed via a Markov chain Monte Carlo algorithm. We use simulation studies to assess the performance of our method and find that our approach outperforms competing methods that do not account for complex interactions. Finally, a thorough analysis of the CRC data illustrates the benefits of the proposed approach.

SUBJECT-SPECIFIC DIRICHLET-MULTINOMIAL REGRESSION FOR MULTI-DISTRICT MICROBIOTA DATA ANALYSIS / Pedone M.; Amedei A.; Stingo F.C. - In: THE ANNALS OF APPLIED STATISTICS. - ISSN 1932-6157. - PRINT. - 17:(2023), pp. 539-559. [10.1214/22-AOAS1641]

SUBJECT-SPECIFIC DIRICHLET-MULTINOMIAL REGRESSION FOR MULTI-DISTRICT MICROBIOTA DATA ANALYSIS

Pedone M.; Amedei A.; Stingo F.C.
2023

File in this record:
AOAS2023_Pedone.pdf

Closed access

Type: Publisher's PDF (Version of record)
License: All rights reserved
Size: 320.41 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1299319