SUBJECT-SPECIFIC DIRICHLET-MULTINOMIAL REGRESSION FOR MULTI-DISTRICT MICROBIOTA DATA ANALYSIS / Pedone M.; Amedei A.; Stingo F.C. - In: THE ANNALS OF APPLIED STATISTICS. - ISSN 1932-6157. - PRINT. - 17:(2023), pp. 539-559. [10.1214/22-AOAS1641]
SUBJECT-SPECIFIC DIRICHLET-MULTINOMIAL REGRESSION FOR MULTI-DISTRICT MICROBIOTA DATA ANALYSIS
Pedone M.; Amedei A.; Stingo F. C.
2023
Abstract
Many environments within the human body host a collection of micro-organisms called microbiota. Recent findings have linked the composition of the microbiota to the development of different human diseases, including cancer. Motivated by a recent colorectal cancer (CRC) study, we investigate the effect of clinical factors and diet-related covariates on microbiota composition; for the patients enrolled in this study, microbiota abundance counts are collected from three different districts, namely, tumor, fecal and salivary samples. Building upon the Dirichlet-multinomial regression framework, we develop a high-dimensional Bayesian hierarchical model that exploits subject-specific regression coefficients to simultaneously borrow strength across districts and include complex interactions between diet and clinical factors if supported by the data. The proposed method identifies relevant associations through model selection priors and thresholding mechanisms. Posterior inference is performed via a Markov chain Monte Carlo algorithm. We use simulation studies to assess the performance of our method and find that our approach outperforms competing methods that do not account for complex interactions. Finally, a thorough analysis of the CRC data illustrates the benefits of the proposed approach.

File | Size | Format
---|---|---
AOAS2023_Pedone.pdf (closed access; type: publisher's PDF, version of record; license: all rights reserved) | 320.41 kB | Adobe PDF
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.