Motivation: Discriminant analysis is an effective tool for the classification of experimental units into groups. Here we consider the typical problem of classifying subjects according to phenotypes via gene expression data and propose a method that incorporates variable selection into the inferential procedure, for the identification of the important biomarkers. To achieve this goal we build upon a conjugate normal discriminant model, both linear and quadratic, and include a stochastic search variable selection procedure via an MCMC algorithm. Furthermore, we incorporate into the model prior information on the relationships among the genes as described by a gene-gene network. We use a Markov random field (MRF) prior to map the network connections among genes. Our prior model assumes that neighboring genes in the network are more likely to have a joint effect on the relevant biological processes. Results: We use simulated data to assess performances of our method. In particular we compare the MRF prior to a situation where independent Bernoulli priors are chosen for the individual predictors. We also illustrate the method on benchmark datasets for gene expression. Our simulation studies show that employing the MRF prior improves on selection accuracy. In real data applications, in addition to identifying markers and improving prediction accuracy, we show how the integration of existing biological knowledge into the prior model results in an increased ability to identify genes with strong discriminatory power and also aids the interpretation of the results.

Variable Selection for Discriminant Analysis with Markov RandomField Priors for the Analysis of Microarray Data / F.C. Stingo; M. Vannucci. - In: BIOINFORMATICS. - ISSN 1367-4803. - STAMPA. - 27 (4):(2011), pp. 495-501.

Variable Selection for Discriminant Analysis with Markov RandomField Priors for the Analysis of Microarray Data

STINGO, FRANCESCO CLAUDIO;
2011

Abstract

Motivation: Discriminant analysis is an effective tool for the classification of experimental units into groups. Here we consider the typical problem of classifying subjects according to phenotypes via gene expression data and propose a method that incorporates variable selection into the inferential procedure, for the identification of the important biomarkers. To achieve this goal we build upon a conjugate normal discriminant model, both linear and quadratic, and include a stochastic search variable selection procedure via an MCMC algorithm. Furthermore, we incorporate into the model prior information on the relationships among the genes as described by a gene-gene network. We use a Markov random field (MRF) prior to map the network connections among genes. Our prior model assumes that neighboring genes in the network are more likely to have a joint effect on the relevant biological processes. Results: We use simulated data to assess performances of our method. In particular we compare the MRF prior to a situation where independent Bernoulli priors are chosen for the individual predictors. We also illustrate the method on benchmark datasets for gene expression. Our simulation studies show that employing the MRF prior improves on selection accuracy. In real data applications, in addition to identifying markers and improving prediction accuracy, we show how the integration of existing biological knowledge into the prior model results in an increased ability to identify genes with strong discriminatory power and also aids the interpretation of the results.
2011
27 (4)
495
501
F.C. Stingo; M. Vannucci
File in questo prodotto:
File Dimensione Formato  
Bioinformatics-2010-Stingo-bioinformatics_btq690[1].pdf

Accesso chiuso

Tipologia: Pdf editoriale (Version of record)
Licenza: Tutti i diritti riservati
Dimensione 210.85 kB
Formato Adobe PDF
210.85 kB Adobe PDF   Richiedi una copia

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/422269
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 52
  • ???jsp.display-item.citation.isi??? ND
social impact