Bayesian networks offer an extremely flexible environment for knowledge representation, and they are often claimed to be the best statistical tools to support medical diagnosis. An Acyclic Directed Graph encodes causal relationships and provides a factorization of the joint probability distribution according to conditional independence properties. Although the knowledge needed to specify the Acyclic Directed Graph is easily retrievable, the information needed to develop the quantitative part of the network is typically scattered and of varying quality. For instance, the medical literature seldom covers all aspects of interest, and clinical data are typically sparse. This is why the most relevant applications of Bayesian networks to medical diagnosis are built entirely from expert knowledge. In that case, however, the accuracy of the quantitative part remains questionable, since the elicited information is unlikely to be fully trustworthy. In this thesis, the quantitative part of a Bayesian network for the diagnosis of cardiopulmonary diseases is estimated by combining elicitation from medical experts and clinical data. An original elicitation framework is developed to accurately quantify expert uncertainty on parameters; prior distributions are then updated in the light of data by means of Markov Chain Monte Carlo methods. The framework includes several generalizations of the Noisy-Or model and a Generalized Beta regression, which are exploited to avoid polytomization of continuous variables. The number of parameters is reduced dramatically with respect to the traditional framework, while a rescaling procedure based on a distinction between normal and pathological values makes parameters meaningful for medical experts. As such, the framework makes it possible either to incorporate the sample size of clinical studies into the prior distributions or, when no study provides sufficiently detailed information, to compute expert uncertainty on assessments as if they were based on a virtual experiment. Because two different sources of knowledge are used, the consistency between the model and the data is readily checked by inspecting prior-to-posterior divergence. This enables a proper refinement of the Bayesian network in a cyclic, iterative fashion, possibly questioning the specification of the Acyclic Directed Graph, a feature beyond the capability of applications built entirely from either expert knowledge or data.
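
To make two of the ideas in the abstract concrete, the following is a minimal Python sketch, not the thesis's actual implementation. It shows the textbook Noisy-Or model (the thesis uses generalizations that are not reproduced here) and a Beta prior whose weight is set by a virtual sample size. The function names, the example probabilities, and the data counts are all hypothetical. A closed-form conjugate update stands in for the Markov Chain Monte Carlo methods used in the thesis, which is only valid for this single-parameter illustration.

```python
def noisy_or(link_probs, active, leak=0.0):
    """P(effect present | causes) under the textbook Noisy-Or model.

    link_probs[i] is the probability that cause i alone produces the effect;
    active[i] says whether cause i is present; leak covers unmodelled causes.
    """
    p_no_effect = 1.0 - leak
    for p, on in zip(link_probs, active):
        if on:
            p_no_effect *= 1.0 - p  # each active cause independently fails
    return 1.0 - p_no_effect


def beta_prior_from_virtual_experiment(expert_guess, virtual_n):
    """Beta(a, b) prior centred on the expert's guess.

    virtual_n (an assumption of this sketch) is the size of the imaginary
    experiment the expert's assessment is treated as being based on.
    """
    return expert_guess * virtual_n, (1.0 - expert_guess) * virtual_n


def beta_update(a, b, successes, failures):
    """Conjugate Beta-Binomial update (a simplification replacing MCMC)."""
    return a + successes, b + failures


def beta_mean(a, b):
    return a / (a + b)


if __name__ == "__main__":
    # Two hypothetical causes, both present, no leak.
    print(noisy_or([0.8, 0.5], [True, True]))

    # Expert guess 0.3, treated as if observed in 20 virtual cases,
    # then updated with 12 positives out of 30 hypothetical clinical cases.
    a0, b0 = beta_prior_from_virtual_experiment(0.3, 20)
    a1, b1 = beta_update(a0, b0, successes=12, failures=18)
    print(beta_mean(a0, b0), beta_mean(a1, b1))
```

With n causes, Noisy-Or needs one link probability per cause (plus an optional leak) instead of a full table with 2^n entries, which illustrates the kind of parsimony the abstract refers to. Comparing the prior and posterior means gives only a crude view of how far the data moved the expert's assessment; it is not the divergence measure used in the thesis.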

A Bayesian network for the diagnosis of cardiopulmonary diseases: Learning from medical experts and clinical data / Alessandro Magrini. - (2014).


2014
Federico Mattia Stefanini
Goal 3: Good health and well-being for people
Alessandro Magrini

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/841701