In forensic science, the rare type match problem arises when the matching characteristic from the suspect and the crime scene is not in the reference database; hence, it is difficult to evaluate the likelihood ratio that compares the defense and prosecution hypotheses. A recent solution consists of modeling the ordered population probabilities according to the two-parameter Poisson–Dirichlet distribution, which is a well-known Bayesian nonparametric prior, and plugging the maximum likelihood estimates of the parameters into the likelihood ratio. We demonstrate that this approximation produces a systematic bias that fully Bayesian inference avoids. Motivated by this forensic application, we consider the need to learn the posterior distribution of the parameters that governs the two-parameter Poisson–Dirichlet using two sampling methods: Markov Chain Monte Carlo and approximate Bayesian computation. These methods are evaluated in terms of accuracy and efficiency. Finally, we compare the likelihood ratio that is obtained by our proposal with the existing solution using a database of Y-chromosome haplotypes.

Learning the two parameters of the Poisson–Dirichlet distribution with a forensic application / Giulia Cereda; Fabio Corradi; Cecilia Viscardi. - In: SCANDINAVIAN JOURNAL OF STATISTICS. - ISSN 0303-6898. - ELETTRONICO. - (2022), pp. 1-21. [10.1111/sjos.12575]

Learning the two parameters of the Poisson–Dirichlet distribution with a forensic application

Giulia Cereda
;
Fabio Corradi;Cecilia Viscardi
2022

Abstract

In forensic science, the rare type match problem arises when the matching characteristic from the suspect and the crime scene is not in the reference database; hence, it is difficult to evaluate the likelihood ratio that compares the defense and prosecution hypotheses. A recent solution consists of modeling the ordered population probabilities according to the two-parameter Poisson–Dirichlet distribution, which is a well-known Bayesian nonparametric prior, and plugging the maximum likelihood estimates of the parameters into the likelihood ratio. We demonstrate that this approximation produces a systematic bias that fully Bayesian inference avoids. Motivated by this forensic application, we consider the need to learn the posterior distribution of the parameters that governs the two-parameter Poisson–Dirichlet using two sampling methods: Markov Chain Monte Carlo and approximate Bayesian computation. These methods are evaluated in terms of accuracy and efficiency. Finally, we compare the likelihood ratio that is obtained by our proposal with the existing solution using a database of Y-chromosome haplotypes.
2022
1
21
Giulia Cereda; Fabio Corradi; Cecilia Viscardi
File in questo prodotto:
File Dimensione Formato  
Scandinavian J Statistics - 2022 - Cereda - Learning the two parameters of the Poisson Dirichlet distribution with a.pdf

accesso aperto

Tipologia: Pdf editoriale (Version of record)
Licenza: Open Access
Dimensione 2.18 MB
Formato Adobe PDF
2.18 MB Adobe PDF

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/1260650
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact