Learning the two parameters of the Poisson–Dirichlet distribution with a forensic application

Cereda, Giulia; Corradi, Fabio; Viscardi, Cecilia

doi:10.1111/sjos.12575

In forensic science, the rare type match problem arises when the matching characteristic from the suspect and the crime scene is not in the reference database; hence, it is difficult to evaluate the likelihood ratio that compares the defense and prosecution hypotheses. A recent solution consists of modeling the ordered population probabilities according to the two-parameter Poisson–Dirichlet distribution, which is a well-known Bayesian nonparametric prior, and plugging the maximum likelihood estimates of the parameters into the likelihood ratio. We demonstrate that this approximation produces a systematic bias that fully Bayesian inference avoids. Motivated by this forensic application, we consider the need to learn the posterior distribution of the parameters that govern the two-parameter Poisson–Dirichlet distribution using two sampling methods: Markov chain Monte Carlo and approximate Bayesian computation. These methods are evaluated in terms of accuracy and efficiency. Finally, we compare the likelihood ratio obtained by our proposal with the existing solution using a database of Y-chromosome haplotypes.
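
As an illustration of what learning the two parameters involves, the sketch below evaluates the log-probability of a database partition under the two-parameter Poisson–Dirichlet model using the standard Ewens–Pitman sampling formula. It is not taken from this record: the function name `log_eppf` and its arguments are hypothetical, and the likelihood and samplers actually used by the authors may differ. The parameter names `alpha` (discount) and `theta` (concentration) follow common usage and are likewise an assumption.

```python
import math


def log_eppf(block_sizes, alpha, theta):
    """Log-probability of a set partition with the given block sizes under
    the Ewens-Pitman sampling formula (hypothetical helper, for illustration).

    block_sizes: positive integers, e.g. how many times each distinct
                 haplotype appears in the database.
    alpha:       discount parameter, 0 <= alpha < 1.
    theta:       concentration parameter, theta > -alpha.
    """
    if not (0.0 <= alpha < 1.0) or theta <= -alpha:
        raise ValueError("require 0 <= alpha < 1 and theta > -alpha")
    if any(nj < 1 for nj in block_sizes):
        raise ValueError("block sizes must be positive integers")

    n = sum(block_sizes)   # database size
    k = len(block_sizes)   # number of distinct types

    # Numerator term: prod_{i=1}^{k-1} (theta + i * alpha)
    log_p = sum(math.log(theta + i * alpha) for i in range(1, k))

    # Denominator: rising factorial (theta + 1)_{n-1}
    #            = Gamma(theta + n) / Gamma(theta + 1)
    log_p -= math.lgamma(theta + n) - math.lgamma(theta + 1.0)

    # Per-block term: (1 - alpha)_{n_j - 1}
    #               = Gamma(n_j - alpha) / Gamma(1 - alpha)
    log_p += sum(
        math.lgamma(nj - alpha) - math.lgamma(1.0 - alpha)
        for nj in block_sizes
    )
    return log_p
```

Under these assumptions, maximizing `log_eppf` over `(alpha, theta)` would give a plug-in estimate of the kind the abstract describes, whereas a fully Bayesian treatment would combine the same quantity with a prior and explore the resulting posterior, for example by Markov chain Monte Carlo.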

Learning the two parameters of the Poisson–Dirichlet distribution with a forensic application / Giulia Cereda; Fabio Corradi; Cecilia Viscardi. - In: SCANDINAVIAN JOURNAL OF STATISTICS. - ISSN 0303-6898. - ELECTRONIC. - 50:(2023), pp. 120-141. [10.1111/sjos.12575]