Biogeographical Ancestry Prediction via an Innovative Panel: Difficult Task or Complex Phenomenon? / Grazzini, C., Spera, G., Castellana, D., Morelli, S., Pilli, E., Baccini, M., Cereda, G. - ELECTRONIC. - (2025), pp. 363-368. (Scientific Meeting of the Italian Statistical Society) [10.1007/978-3-031-95995-0_60].
Biogeographical Ancestry Prediction via an Innovative Panel: Difficult Task or Complex Phenomenon?
Grazzini, Cosimo; Spera, Giorgia; Castellana, Daniele; Morelli, Stefania; Pilli, Elena; Baccini, Michela; Cereda, Giulia
2025
Abstract
The BioGeographical Ancestry (BGA) of an individual can be inferred from their Deoxyribonucleic Acid (DNA), particularly by using Single-Nucleotide Polymorphism (SNP) markers. This short paper aimed to predict continental BGA by adopting a supervised Machine Learning (ML) method and relying on an innovative SNP panel. Starting from individuals with known BGA, a model pipeline was applied within a nested cross-validation strategy to perform model selection and assessment. The results showed that the novel panel discriminates well and that the misclassification patterns are plausible: they may be connected more to the complexity of the phenomenon than to inference problems. These patterns call for a discussion of BGA uncertainty. The findings lay the groundwork for further research with the ultimate purpose of inferring BGA at a finer level.



