Addressing bias in epidemiological studies based on electronic healthcare databases / Giorgio Limoncella. - (2025).

Addressing bias in epidemiological studies based on electronic healthcare databases

Giorgio Limoncella
2025

Abstract

Real-world data (RWD) from electronic healthcare databases have become a crucial tool for regulatory decision-making, offering valuable insights into drug safety and effectiveness. However, studies using RWD face significant challenges, including confounding, bias due to misclassification, and, from a study replicability perspective, data diversity. This dissertation does not analyze confounding; it focuses on developing methodological solutions to mitigate the misclassification of study outcomes introduced by case-finding algorithms (CFAs) and to enhance the reliability of RWD-based studies. The first part of the dissertation addresses the issue of low sensitivity in CFAs. It introduces a novel approach that integrates a primary and a screening algorithm to improve sensitivity and provide robust correction mechanisms, even when sensitivity is non-differential. Specifically, this work proposes methods to estimate the sensitivity of CFAs, test for differential sensitivity, and correct for misclassification bias in measures such as the number of cases, risk, risk ratio, and risk difference. These methods are tested through simulation studies and, in the second part of the dissertation, applied in three validation studies conducted at Careggi Hospital in Florence, including two post-authorization safety studies. The third part of the dissertation focuses on the role of RWD in studying drug use during pregnancy. Pregnant women are often excluded from clinical trials, which makes RWD indispensable for assessing drug safety in this population. One of the main challenges is that pregnancy-finding algorithms may be affected by selection bias, because unfavorable outcomes are difficult to identify. Avoiding this selection bias is essential for accurately assessing drug safety. To address this, advanced methodologies, including random forest models, have been developed to enhance the detection and comprehensiveness of pregnancy episodes, even when key outcomes are missing or uncertain. These approaches aim to provide more reliable and robust results in pregnancy studies. Overall, this dissertation contributes to improving the validity of pharmacoepidemiological research based on RWD by addressing key methodological challenges related to misclassification.
Leonardo Grilli
Italy
Files in this item:
File: Tesi_PhD_Limoncella.pdf
Embargoed until 26/04/2027
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 16.29 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1420676