From Controlled to Undisciplined Data: Estimating Causal Effects in the Era of Data Science Using a Potential Outcome Framework / Dominici, Francesca; Bargagli-Stoffi, Falco J.; Mealli, Fabrizia. - In: HARVARD DATA SCIENCE REVIEW. - ISSN 2644-2353. - Electronic. - 3:(2021), pp. 0-0. [10.1162/99608f92.8102afed]

From Controlled to Undisciplined Data: Estimating Causal Effects in the Era of Data Science Using a Potential Outcome Framework

Mealli, Fabrizia
2021

Abstract

This article discusses the fundamental principles of causal inference, the area of statistics that estimates the effect of specific occurrences, treatments, interventions, and exposures on a given outcome from experimental and observational data. We explain the key assumptions required to identify causal effects and highlight the challenges associated with the use of observational data. We emphasize that experimental thinking is crucial in causal inference. The quality of the data (not necessarily the quantity), the study design, the degree to which the assumptions are met, and the rigor of the statistical analysis are what allow us to credibly infer causal effects. Although we advocate the use of big data and machine learning (ML) algorithms for estimating causal effects, they are not a substitute for thoughtful study design. Concepts are illustrated via examples.
Dominici, Francesca; Bargagli-Stoffi, Falco J.; Mealli, Fabrizia
Files in this record:
File: HDSR.pdf
Access: open access
Type: Final refereed version (Postprint, Accepted manuscript)
License: Open Access
Size: 1.2 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1262520