Operationalizing Calibration for Fair Educational Artificial Intelligence / M. Mancini, D. Merlini, M. C. Verri. - ELECTRONIC. - 16438:(2026), pp. 183-198. [10.1007/978-3-032-17604-2_17]

Operationalizing Calibration for Fair Educational Artificial Intelligence

M. Mancini; D. Merlini; M. C. Verri
2026

Abstract

The growing use of artificial intelligence in education calls for rigorous methods to evaluate fairness in algorithmic decision-making. This paper focuses on calibration as an operationalization of the sufficiency criterion and develops a method to compute it in educational contexts, where prediction is often defined in terms of individual scores rather than a simple success or failure outcome. We first discuss the conceptual foundations of sufficiency and its specific relevance for predictions expressed as scores or probabilities. We then propose a procedure for measuring fairness through calibration, addressing key challenges such as the aggregation of outcomes across multiple predicted values, the ordering of groups in difference measures, and the treatment of cases where data availability is unbalanced across groups. The proposed procedure is grounded in design choices that aim to preserve the interpretive meaning of data in terms of fairness, while relying on established and transparent statistical methods. The method is empirically applied to two real-world student performance datasets using different classification algorithms. The results illustrate both the feasibility of the approach and the methodological implications of the design choices required to operationalize calibration. The contribution of this study lies primarily in providing a structured framework for measuring sufficiency through calibration, enabling researchers and practitioners to better assess fairness in artificial intelligence systems for education.
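
To make the idea of calibration as a sufficiency check concrete, the following is a minimal sketch and not the procedure developed in the paper. It assumes predicted scores in [0, 1], binary outcomes, equal-width score bins, an absolute difference (which sidesteps the question of group ordering), and count-weighted aggregation over bins observed for both groups. The function names `calibration_by_group` and `calibration_gap` are hypothetical.

```python
from collections import defaultdict


def calibration_by_group(scores, outcomes, groups, n_bins=5):
    """Observed positive rate per (group, score bin).

    scores:   predicted probabilities in [0, 1]
    outcomes: actual outcomes coded as 0 or 1
    groups:   group label for each record
    Returns {group: {bin_index: (positive_rate, count)}}.
    """
    # For each group and bin, keep [number of positives, number of records].
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for s, y, g in zip(scores, outcomes, groups):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        cell = counts[g][b]
        cell[0] += y
        cell[1] += 1
    return {
        g: {b: (pos / tot, tot) for b, (pos, tot) in bins.items()}
        for g, bins in counts.items()
    }


def calibration_gap(table, group_a, group_b):
    """Count-weighted mean absolute difference in positive rate over shared bins.

    Assumption of this sketch: bins observed for only one group are skipped.
    Returns None when the two groups share no bin.
    """
    shared = set(table[group_a]) & set(table[group_b])
    if not shared:
        return None
    num = den = 0.0
    for b in shared:
        rate_a, n_a = table[group_a][b]
        rate_b, n_b = table[group_b][b]
        weight = n_a + n_b
        num += weight * abs(rate_a - rate_b)
        den += weight
    return num / den
```

A gap of 0 means that, within every shared score bin, both groups show the same observed success rate, which is what sufficiency asks for. Skipping bins that only one group populates is one possible way to handle unbalanced data, and it discards information; other treatments are equally defensible.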
ISBN: 978-3-032-17603-5; 978-3-032-17604-2
Published in: Artificial Intelligence with and for Learning Sciences. Past, Present, and Future Horizons
Pages: 183-198
Files in this record:
File: Wails_2025_camera_ready.pdf
Access: open access
Description: Operationalizing Calibration for Fair Educational Artificial Intelligence
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 426.71 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1440495