Design, framework and benchmark of safety monitors for black-box classifiers / Khokhar, Fahad Ahmed; Zoppi, Tommaso; Cennini, Luigi; Ceccarelli, Andrea; Bondavalli, Andrea. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - Electronic. - (2026), pp. 0-0. [10.1038/s41598-026-45091-2]

Design, framework and benchmark of safety monitors for black-box classifiers

Khokhar, Fahad Ahmed;Zoppi, Tommaso;Cennini, Luigi;Ceccarelli, Andrea;Bondavalli, Andrea
2026

Abstract

The last decade has seen growing opportunities to develop autonomous systems (e.g., fully autonomous driving, computer-guided robotic surgery, mobile robots for inspection and surveillance, and manufacturing robots) that relieve the human workforce of time-consuming, laborious, and often hazardous tasks. These systems require trustworthy, autonomous classification processes to deliver their intended functionality without harming people, infrastructure, or the environment, and without causing financial losses. Unfortunately, even human experts cannot always act as trustworthy classifiers, let alone computer-guided decision-makers based on machine learning (ML). We argue that striving for correct software alone leads to a dead end: instead, we should focus on minimizing wrong outputs (misclassifications, in the case of classifiers) by suspecting prediction errors and handling them via tailored software architectures. To this end, this paper proposes a software architecture that monitors a black-box software component and triggers a rejection of its output whenever specific uncertainty conditions are detected through an ensemble of continuously computed uncertainty measures. This Safety wraPper thROugh ensembles of UncertainTy measures (SPROUT) is designed to adapt to any classification task, binary or multiclass, on tabular or image data. Experimental results across a wide range of datasets, classifiers, and parameter setups show that our approach consistently rejects a significant portion of misclassifications, and in specific cases suspects all incorrect predictions. To ease adoption, SPROUT is available as an open-source library with pre-trained models and files for running the case studies.
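
The wrapper idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SPROUT library's API and not the authors' method: the class name `RejectingWrapper`, the three uncertainty measures, the thresholds, and the simple vote used to combine them are all assumptions chosen for illustration. The only thing assumed about the black-box classifier is that it exposes a function returning class probabilities.

```python
import numpy as np

def max_prob_uncertainty(proba):
    """1 - highest class probability; 0 means fully confident."""
    return 1.0 - proba.max(axis=1)

def entropy_uncertainty(proba):
    """Shannon entropy, normalised to [0, 1] by log(number of classes)."""
    p = np.clip(proba, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1) / np.log(proba.shape[1])

def margin_uncertainty(proba):
    """1 - gap between the two most probable classes."""
    top2 = np.sort(proba, axis=1)[:, -2:]
    return 1.0 - (top2[:, 1] - top2[:, 0])

class RejectingWrapper:
    """Hypothetical safety wrapper: flags a prediction for rejection when
    at least `min_votes` uncertainty measures exceed their thresholds."""

    def __init__(self, predict_proba, measures, thresholds, min_votes=2):
        self.predict_proba = predict_proba  # black-box: X -> (n, n_classes)
        self.measures = measures
        self.thresholds = thresholds
        self.min_votes = min_votes

    def predict(self, X):
        proba = np.asarray(self.predict_proba(X), dtype=float)
        labels = proba.argmax(axis=1)
        # Each measure casts one vote per sample when it exceeds its threshold.
        votes = sum((m(proba) > t).astype(int)
                    for m, t in zip(self.measures, self.thresholds))
        rejected = votes >= self.min_votes
        return labels, rejected

# Stand-in black box for the demo: the inputs are already class probabilities.
wrapper = RejectingWrapper(
    predict_proba=lambda X: X,
    measures=[max_prob_uncertainty, entropy_uncertainty, margin_uncertainty],
    thresholds=[0.5, 0.5, 0.5],  # illustrative values, not tuned
)
demo = np.array([[0.98, 0.01, 0.01],   # confident prediction
                 [0.40, 0.35, 0.25]])  # ambiguous prediction
labels, rejected = wrapper.predict(demo)
```

In this sketch the first sample is accepted and the second is flagged for rejection, since all three measures exceed their thresholds on it. A downstream system would then handle the rejected output separately, for example by falling back to a safe action, instead of acting on the predicted label.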
Files in this record:
s41598-026-45091-2_reference-1.pdf
Access: open access
Type: Final refereed version (Postprint, Accepted manuscript)
License: Open Access
Size: 2.43 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1464695