On the evaluation measures for machine learning algorithms for safety-critical systems / Gharib M.; Bondavalli A. - Electronic. - (2019), pp. 141-144. (15th European Dependable Computing Conference, EDCC 2019, Italy, 2019) [10.1109/EDCC.2019.00035].

On the evaluation measures for machine learning algorithms for safety-critical systems

Gharib M.; Bondavalli A.
2019

Abstract

The ability of Machine Learning (ML) algorithms to learn and work with incomplete knowledge has motivated many system manufacturers to include such algorithms in their products. However, some of these systems can be described as Safety-Critical Systems (SCS), since their failure may cause injury or even death to humans. Therefore, the performance of ML algorithms with respect to the safety requirements of such systems must be evaluated before they are deployed in their operational environment. Although several measures exist for evaluating the performance of ML algorithms, most of them focus on properties of interest in the domains where they were developed. For example, Recall, Precision and F-Factor are typically used in the Information Retrieval (IR) domain, and they focus mainly on correct predictions, with less emphasis on incorrect predictions, which are very important in SCS. Accordingly, such measures need to be tuned to fit the needs of evaluating the safe performance of ML algorithms. This position paper presents the authors' view on the inadequacy of existing measures and proposes a new set of measures for evaluating the safe performance of ML algorithms.
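The abstract's point about Recall, Precision and F-Factor can be illustrated with a minimal sketch. It uses the standard textbook definitions and is not the set of measures proposed in the paper. Several things here are assumptions made for illustration only: that "F-Factor" corresponds to the balanced F-measure (F1), that a false negative stands for a missed hazard, and the names `Confusion`, `classifier_a` and `classifier_b` together with all the counts.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary confusion matrix (hypothetical data)."""
    tp: int  # hazards correctly detected
    fp: int  # false alarms
    fn: int  # hazards missed (assumed to be the unsafe error)
    tn: int  # correct "no hazard" predictions


def precision(c: Confusion) -> float:
    # Fraction of positive predictions that were correct.
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    # Fraction of actual positives that were detected.
    return c.tp / (c.tp + c.fn)


def f_measure(c: Confusion) -> float:
    # Balanced F-measure (F1): harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


# Two hypothetical classifiers whose false positives and false negatives
# are swapped: B raises fewer false alarms but misses twice as many hazards.
classifier_a = Confusion(tp=80, fp=20, fn=10, tn=890)
classifier_b = Confusion(tp=80, fp=10, fn=20, tn=890)

for name, c in (("A", classifier_a), ("B", classifier_b)):
    print(f"{name}: precision={precision(c):.3f} "
          f"recall={recall(c):.3f} F={f_measure(c):.3f} missed={c.fn}")

# The F-measure cannot tell the two apart, although their missed-hazard
# counts differ by a factor of two.
print("same F-measure:",
      math.isclose(f_measure(classifier_a), f_measure(classifier_b)))
```

Under these assumptions both classifiers receive the same F-measure because the formula treats false positives and false negatives symmetrically, and none of the three measures uses the true-negative count at all. This is one way to read the claim that such measures place less emphasis on incorrect predictions than a safety evaluation would need.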
Proceedings - 2019 15th European Dependable Computing Conference, EDCC 2019
15th European Dependable Computing Conference, EDCC 2019, Italy, 2019
Goal 9: Industry, Innovation, and Infrastructure
Files in this item:
CR-EDCC19.pdf (open access)
Description: Authors' version
Type: Publisher's PDF (Version of record)
License: Creative Commons
Size: 86.23 kB
Format: Adobe PDF

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1211629