On the evaluation measures for machine learning algorithms for safety-critical systems

Gharib, M.; Bondavalli, A.

doi:10.1109/EDCC.2019.00035

The ability of Machine Learning (ML) algorithms to learn and work with incomplete knowledge has motivated many system manufacturers to include such algorithms in their products. However, some of these systems are Safety-Critical Systems (SCS), since their failure may cause injury or even death to humans. Therefore, the performance of ML algorithms with respect to the safety requirements of such systems must be evaluated before they are deployed in their operational environment. Although several measures exist for evaluating the performance of ML algorithms, most of them focus on the properties of interest in the domains where they were developed. For example, Recall, Precision and F-Factor are usually used in the Information Retrieval (IR) domain, and they focus mainly on correct predictions, with less emphasis on incorrect predictions, which are very important in SCS. Accordingly, such measures need to be tuned to fit the needs of evaluating the safe performance of ML algorithms. This position paper presents the authors' view on the inadequacy of existing measures and proposes a new set of measures for evaluating the safe performance of ML algorithms.
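The remark about incorrect predictions can be illustrated with a minimal sketch. It uses the standard confusion-matrix definitions of Precision, Recall and the F1 score, not the measures proposed in the paper. The `Confusion` class, the two hypothetical classifiers and all counts are assumptions made up for illustration. The two classifiers differ only in which kind of error dominates, missed detections or false alarms, yet they receive the same F1 score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary classifier (hypothetical data)."""
    tp: int  # true positives
    fp: int  # false positives (false alarms)
    fn: int  # false negatives (missed detections)
    tn: int  # true negatives


def precision(c: Confusion) -> float:
    # Fraction of positive predictions that are correct.
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    # Fraction of actual positives that are detected.
    return c.tp / (c.tp + c.fn)


def f1(c: Confusion) -> float:
    # Harmonic mean of precision and recall, written in terms of counts.
    # FP and FN enter symmetrically, and TN does not appear at all.
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


# Illustrative counts only: classifier A mostly raises false alarms,
# classifier B mostly misses real positives.
many_false_alarms = Confusion(tp=80, fp=20, fn=5, tn=895)
many_misses = Confusion(tp=80, fp=5, fn=20, tn=895)

for name, c in [("A", many_false_alarms), ("B", many_misses)]:
    print(f"{name}: precision={precision(c):.3f} "
          f"recall={recall(c):.3f} f1={f1(c):.3f} missed={c.fn}")
```

In a setting where a missed detection is the hazardous outcome, classifier B would be considerably less safe than classifier A, although the F1 score does not distinguish between them.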

On the evaluation measures for machine learning algorithms for safety-critical systems / Gharib M., Bondavalli A. - Electronic. - (2019), pp. 141-144. (15th European Dependable Computing Conference, EDCC 2019, Italy, 2019) [10.1109/EDCC.2019.00035].