Trust in Robot Benchmarking and Benchmarking for Trustworthy Robots

Thoduka, Santosh; Nair, Deebul; Caleb-Solly, Praminda; Dragone, Mauro; Cavallo, Filippo; Hochgeschwender, Nico

doi:10.1007/978-3-031-55817-7_3

Trustworthy evaluation of robots is necessary for them to be deployed and accepted in society. Scientific benchmarking competitions provide a way to evaluate robots outside of lab conditions. We propose a progressive and iterative benchmarking process through competitions that incorporates objective dataset-based evaluation, evaluation on a remote robot, and field evaluations for individual robot functionalities and complete tasks. The process is cyclical, similar to the machine learning lifecycle, and is aimed at achieving trustworthy evaluation. Including out-of-distribution data, failure scenarios, and user studies in the benchmarking process addresses the need to evaluate robot systems on non-functional qualities such as fault tolerance, adaptability, and social acceptance, in addition to their functional abilities, in order to improve trustworthiness.

Trust in Robot Benchmarking and Benchmarking for Trustworthy Robots / Thoduka, S., Nair, D., Caleb-Solly, P., Dragone, M., Cavallo, F., Hochgeschwender, N. - Electronic. - 1150 (2024), pp. 31-52. [10.1007/978-3-031-55817-7_3]