The intensive use of personal protective equipment often requires increased voice intensity, with the possible development of voice disorders. This paper exploits machine learning approaches to investigate the impact of different types of masks on the sustained vowels /a/, /i/, and /u/ and the sequence /a'jw/ inside a standardized sentence. Both objective acoustical parameters and subjective ratings were used in statistical analysis, multiple comparisons, and multivariate machine learning classification experiments. Significant differences were found between the mask+shield configuration and no-mask, and between the mask and mask+shield conditions. Power spectral density decreases significantly above 1.5 kHz when wearing masks. Subjective ratings confirmed increasing discomfort from the no-mask condition to protective masks and shield. Machine learning techniques proved that masks alter voice production: in a multiclass experiment, random forest (RF) models were able to distinguish among seven mask conditions with up to 94% validation accuracy, separating masked from unmasked conditions with up to 100% validation accuracy and detecting the presence of a shield with up to 86% validation accuracy. Moreover, an RF classifier allowed distinguishing male from female subjects in masked conditions with 100% validation accuracy. Combining acoustic and perceptual analysis represents a robust approach to characterize mask configurations and quantify the corresponding level of discomfort.
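The random forest classification experiments summarized above can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the feature values are synthetic stand-ins (the real study used acoustic parameters such as power spectral density above 1.5 kHz, extracted from recorded vowels), and the masked/unmasked separation is simulated.

```python
# Minimal sketch of a masked-vs-unmasked RF classifier with cross-validation.
# Synthetic data only: masked samples get attenuated "high-frequency energy"
# in the first feature, mimicking the PSD drop above 1.5 kHz reported here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60  # hypothetical number of recordings per condition

# Columns stand in for acoustic features (e.g., high-band PSD, F0, formants).
unmasked = rng.normal(loc=[0.0, 0.0, 0.0], scale=1.0, size=(n, 3))
masked = rng.normal(loc=[-2.0, 0.0, 0.0], scale=1.0, size=(n, 3))
X = np.vstack([unmasked, masked])
y = np.array([0] * n + [1] * n)  # 0 = no mask, 1 = mask

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

The same pattern extends to the multiclass case (seven mask conditions) simply by giving `y` more label values; `RandomForestClassifier` handles multiclass targets natively.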
Speaking with mask in the COVID-19 era: Multiclass machine learning classification of acoustic and perceptual parameters / Calà, F; Manfredi, C; Battilocchi, L; Frassineti, L; Cantarella, G. - In: THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA. - ISSN 0001-4966. - STAMPA. - 153:(2023), pp. 1204-1218. [10.1121/10.0017244]
Speaking with mask in the COVID-19 era: Multiclass machine learning classification of acoustic and perceptual parameters
Calà, F (Methodology); Manfredi, C (Conceptualization); Frassineti, L (Methodology)
2023
| File | Size | Format |
|---|---|---|
| 1204_1_10.0017244.pdf (open access; editorial PDF, Version of record; License: Open Access) | 2.94 MB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.