A Machine Learning approach to the classification of chemo-structural determinants in label-free SERS detection of proteins / Barucci, A; D'Andrea, C; Farnesi, E; Banchelli, M; Amicucci, C; De Angelis, M; Marzi, C; Pini, R; Hwang, B; Matteini, P. - Print. - (2022), pp. 1-4. (Paper presented at the Italian conference on optics and photonics, ICOP 2022) [10.1109/ICOP56156.2022.9911735].

A Machine Learning approach to the classification of chemo-structural determinants in label-free SERS detection of proteins

Marzi, C;
2022

Abstract

Establishing standardized methods for the consistent analysis of spectral data remains a largely underexplored aspect of surface-enhanced Raman spectroscopy (SERS), particularly when applied to biological and biomedical research. Here we propose a Machine Learning (ML) based approach for the classification of protein species. Principal Component Analysis (PCA), t-distributed stochastic neighbour embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) were used for dimensionality reduction, along with supervised and unsupervised methods, to quantify how closely the SERS spectral profiles of different species (Albumin from bovine serum, Albumin from human serum, Lysozyme, Human holo-transferrin, Human apo-transferrin) resemble one another. In particular, ML algorithms such as Support Vector Machine, K-Nearest Neighbours, Linear Discriminant Analysis and the unsupervised K-means were applied to the original SERS spectra and to the multipeak-fitted SERS spectra, respectively. This strategy provides both a fast, full and successful discrimination of the proteins and a thorough characterization of the chemo-structural differences among them, ultimately opening up new routes for the evolution of SERS toward sensing applications and diagnostics of interest in the life sciences.
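The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn and NumPy, uses purely synthetic spectra (sums of Gaussian bands at made-up positions), and the class labels, the Raman shift axis, the number of principal components and all classifier settings are hypothetical choices. Only PCA is shown for dimensionality reduction; t-SNE, UMAP and the multipeak fitting step are left out.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Labels only; the spectra generated below are synthetic, not measured data.
PROTEINS = ["BSA", "HSA", "Lysozyme", "holo-Tf", "apo-Tf"]
N_PER_CLASS = 40
shifts = np.linspace(400, 1800, 300)  # hypothetical Raman shift axis (cm^-1)


def synthetic_spectrum(centres):
    """Sum of Gaussian bands plus noise, standing in for a SERS spectrum."""
    bands = sum(np.exp(-0.5 * ((shifts - c) / 15.0) ** 2) for c in centres)
    return bands + rng.normal(scale=0.05, size=shifts.size)


# Each class gets its own made-up set of four band positions.
centres = [rng.uniform(500, 1700, size=4) for _ in PROTEINS]
X = np.array([synthetic_spectrum(c) for c in centres for _ in range(N_PER_CLASS)])
y = np.repeat(np.arange(len(PROTEINS)), N_PER_CLASS)

# Supervised: PCA followed by each classifier, scored by 5-fold cross-validation.
classifiers = {
    "SVM": SVC(kernel="linear"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "LDA": LinearDiscriminantAnalysis(),
}
scores = {}
for name, clf in classifiers.items():
    pipe = make_pipeline(PCA(n_components=10), clf)
    scores[name] = cross_val_score(pipe, X, y, cv=5).mean()

# Unsupervised: K-means on the PCA scores, compared with the true labels.
embedding = PCA(n_components=10).fit_transform(X)
clusters = KMeans(n_clusters=len(PROTEINS), n_init=10, random_state=0).fit_predict(embedding)
ari = adjusted_rand_score(y, clusters)

for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.2f}")
print(f"K-means adjusted Rand index = {ari:.2f}")
```

Placing PCA inside the pipeline means it is refitted on each training fold, so the cross-validated accuracy is not inflated by information from the held-out spectra. The adjusted Rand index is one possible way to compare K-means clusters with known labels; the source does not state which metric was used.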
2022
Italian conference on optics and photonics, ICOP 2022
Barucci, A; D'Andrea, C; Farnesi, E; Banchelli, M; Amicucci, C; De Angelis, M; Marzi, C; Pini, R; Hwang, B; Matteini, P

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1308701