Validity of Machine Learning in Predicting Giant Cell Arteritis Flare After Glucocorticoids Tapering / Venerito, Vincenzo; Emmi, Giacomo; Cantarini, Luca; Leccese, Pietro; Fornaro, Marco; Fabiani, Claudia; Lascaro, Nancy; Coladonato, Laura; Mattioli, Irene; Righetti, Giulia; Malandrino, Danilo; Tangaro, Sabina; Palermo, Adalgisa; Urban, Maria Letizia; Conticini, Edoardo; Frediani, Bruno; Iannone, Florenzo; Lopalco, Giuseppe. - In: FRONTIERS IN IMMUNOLOGY. - ISSN 1664-3224. - Electronic. - 13:(2022), pp. 1-1. [10.3389/fimmu.2022.860877]
Validity of Machine Learning in Predicting Giant Cell Arteritis Flare After Glucocorticoids Tapering
Emmi, Giacomo;Mattioli, Irene;Malandrino, Danilo;Palermo, Adalgisa;Urban, Maria Letizia;
2022
Abstract
Background: Inferential statistical methods have failed to identify reliable biomarkers and risk factors for relapsing giant cell arteritis (GCA) after glucocorticoids (GCs) tapering. A machine learning (ML) approach can handle complex non-linear relationships between patient attributes that are hard to model with traditional statistical methods, merging them to output a forecast or a probability for a given outcome.

Objective: The objective of the study was to assess whether ML algorithms can predict GCA relapse after GCs tapering.

Methods: GCA patients who underwent GCs therapy and regular follow-up visits for at least 12 months were retrospectively analyzed, and their data were used to implement 3 ML algorithms, namely Logistic Regression (LR), Decision Tree (DT), and Random Forest (RF). The outcome of interest was disease relapse within 3 months during GCs tapering. After an ML variable selection method based on an XGBoost wrapper, an attribute core set was used to train and test each algorithm using 5-fold cross-validation. The performance of each algorithm in both phases was assessed in terms of accuracy and area under the receiver operating characteristic curve (AUROC).

Results: The dataset consisted of 107 GCA patients (73 women, 68.2%) with a mean age (± SD) of 74.1 (± 8.5) years at presentation. GCA flare occurred in 40/107 patients (37.4%) within 3 months after GCs tapering. As a result of the ML wrapper, the attribute core set with the smallest number of variables used for algorithm training included presence/absence of diabetes mellitus and of concomitant polymyalgia rheumatica, as well as erythrocyte sedimentation rate level at GCs baseline. RF showed the best performance, being significantly superior to the other algorithms in accuracy (RF 71.4% vs LR 70.4% vs DT 62.9%). Consistently, RF precision (72.1%) was significantly greater than that of LR (62.6%) and DT (50.8%). Conversely, LR was superior to RF and DT in recall (RF 60% vs LR 62.5% vs DT 47.5%). Moreover, RF AUROC (0.76) was higher than that of LR (0.73) and DT (0.65).

Conclusions: The RF algorithm can predict GCA relapse after GCs tapering with sufficient accuracy. To date, this is one of the most accurate predictive models for this outcome. This ML method represents a reproducible tool capable of supporting clinicians in GCA patient management.

File | Size | Format | |
---|---|---|---|
fimmu-13-860877.pdf (open access; type: publisher's PDF, version of record; license: Open Access) | 1.18 MB | Adobe PDF | |
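The evaluation protocol named in the abstract (5-fold cross-validation scored by accuracy, precision, recall, and AUROC) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the abstract does not state which software was used, the function names `stratified_folds`, `classification_metrics`, and `auroc` are hypothetical, the use of stratification is an assumption, and the toy labels and scores are invented for illustration. Only the cohort counts (107 patients, 40 flares) come from the abstract.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = flare, 0 = no flare)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def auroc(y_true, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort sizes from the abstract: 40 flares among 107 patients.
# The ordering of the labels is a placeholder, not patient data.
cohort_labels = [1] * 40 + [0] * 67
cohort_folds = stratified_folds(cohort_labels, k=5, seed=0)
print([len(f) for f in cohort_folds])  # [22, 22, 21, 21, 21]

# Invented toy predictions, thresholded at 0.5, to show the metric calls.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]
print(classification_metrics(toy_true, toy_pred))
print(auroc(toy_true, toy_scores))
```

With 40 flares and 5 folds, a stratified split places 8 flares in every test fold, so per-fold recall moves in steps of 12.5 percentage points. That granularity is worth keeping in mind when comparing algorithms whose reported metrics differ by only a few points.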
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.