Disease-specific variant pathogenicity prediction significantly improves variant interpretation in inherited cardiac conditions / Zhang X.; Walsh R.; Whiffin N.; Buchan R.; Midwinter W.; Wilk A.; Govind R.; Li N.; Ahmad M.; Mazzarotto F.; Roberts A.; Theotokis P.I.; Mazaika E.; Allouba M.; de Marvao A.; Pua C.J.; Day S.M.; Ashley E.; Colan S.D.; Michels M.; Pereira A.C.; Jacoby D.; Ho C.Y.; Olivotto I.; Gunnarsson G.T.; Jefferies J.L.; Semsarian C.; Ingles J.; O'Regan D.P.; Aguib Y.; Yacoub M.H.; Cook S.A.; Barton P.J.R.; Bottolo L.; Ware J.S. - In: GENETICS IN MEDICINE. - ISSN 1098-3600. - Print. - 23:(2021), pp. 69-79. [10.1038/s41436-020-00972-3]

Disease-specific variant pathogenicity prediction significantly improves variant interpretation in inherited cardiac conditions

Mazzarotto F.;Olivotto I.;Yacoub M. H.;
2021

Abstract

Purpose: Accurate discrimination of benign and pathogenic rare variation remains a priority for clinical genome interpretation. State-of-the-art machine learning variant prioritization tools are imprecise and ignore important parameters defining gene–disease relationships, e.g., distinct consequences of gain-of-function versus loss-of-function variants. We hypothesized that incorporating disease-specific information would improve tool performance.

Methods: We developed a disease-specific variant classifier, CardioBoost, that estimates the probability of pathogenicity for rare missense variants in inherited cardiomyopathies and arrhythmias. We assessed CardioBoost’s ability to discriminate known pathogenic from benign variants, prioritize disease-associated variants, and stratify patient outcomes.

Results: CardioBoost has high global discrimination accuracy (precision-recall area under the curve [AUC] 0.91 for cardiomyopathies; 0.96 for arrhythmias), outperforming existing tools (4–24% improvement). CardioBoost obtains excellent accuracy (cardiomyopathies 90.2%; arrhythmias 91.9%) for variants classified with >90% confidence, and increases the proportion of variants classified with high confidence more than twofold compared with existing tools. Variants classified as disease-causing are associated with both disease status and clinical severity, including a 21% increased risk (95% confidence interval [CI] 11–29%) of severe adverse outcomes by age 60 in patients with hypertrophic cardiomyopathy.

Conclusions: A disease-specific variant classifier outperforms state-of-the-art genome-wide tools for rare missense variants in inherited cardiac conditions (https://www.cardiodb.org/cardioboost/), highlighting broad opportunities for improved pathogenicity prediction through disease specificity.
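The abstract reports two related quantities: the proportion of variants classified with high confidence, and the accuracy among those variants. The sketch below shows one possible reading of these metrics and is not taken from the CardioBoost code. It assumes that a variant counts as a high-confidence call when its predicted probability of pathogenicity is above 0.9 (pathogenic) or below 0.1 (benign). The function name `high_confidence_summary`, the symmetric threshold, and the toy data are all hypothetical.

```python
from typing import Sequence, Tuple


def high_confidence_summary(
    probs: Sequence[float],
    labels: Sequence[int],
    threshold: float = 0.9,  # assumed cut-off for a "high-confidence" call
) -> Tuple[float, float]:
    """Return (proportion of high-confidence calls, accuracy among them).

    probs  : predicted probability of pathogenicity per variant
    labels : 1 for known pathogenic, 0 for known benign
    """
    confident = 0
    correct = 0
    for p, y in zip(probs, labels):
        if p > threshold:
            call = 1          # confidently pathogenic
        elif p < 1 - threshold:
            call = 0          # confidently benign
        else:
            continue          # indeterminate: not counted as classified
        confident += 1
        correct += int(call == y)
    proportion = confident / len(probs) if probs else 0.0
    accuracy = correct / confident if confident else float("nan")
    return proportion, accuracy


# Toy data for illustration only, not from the study.
toy_probs = [0.97, 0.95, 0.60, 0.08, 0.03, 0.92, 0.45, 0.05]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
proportion, accuracy = high_confidence_summary(toy_probs, toy_labels)
print(f"high-confidence proportion: {proportion:.2f}, accuracy: {accuracy:.2f}")
```

Under this reading, a tool can improve on either axis independently: it can make more high-confidence calls at the same accuracy, or the same number of calls at higher accuracy.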
Year: 2021
Volume: 23
Pages: 69-79
Files in this item:
File: s41436-020-00972-3.pdf
Access: Closed access
Type: Publisher's PDF (Version of record)
License: All rights reserved
Size: 943.61 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1261178