The discovery of genomic structural variants (SVs), such as copy number variants (CNVs), is essential to understand genetic variation of human populations and complex diseases. Over recent years, the advent of new high-throughput sequencing (HTS) platforms has opened many opportunities for SVs discovery, and a very promising approach consists in measuring the depth of coverage (DOC) of reads aligned to the human reference genome. At present, few computational methods have been developed for the analysis of DOC data and all of these methods allow to analyse only one sample at time. For these reasons, we developed a novel algorithm (JointSLM) that allows to detect common CNVs among individuals by analysing DOC data from multiple samples simultaneously. We test JointSLM performance on synthetic and real data and we show its unprecedented resolution that enables the detection of recurrent CNV regions as small as 500 bp in size. When we apply JointSLM to analyse chromosome one of eight genomes with different ancestry, we identify 3000 regions with recurrent CNVs of different frequency and size: hierarchical clustering on these regions segregates the eight individuals in two groups that reflect their ancestry, demonstrating the potential utility of JointSLM for population genetics studies.

Detecting common copy number variants in high-throughput sequencing data by using JointSLM algorithm / A. Magi;M. Benelli;S. Yoon;F. Roviello;F. Torricelli. - In: NUCLEIC ACIDS RESEARCH. - ISSN 0305-1048. - ELETTRONICO. - 39(10):e65:(2011), pp. 0-0. [10.1093/nar/gkr068]

Detecting common copy number variants in high-throughput sequencing data by using JointSLM algorithm.

MAGI, ALBERTO;BENELLI, MATTEO;TORRICELLI, FRANCESCA
2011

Abstract

The discovery of genomic structural variants (SVs), such as copy number variants (CNVs), is essential to understand genetic variation of human populations and complex diseases. Over recent years, the advent of new high-throughput sequencing (HTS) platforms has opened many opportunities for SVs discovery, and a very promising approach consists in measuring the depth of coverage (DOC) of reads aligned to the human reference genome. At present, few computational methods have been developed for the analysis of DOC data and all of these methods allow to analyse only one sample at time. For these reasons, we developed a novel algorithm (JointSLM) that allows to detect common CNVs among individuals by analysing DOC data from multiple samples simultaneously. We test JointSLM performance on synthetic and real data and we show its unprecedented resolution that enables the detection of recurrent CNV regions as small as 500 bp in size. When we apply JointSLM to analyse chromosome one of eight genomes with different ancestry, we identify 3000 regions with recurrent CNVs of different frequency and size: hierarchical clustering on these regions segregates the eight individuals in two groups that reflect their ancestry, demonstrating the potential utility of JointSLM for population genetics studies.
2011
39(10):e65
0
0
A. Magi;M. Benelli;S. Yoon;F. Roviello;F. Torricelli
File in questo prodotto:
File Dimensione Formato  
Magi_NAR_JointSLM_2011.pdf

accesso aperto

Descrizione: Articolo principale
Tipologia: Pdf editoriale (Version of record)
Licenza: Open Access
Dimensione 551.14 kB
Formato Adobe PDF
551.14 kB Adobe PDF

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/430460
Citazioni
  • ???jsp.display-item.citation.pmc??? 34
  • Scopus 57
  • ???jsp.display-item.citation.isi??? 53
social impact