Multi-platform​ ​ discovery​ ​ of​ ​ haplotype-resolved structural​ ​ variation​ ​ in​ ​ human​ ​ genomes

Research output: Contribution to journalArticleAcademic

3 Downloads (Pure)

Abstract

The incomplete identification of structural variants from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long- and short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,181 indel variants (<50 bp) and 31,599 structural variants (≥50 bp) per human genome, a seven fold increase in structural variation compared to previous reports, including from the 1000 Genomes Project. We also discovered 156 inversions per genome, most of which previously escaped detection, as well as large unbalanced chromosomal rearrangements. We provide near-complete, haplotype-resolved structural variation for three genomes that can now be used as a gold standard for the scientific community and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.
Original languageEnglish
Number of pages23
JournalbioRxiv
DOIs
Publication statusPublished - 23-Sep-2017

Cite this