Improved imputation quality of low-frequency and rare variants in European samples using the 'Genome of The Netherlands'

Patrick Deelen, Androniki Menelaou, Elisabeth M. van Leeuwen, Alexandros Kanterakis, Freerk van Dijk, Carolina Medina-Gomez, Laurent C. Francioli, Jouke Jan Hottenga, Lennart C. Karssen, Karol Estrada, Eskil Kreiner-Moller, Fernando Rivadeneira, Jessica van Setten, Javier Gutierrez-Achury, Harm-Jan Westra, Lude Franke, David van Enckevort, Martijn Dijkstra, Heorhiy Byelas, Cornelia M. van Duijn, Paul I. W. de Bakker, Cisca Wijmenga, Morris A. Swertz*, Genome Netherlands Consortium

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputed genotypes with 'true' genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05-0.5%) compared with 1000G. In Dutch samples, the mean observed Pearson correlation, r², increased from 0.61 to 0.71. We also saw improved imputation accuracy for other European populations (in the British samples, r² improved from 0.58 to 0.65, and in the Italians from 0.43 to 0.47). A combined reference set comprising 1000G and GoNL improved the imputation of rare variants even further. The Italian samples benefitted the most from this combined reference (the mean r² increased from 0.47 to 0.50). We conclude that the creation of a large population-specific reference is advantageous for imputing rare variants and that a combined reference panel across multiple populations yields the best imputation results.
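
The benchmark metric described in the abstract, a squared Pearson correlation between imputed and array-typed genotypes averaged over variants, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `squared_pearson` and `mean_r2` are hypothetical, and the sketch assumes genotypes are coded as allele counts (0, 1, 2) and imputed values as dosages on the same 0-2 scale.

```python
from math import isnan


def squared_pearson(true_genotypes, imputed_dosages):
    """Squared Pearson correlation (r^2) for one variant.

    true_genotypes: allele counts (0/1/2) from the genotyping array (assumed coding).
    imputed_dosages: imputed values on the same 0-2 scale (assumed coding).
    Returns NaN when either vector has no variance, since r^2 is undefined.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")
    return cov * cov / (var_t * var_d)


def mean_r2(variants):
    """Mean r^2 over (true, imputed) pairs, skipping variants where r^2 is undefined."""
    values = [squared_pearson(t, d) for t, d in variants]
    defined = [v for v in values if not isnan(v)]
    if not defined:
        raise ValueError("no variant with a defined r^2")
    return sum(defined) / len(defined)
```

In a comparison like the one reported, such a mean would be computed separately per reference panel and per minor-allele-frequency bin, so that rare variants (here MAF 0.05-0.5%) are not swamped by common ones.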

Original language: English
Pages (from-to): 1321-1326
Number of pages: 6
Journal: European Journal of Human Genetics
Volume: 22
Issue number: 11
Publication status: Published - Nov-2014

Keywords

  • genotype imputation
  • GWAS
  • GoNL
  • rare variants
  • reference sets
  • reference panel
  • GENOTYPE IMPUTATION
  • WIDE ASSOCIATION
  • DISEASE
  • COMMON
  • ARRAY
  • POWER
  • LOCI
