reGenotyper: Detecting mislabeled samples in genetic data

Konrad Zych, L. Basten Snoek, Mark Elvin, Miriam Rodriguez, K. Joeri van der Velde, Danny Arends, Harm-Jan Westra, Morris A. Swertz, Gino Poulin, Jan E. Kammenga, Rainer Breitling, Ritsert C. Jansen, Yang Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

10 Citations (Scopus)
237 Downloads (Pure)

Abstract

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the "ideal" genotype and identify "best-matched" labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a "data cleaning" step before standard data analysis.

Original language: English
Article number: e0171324
Number of pages: 11
Journal: PLoS ONE
Volume: 12
Issue number: 2
Publication status: Published - 13-Feb-2017

Keywords

  • GENOME-WIDE ASSOCIATION
  • NATURAL VARIATION DATA
  • C. ELEGANS
  • MIX-UPS
  • EXPRESSION
  • QTL
  • DISEASE
  • IDENTIFICATION
  • PERTURBATION
  • POPULATIONS
