Haplotype resolved genomes: Computational challenges and applications

David Porubský

    Research output: Thesis - Thesis fully internal (DIV)

    Genomes of diploid organisms, like humans, are organized in pairs of chromosomes, one inherited from the father and one from the mother. Each homologous chromosome harbors a specific set of parental alleles, called a haplotype. Unfortunately, obtaining haplotype information with current methods remains challenging. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel haplotyping approach able to separate parental alleles along the entire length of all chromosomes. We demonstrate this by building complete haplotypes for a HapMap individual (NA12878) at high accuracy (concordance 99.3%), without using generational information or statistical inference. Furthermore, we mapped all meiotic recombination events in a family trio with high resolution (median range ~14 kb), and phased larger structural variants such as deletions and indels, as well as balanced rearrangements such as inversions. The single-cell resolution of Strand-seq allowed us to observe loss-of-heterozygosity regions in a small number of cells, a significant advantage for studies of heterogeneous cell populations, such as cancer cells. Lastly, we show that integrating Strand-seq with other whole-genome sequencing methods brings a significant increase in haplotype completeness while reducing sequencing costs. The implementation of Strand-seq and our analysis pipeline provides a powerful, high-throughput approach to assembling haplotypes that will open up new possibilities for studying the diploid architecture of human genomes in health and disease.
    Original language: English
    Qualification: Doctor of Philosophy
    Awarding Institution
    • University of Groningen
    • Lansdorp, Peter, Supervisor
    • Guryev, Victor, Co-supervisor
    • Bevova, Marianna, Co-supervisor
    Award date: 27-Mar-2017
    Place of Publication: [Groningen]
    Print ISBNs: 978-90-367-9637-8
    Electronic ISBNs: 978-90-367-9636-1
    Publication status: Published - 2017
