Compression/decompression method and apparatus for genomic variant call data

Research output


Abstract

Methods and apparatus for compressing and decompressing genetic information from an individual are disclosed. In one arrangement, a data compression method generates a compressed representation of at least a portion of an individual's genome. The method comprises receiving an input file comprising a representation of the at least a portion of the individual's genome in the form of a sequence of variants defined relative to a reference genome. A reference database comprising a plurality of reference lists of genetic variants from other individuals is accessed. Each reference list comprises a sequence of genetic variants from a single, phased haplotype. Two mosaics of segments from the reference lists are identified which match the at least a portion of the individual's genome to within a threshold accuracy. Each mosaic represents a single one of the two haplotypes of the individual's genome in the at least a portion of the individual's genome for which the compressed representation is to be generated. Each of the segments comprises a portion of the sequence of genetic variants from one of the reference lists. The compressed representation is generated by encoding the two mosaics and deviations from the two mosaics.
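The mosaic idea can be illustrated with a small sketch. This is not the patented method: the abstract does not say how the mosaics are found, so the greedy longest-match search below is an assumption, as are the binary allele vectors (1 = variant present at a site, 0 = absent) and the names `encode_haplotype` and `decode_haplotype`. The "threshold accuracy" is simplified to exact matching, with any site that no reference haplotype can reproduce stored as a deviation.

```python
from typing import List, Tuple

# (reference index, first site, one past the last site): hypothetical layout
Segment = Tuple[int, int, int]


def encode_haplotype(target: List[int], refs: List[List[int]]) -> Tuple[List[Segment], List[int]]:
    """Greedily cover `target` with segments copied from reference haplotypes.

    Returns the mosaic (list of segments) and the sites where the mosaic
    differs from the target (deviations).
    """
    n = len(target)
    segments: List[Segment] = []
    deviations: List[int] = []
    i = 0
    while i < n:
        best_ref, best_len = 0, 0
        for r, ref in enumerate(refs):
            j = i
            # Extend the match with this reference as far as it agrees.
            while j < n and ref[j] == target[j]:
                j += 1
            if j - i > best_len:
                best_ref, best_len = r, j - i
        if best_len == 0:
            # No reference agrees at site i: copy one site from reference 0
            # and record the site as a deviation.
            deviations.append(i)
            best_len = 1
        segments.append((best_ref, i, i + best_len))
        i += best_len
    return segments, deviations


def decode_haplotype(segments: List[Segment], deviations: List[int], refs: List[List[int]]) -> List[int]:
    """Rebuild the haplotype by copying the segments, then flipping deviations."""
    out: List[int] = []
    for r, start, end in segments:
        out.extend(refs[r][start:end])
    for site in deviations:
        out[site] ^= 1  # alleles are binary, so a deviation is a flip
    return out


# Toy reference panel: three phased haplotypes over six variant sites.
refs = [
    [0, 1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 0],
]
# The individual's two haplotypes are encoded independently, one mosaic each.
hap_a = [0, 1, 1, 0, 1, 1]
hap_b = [0, 0, 0, 0, 1, 1]
encoded = [encode_haplotype(h, refs) for h in (hap_a, hap_b)]
decoded = [decode_haplotype(segs, devs, refs) for segs, devs in encoded]
```

The saving comes from storing a handful of segment boundaries and reference indices in place of every variant call. A real encoder would presumably also need a more careful search for the mosaics and an entropy coder for the result, neither of which is shown here.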
Original language: English
Patent number: WO2017158330
Priority date: 15/03/2016
Filing date: 13/03/2017
Status: Published - 21 Sep 2017
