SVNN: An efficient PacBio-specific pipeline for structural variations calling using neural networks

Shaya Akbarinejad, Mostafa Hadadian Nejad Yousefi, Maziar Goudarzi*

*Corresponding author for this work

Research output: Academic, peer-reviewed

1 citation (Scopus)
62 downloads (Pure)

Abstract

Background: Once aligned, long reads can be a useful source of information for identifying the type and position of structural variations. However, because of the high sequencing error rate of long reads, long-read structural variation detection methods are far from precise in low-coverage cases. To be accurate, they need high-coverage data, which in turn results in an extremely time-consuming pipeline, especially in the alignment phase. It is therefore of utmost importance to have a structural variation calling pipeline that is both fast and precise for low-coverage data.

Results: In this paper, we present SVNN, a fast yet accurate structural variation calling pipeline for PacBio long reads that takes raw reads as input and detects structural variants larger than 50 bp. Our pipeline uses state-of-the-art long-read aligners, namely NGMLR and Minimap2, and structural variation callers, namely Sniffles and SVIM. We found that a neural network can use features extracted from Minimap2 output to detect the subset of reads that provide useful information for structural variation detection. By mapping only this subset with NGMLR, which is far slower than Minimap2 but better serves downstream structural variation detection, we can increase sensitivity efficiently. As a result of using multiple tools intelligently, SVNN improves sensitivity by up to 20 percentage points in comparison with state-of-the-art methods, and it is three times faster than a naive combination of state-of-the-art tools that achieves almost the same accuracy.

Conclusion: Because the prohibitive cost of high-coverage data has impeded long-read applications, SVNN provides users with a much faster structural variation detection platform for PacBio reads, with high precision and sensitivity in low-coverage scenarios.
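The read-routing idea in the Results (score each read from its fast first-pass alignment, then re-align only the promising reads with the slower aligner) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names, the weights, the threshold, and the single logistic unit standing in for the neural network are hypothetical, since the abstract does not specify the features or the network architecture that SVNN actually uses.

```python
import math
from dataclasses import dataclass


@dataclass
class ReadFeatures:
    """Hypothetical per-read features taken from a fast first-pass alignment."""
    read_id: str
    n_split_alignments: int   # number of additional (split) alignments of the read
    max_indel_len: int        # longest insertion or deletion in the alignment, in bp
    clipped_fraction: float   # fraction of the read left unaligned (clipped)


# Illustrative weights only (assumption); a real pipeline would learn them
# from labeled data instead of hard-coding them.
WEIGHTS = (1.2, 0.04, 3.0)
BIAS = -3.0


def sv_score(f: ReadFeatures) -> float:
    """Single logistic unit: a stand-in for the neural network, not SVNN's model."""
    z = (BIAS
         + WEIGHTS[0] * f.n_split_alignments
         + WEIGHTS[1] * f.max_indel_len
         + WEIGHTS[2] * f.clipped_fraction)
    return 1.0 / (1.0 + math.exp(-z))


def route_reads(features, threshold=0.5):
    """Split read IDs into (re-align with the slow aligner, keep fast alignment)."""
    realign, keep = [], []
    for f in features:
        (realign if sv_score(f) >= threshold else keep).append(f.read_id)
    return realign, keep


reads = [
    ReadFeatures("r1", 0, 3, 0.01),    # clean alignment, little SV evidence
    ReadFeatures("r2", 2, 10, 0.30),   # split and clipped read
    ReadFeatures("r3", 0, 120, 0.00),  # one long indel
]
realign, keep = route_reads(reads)
print("re-align:", realign)
print("keep:", keep)
```

In this sketch only the reads in `realign` would be passed to the slower aligner, which is where the time saving described in the abstract would come from; the remaining reads keep their first-pass alignments.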

Original language: English
Article number: 335
Number of pages: 17
Journal: BMC Bioinformatics
Volume: 22
Issue number: 1
Status: Published - 19 Jun 2021
