Lossy Compression of Quality Values in Next-Generation Sequencing Data

Authors

Suaste Morales, Veronica

Abstract

In this work we address the compression of SAM files, the standard output format for DNA assembly. Specifically, we study lossy compression techniques for the quality values reported in a SAM file and analyse their impact on the CRAM format and on the SNP calling process. Our results show that lossy techniques achieve a better compression ratio than that obtained with the original quality values, and that SNP calling performance is not negatively affected. Moreover, we confirm that some of the lossy techniques can even improve SNP calling performance.
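
To make the idea of lossy quality-value compression concrete, here is a minimal Python sketch of one generic approach, uniform binning of Phred scores. It is an illustration only and is not taken from this work: the function name `bin_qualities`, the bin width of 8, and the choice of a mid-bin representative are all assumptions. It also assumes the Phred+33 ASCII encoding used for the quality field of a SAM record.

```python
def bin_qualities(qual: str, width: int = 8, offset: int = 33) -> str:
    """Quantise a Phred+33 quality string into uniform bins.

    Hypothetical helper for illustration; `width` and the mid-bin
    representative are assumed values, not parameters from the thesis.
    """
    binned = []
    for ch in qual:
        q = ord(ch) - offset            # decode the Phred score
        low = (q // width) * width      # lower edge of the bin containing q
        rep = min(low + width // 2, 93) # mid-bin representative, capped at Phred 93
        binned.append(chr(rep + offset))
    return "".join(binned)


original = "IIIIHHGF#5"
lossy = bin_qualities(original)

# The string keeps its length, but uses no more distinct symbols than before,
# which is what makes it easier for a downstream entropy coder to compress.
print(len(set(original)), "distinct symbols before,", len(set(lossy)), "after")
```

The original scores cannot be recovered from the binned string, which is why the effect on downstream analyses such as SNP calling has to be evaluated separately.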