Minimizing Reference Bias with an Impute-First Approach.

Vaddadi, Naga Sai Kavya; Mun, Taher; Langmead, Ben.

bioRxiv ; 2023 Dec 02.

Article in English | MEDLINE | ID: mdl-38076784

ABSTRACT

Pangenome indexes reduce reference bias in sequencing data analysis. However, a greater reduction in bias can be achieved using a personalized reference, e.g. a diploid human reference constructed to match a donor individual's alleles. We present a novel impute-first alignment framework that combines elements of genotype imputation and pangenome alignment. It begins by genotyping the individual from a subsample of the input reads. It next uses a reference panel and an efficient imputation algorithm to impute a personalized diploid reference. Finally, it indexes the personalized reference and applies a read aligner, which could be a linear or graph aligner, to align the full read set to the personalized reference. This framework has higher variant-calling recall (99.54% vs. 99.37%), precision (99.36% vs. 99.18%), and F1 (99.45% vs. 99.28%) than a graph-based pangenome. The personalized reference is also smaller and faster to query than a pangenome index, making it an overall advantageous choice for whole-genome DNA sequencing experiments.
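
The F1 figures quoted above can be related to the precision and recall figures with the standard definition of F1 as the harmonic mean of the two. The following is a minimal sketch, not code from the paper; the function name `f1_score` is an illustrative choice, and the inputs are the rounded percentages given in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Percentages as reported in the abstract (already rounded to two decimals).
impute_first = f1_score(precision=99.36, recall=99.54)
graph_pangenome = f1_score(precision=99.18, recall=99.37)

print(f"impute-first F1:    {impute_first:.4f}%")
print(f"graph pangenome F1: {graph_pangenome:.4f}%")
```

Because the inputs are themselves rounded, the recomputed values can differ from the reported F1 in the last decimal place; they should agree only to within that rounding.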

Pangenomic genotyping with the marker array.

Mun, Taher; Vaddadi, Naga Sai Kavya; Langmead, Ben.

Algorithms Mol Biol ; 18(1): 2, 2023 May 05.

Article in English | MEDLINE | ID: mdl-37147657

ABSTRACT

We present a new method and software tool called rowbowt that applies a pangenome index to the problem of inferring genotypes from short-read sequencing data. The method uses a novel indexing structure called the marker array. Using the marker array, we can genotype variants with respect to large panels like the 1000 Genomes Project while reducing the reference bias that results when aligning to a single linear reference. rowbowt can infer accurate genotypes in less time and memory than existing graph-based methods. The method is implemented in the open source software tool rowbowt, available at https://github.com/alshai/rowbowt.

Pangenomic Genotyping with the Marker Array.

Mun, Taher; Vaddadi, Naga Sai Kavya; Langmead, Ben.

Algorithms Bioinform ; 242, 2022 Sep.

Article in English | MEDLINE | ID: mdl-36409181

ABSTRACT

We present a new method and software tool called rowbowt that applies a pangenome index to the problem of inferring genotypes from short-read sequencing data. The method uses a novel indexing structure called the marker array. Using the marker array, we can genotype variants with respect to large panels like the 1000 Genomes Project while avoiding the reference bias that results when aligning to a single linear reference. rowbowt can infer accurate genotypes in less time and memory than existing graph-based methods.
