1.
PLoS Genet ; 20(7): e1011092, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38959269

ABSTRACT

Haplotype estimation, or phasing, has gained significant traction in large-scale projects due to its valuable contributions to population genetics, variant analysis, and the creation of reference panels for imputation and phasing of new samples. To scale with the growing number of samples, haplotype estimation methods designed for population scale rely on highly optimized statistical models to phase genotype data, and usually ignore read-level information. Statistical methods excel at resolving common variants; however, they still struggle with rare variants due to the lack of statistical information. In this study we introduce SAPPHIRE, a new method that leverages whole-genome sequencing data to enhance the precision of haplotype calls produced by statistical phasing. SAPPHIRE achieves this by refining haplotype estimates through the realignment of sequencing reads, particularly targeting low-confidence phase calls. Our findings demonstrate that SAPPHIRE significantly enhances the accuracy of haplotypes obtained from state-of-the-art methods and also provides the subset of phase calls that are validated by sequencing reads. Finally, we show that our method scales to large data sets by its successful application to the extensive 3.6 petabytes of sequencing data of the last UK Biobank 200,031-sample release.


Subject(s)
Genetics, Population; Haplotypes; Whole Genome Sequencing; Whole Genome Sequencing/methods; Humans; Genetics, Population/methods; Genome, Human; Polymorphism, Single Nucleotide/genetics; Genome-Wide Association Study/methods; Algorithms
2.
Bioinform Adv ; 3(1): vbad021, 2023.
Article in English | MEDLINE | ID: mdl-36908398

ABSTRACT

Summary: The positional Burrows-Wheeler transform (PBWT) data structure allows for efficient haplotype data matching and compression. Its performance makes it a powerful tool for bioinformatics. However, existing algorithms do not exploit parallelism due to inner dependencies. We introduce a new method to break the dependencies and show how to fully exploit modern multi-core processors. Availability and implementation: Source code and applications are available at https://github.com/rwk-unil/parallel_pbwt. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
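For readers unfamiliar with the PBWT, the following is a minimal sketch of the standard sequential construction of the positional prefix arrays (after Durbin's original PBWT formulation), not the parallel method of this paper. The function name `pbwt_prefix_arrays` and the list-of-lists input layout are assumptions made for illustration. The ordering at each site is derived from the ordering at the previous site, which is presumably the kind of inner dependency the abstract refers to.

```python
def pbwt_prefix_arrays(haplotypes):
    """Return the positional prefix array after each site.

    haplotypes: list of equal-length lists of 0/1 alleles (hypothetical layout).
    The array after site k lists haplotype indices sorted by their
    reversed prefixes ending at site k.
    """
    n_haps = len(haplotypes)
    n_sites = len(haplotypes[0]) if n_haps else 0
    order = list(range(n_haps))  # initial order: input order
    arrays = []
    for k in range(n_sites):
        zeros, ones = [], []
        # Stable partition by the allele at site k; this step needs the
        # ordering from site k - 1, hence the sequential dependency.
        for h in order:
            (ones if haplotypes[h][k] else zeros).append(h)
        order = zeros + ones
        arrays.append(order)
    return arrays


if __name__ == "__main__":
    haps = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
    for k, a in enumerate(pbwt_prefix_arrays(haps)):
        print(k, a)
```

Because the partition is stable, haplotypes that share a long match ending at a site end up adjacent in that site's ordering, which is what makes the structure useful for matching and compression.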

3.
Bioinformatics ; 38(15): 3778-3784, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35748697

ABSTRACT

MOTIVATION: Generation of genotype data has been growing exponentially over the last decade. With the large size of recent datasets comes a storage and computational burden with ever-increasing costs. To reduce this burden, we propose XSI, a file format with a reduced storage footprint that also allows computation on the compressed data, and we show how this can improve future analyses. RESULTS: We show that xSqueezeIt (XSI) allows for a file size reduction of 4-20× compared with compressed BCF and demonstrate its potential for 'compressive genomics' on the UK Biobank whole-genome sequencing genotypes with 8× faster loading times, 5× faster run of homozygosity computation, 30× faster dot product computation and 280× faster allele counts. AVAILABILITY AND IMPLEMENTATION: The XSI file format specifications, API and command line tool are released under an open-source (MIT) license and are available at https://github.com/rwk-unil/xSqueezeIt. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Compression; Software; Biological Specimen Banks; Genomics; Genotype
4.
IEEE Trans Biomed Circuits Syst ; 15(4): 743-755, 2021 08.
Article in English | MEDLINE | ID: mdl-34280107

ABSTRACT

In this paper we present SpikeOnChip, a custom embedded platform for neuronal activity recording and online analysis. The SpikeOnChip platform was developed in the context of automated drug testing and toxicology assessments on neural tissue made from human induced pluripotent stem cells. The system was developed with the following goals: to be small, autonomous and low power, to handle micro-electrode arrays with up to 256 electrodes, to reduce the amount of data generated from the recording, to be able to do computation during acquisition, and to be customizable. This led to the choice of a Field Programmable Gate Array System-On-Chip platform. This paper focuses on the embedded system for acquisition and processing with key features being the ability to record electrophysiological signals from multiple electrodes, detect biological activity on all channels online for recording, and do frequency domain spectral energy analysis online on all channels during acquisition. Development methodologies are also presented. The platform is finally illustrated in a concrete experiment with bicuculline being administered to grown human neural tissue through microfluidics, resulting in measurable effects in the spike recordings and activity. The presented platform provides a valuable new experimental instrument that can be further extended thanks to the programmable hardware and software.
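The abstract does not state how activity is detected on each channel. As a rough illustration only, the sketch below shows one common approach to spike detection, amplitude thresholding against a robust noise estimate based on the median absolute deviation. The function name `detect_spikes` and the parameters `k` and `refractory` are hypothetical, and this is not presented as the SpikeOnChip implementation.

```python
import statistics


def detect_spikes(samples, k=5.0, refractory=3):
    """Return sample indices where the signal crosses k times the noise level.

    Assumed technique (not from the paper): noise sigma is estimated as
    MAD / 0.6745, and detections closer than `refractory` samples to the
    previous detection are suppressed.
    """
    med = statistics.median(samples)
    mad = statistics.median(abs(x - med) for x in samples)
    threshold = k * mad / 0.6745
    spikes = []
    last = -refractory - 1  # index of the previous detection
    for i, x in enumerate(samples):
        if abs(x - med) > threshold and i - last > refractory:
            spikes.append(i)
            last = i
    return spikes


if __name__ == "__main__":
    trace = [0.1, -0.2, 0.05, -0.1, 0.15, 8.0, 0.1,
             -0.05, 0.2, -0.15, -9.0, 0.0, 0.1]
    print(detect_spikes(trace))
```

A median-based noise estimate is used here because large spikes inflate a plain standard deviation, which would raise the threshold on the most active channels.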


Subject(s)
Induced Pluripotent Stem Cells; Electrodes; Electrophysiological Phenomena; Humans; Neurons; Software