1.
BMC Bioinformatics ; 17 Suppl 7: 239, 2016 Jul 25.
Article in English | MEDLINE | ID: mdl-27454357

ABSTRACT

BACKGROUND: Analyzing next-generation sequencing data is difficult because datasets are large, second-generation sequencing platforms have high error rates, and each position in the target genome (exome, transcriptome, etc.) is sequenced multiple times. Given these challenges, numerous bioinformatic algorithms have been developed to analyze these data. These algorithms aim to find an appropriate balance between data loss, errors, analysis time, and memory footprint. Typical analysis pipelines require multiple steps. If one or more of these steps is unnecessary, removing it would significantly decrease compute time and data manipulation. One step in many pipelines is PCR duplicate removal; PCR duplicates arise when multiple PCR products from the same template molecule bind on the flowcell. These are often removed because of concern that they can lead to false positive variant calls. Picard (MarkDuplicates) and SAMTools (rmdup) are the two main software tools used for PCR duplicate removal.

RESULTS: Approximately 92% of the 17+ million variants called were called whether we removed duplicates with Picard or SAMTools, or left the PCR duplicates in the dataset. There were no significant differences between the unique variant sets when comparing the transition/transversion ratios (p = 1.0), percentage of novel variants (p = 0.99), average population frequencies (p = 0.99), and the percentage of protein-changing variants (p = 1.0). Results were similar for variants in the American College of Medical Genetics genes. Genotype concordance between NGS and SNP chips was above 99% for all genotype groups (e.g., homozygous reference).

CONCLUSIONS: Our results suggest that PCR duplicate removal has minimal effect on the accuracy of subsequent variant calls.
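The two comparisons reported in the results, the share of variants called under every duplicate-handling strategy and the transition/transversion (Ts/Tv) ratio of a call set, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `(chrom, pos, ref, alt)` tuple representation, and the toy call sets are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of one variant call: (chrom, pos, ref, alt).
Variant = Tuple[str, int, str, str]

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref, alt}
    return pair == PURINES or pair == PYRIMIDINES


def ts_tv_ratio(variants: Iterable[Variant]) -> float:
    """Ts/Tv ratio over single-nucleotide variants; other records are skipped."""
    ts = tv = 0
    for _chrom, _pos, ref, alt in variants:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip indels, ambiguous bases, and non-variant records
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


def shared_fraction(*call_sets: Set[Variant]) -> float:
    """Fraction of all distinct variants that appear in every call set."""
    union = set().union(*call_sets)
    if not union:
        raise ValueError("no variants supplied")
    shared = set.intersection(*call_sets)
    return len(shared) / len(union)


# Toy call sets (invented data) for three duplicate-handling strategies.
no_removal = {("1", 100, "A", "G"), ("1", 200, "C", "T"),
              ("1", 300, "A", "C"), ("2", 50, "G", "T")}
picard = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("1", 300, "A", "C")}
samtools = {("1", 100, "A", "G"), ("1", 200, "C", "T"),
            ("1", 300, "A", "C"), ("2", 75, "G", "A")}

print(shared_fraction(no_removal, picard, samtools))
print(ts_tv_ratio(no_removal & picard & samtools))
```

Keying each call on position plus alleles means two strategies count as agreeing only when they report the same substitution at the same site, which is one reasonable (but not the only) definition of a shared variant.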


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Software; Data Accuracy; Genome, Human; Genomics/methods; Humans; Polymerase Chain Reaction
2.
PLoS Genet ; 10(10): e1004758, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25340798

ABSTRACT

Cerebrospinal fluid (CSF) 42 amino acid species of amyloid beta (Aβ42) and tau levels are strongly correlated with the presence of Alzheimer's disease (AD) neuropathology including amyloid plaques and neurodegeneration and have been successfully used as endophenotypes for genetic studies of AD. Additional CSF analytes may also serve as useful endophenotypes that capture other aspects of AD pathophysiology. Here we have conducted a genome-wide association study of CSF levels of 59 AD-related analytes. All analytes were measured using the Rules Based Medicine Human DiscoveryMAP Panel, which includes analytes relevant to several disease-related processes. Data from two independently collected and measured datasets, the Knight Alzheimer's Disease Research Center (ADRC) and Alzheimer's Disease Neuroimaging Initiative (ADNI), were analyzed separately, and combined results were obtained using meta-analysis. We identified genetic associations with CSF levels of 5 proteins (Angiotensin-converting enzyme (ACE), Chemokine (C-C motif) ligand 2 (CCL2), Chemokine (C-C motif) ligand 4 (CCL4), Interleukin 6 receptor (IL6R) and Matrix metalloproteinase-3 (MMP3)) with study-wide significant p-values (p < 1.46 × 10⁻¹⁰) and significant, consistent evidence for association in both the Knight ADRC and the ADNI samples. These proteins are involved in amyloid processing and pro-inflammatory signaling. SNPs associated with ACE, IL6R and MMP3 protein levels are located within the coding regions of the corresponding structural gene. The SNPs associated with CSF levels of CCL4 and CCL2 are located in known chemokine binding proteins. The genetic associations reported here are novel and suggest mechanisms for genetic control of CSF and plasma levels of these disease-related proteins. Significant SNPs in ACE and MMP3 also showed association with AD risk. Our findings suggest that these proteins/pathways may be valuable therapeutic targets for AD. Robust associations in cognitively normal individuals suggest that these SNPs also influence regulation of these proteins more generally and may therefore be relevant to other diseases.
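The abstract does not say which meta-analysis method combined the Knight ADRC and ADNI results. As a hedged illustration only, the sketch below assumes a fixed-effect inverse-variance model, one common way to combine per-cohort effect estimates, and compares the resulting p-value with the study-wide threshold quoted above. The function name and the effect sizes are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Study-wide significance threshold quoted in the abstract.
STUDY_WIDE_ALPHA = 1.46e-10


def fixed_effect_meta(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effect meta-analysis (an assumed method).

    Returns the pooled effect, its standard error, and a two-sided
    p-value from the normal approximation.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / (se * se) for se in ses]  # w_i = 1 / se_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Invented per-cohort estimates for a single SNP-analyte pair.
beta, se, p = fixed_effect_meta(betas=[0.5, 0.5], ses=[0.1, 0.1])
print(beta, se, p, p < STUDY_WIDE_ALPHA)
```

Weighting by inverse variance gives the more precisely measured cohort more influence on the pooled estimate, and requiring consistent evidence in both cohorts, as the abstract describes, is a separate check layered on top of the pooled p-value.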


Subject(s)
Alzheimer Disease/genetics; Amyloid beta-Peptides/genetics; Matrix Metalloproteinase 3/genetics; Renin/genetics; Alzheimer Disease/blood; Alzheimer Disease/cerebrospinal fluid; Alzheimer Disease/pathology; Amyloid beta-Peptides/cerebrospinal fluid; Blood Proteins/genetics; Chemokine CCL2/genetics; Chemokine CCL4/genetics; Female; Genome-Wide Association Study; Humans; Male; Nerve Growth Factor/genetics; Polymorphism, Single Nucleotide; Receptors, Interleukin-6/genetics; Receptors, Lipoprotein/genetics; tau Proteins/cerebrospinal fluid; tau Proteins/genetics
3.
BMC Bioinformatics ; 15 Suppl 7: S12, 2014.
Article in English | MEDLINE | ID: mdl-25080132

ABSTRACT

BACKGROUND: Since the advent of next-generation sequencing, many previously untestable hypotheses have become testable. Next-generation sequencing has been used for a wide range of studies in diverse fields such as population and medical genetics, phylogenetics, microbiology, and others. However, this novel technology has created unanticipated challenges such as the large numbers of genetic variants. Each Caucasian genome has more than four million single nucleotide variants, insertions and deletions, copy number variants, and structural variants. Several formats have been suggested for storing these variants; however, the variant call format (VCF) has become the community standard.

RESULTS: We developed new software called the Variant Tool Chest (VTC) to provide much-needed tools to work with VCF files. VTC provides a variety of tools for manipulating, comparing, and analyzing VCF files beyond the functionality of existing tools. In addition, VTC was written to be easily extended with new tools.

CONCLUSIONS: Variant Tool Chest brings new and important functionality that complements and integrates well with existing software. VTC is available at https://github.com/mebbert/VariantToolChest.
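To make "comparing VCF files" concrete, here is a minimal Python sketch that reads the fixed leading columns of VCF records (CHROM, POS, ID, REF, ALT) and compares two files with set operations. It does not use or reproduce VTC; the `read_variants` helper and the two in-memory "files" are hypothetical, and genotype columns, filters, and allele normalization are ignored.

```python
from typing import Iterable, Set, Tuple

# Hypothetical variant key: (chrom, pos, ref, alt).
VariantKey = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[VariantKey]:
    """Collect (chrom, pos, ref, alt) keys from tab-separated VCF lines."""
    variants: Set[VariantKey] = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header row
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # multi-allelic sites list ALTs by comma
            if allele != ".":
                variants.add((chrom, int(pos), ref, allele))
    return variants


# Two tiny invented VCF files held in memory.
vcf_a = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "1\t200\t.\tC\tT,G\t60\tPASS\t.",
]
vcf_b = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t200\t.\tC\tT\t55\tPASS\t.",
    "2\t300\t.\tG\tA\t70\tPASS\t.",
]

a = read_variants(vcf_a)
b = read_variants(vcf_b)

print("in both:", sorted(a & b))
print("only in A:", sorted(a - b))
print("in either:", sorted(a | b))
```

Splitting multi-allelic records into one key per alternate allele lets a site with `T,G` in one file match a site with only `T` in another; a real comparison would also need to decide how to treat differing genotypes and representations of the same indel.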


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Software; Databases, Genetic; Genetic Variation; Genome, Human; Genotype; Humans