1.
Front Plant Sci ; 11: 89, 2020.
Article in English | MEDLINE | ID: mdl-32153607

ABSTRACT

Early seeding has been suggested as a method of increasing the grain yield and grain yield stability of wheat (Triticum aestivum L.) in the Northern Great Plains. The point at which early seeding results in a decrease in grain yield has not been clearly identified. Changes in climatic conditions have increased frost-free periods and increased temperatures during grain filling, which can either be taken advantage of or avoided by seeding earlier. Field trials were conducted in western Canada from 2015 to 2018 to evaluate an ultra-early wheat planting system based on soil temperature triggers as opposed to calendar dates. Planting began when soil temperatures at 5 cm depth reached 0°C and continued at 2°C intervals until 10°C, regardless of calendar date. Conventional commercial spring wheat genetics and newly identified cold tolerant spring wheat lines were evaluated to determine if ultra-early wheat seeding systems required further development of specialized varieties to maintain system stability. Ultra-early seeding resulted in no detrimental effect on grain yield. Grain yield increased at sites south of 51° latitude N, and was unaffected by ultra-early seeding at sites north of 51° latitude N. Grain protein content, kernel weight, and bulk density were not affected by ultra-early seeding. Optimal seeding time was identified between 2 and 6°C soil temperatures. A greater reduction in grain yield was observed from delaying planting until soils reached 10°C than from seeding into 0°C soils; this was despite extreme environmental conditions after initial seeding, including air temperatures as low as -10.2°C, and as many as 37 nights with air temperatures below 0°C. Wheat emergence ranged from 55 to 70%, and heads m⁻² decreased with delayed seeding while heads plant⁻¹ did not change. Cold tolerant wheat lines did not increase stability of the ultra-early wheat seeding system relative to the conventional spring wheat check, and are therefore not required for growers to adopt ultra-early seeding. The results of this study indicate that growers in western Canada can successfully begin seeding wheat earlier, with few changes to their current management practices, and endure less risk than delaying seeding until soil temperatures reach 10°C or greater.

2.
Cell ; 179(4): 984-1002.e36, 2019 10 31.
Article in English | MEDLINE | ID: mdl-31675503

ABSTRACT

Genomic studies in African populations provide unique opportunities to understand disease etiology, human diversity, and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans was best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, hematological, lipid, and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region.


Subject(s)
Black People/genetics , Genetic Predisposition to Disease , Genome, Human/genetics , Genomics , Female , Gene Frequency/genetics , Genome-Wide Association Study , Humans , Male , Polymorphism, Single Nucleotide/genetics , Uganda/epidemiology , Whole Genome Sequencing
3.
Nat Genet ; 51(2): 343-353, 2019 02.
Article in English | MEDLINE | ID: mdl-30692680

ABSTRACT

Loci discovered by genome-wide association studies predominantly map outside protein-coding genes. The interpretation of the functional consequences of non-coding variants can be greatly enhanced by catalogs of regulatory genomic regions in cell lines and primary tissues. However, robust and readily applicable methods are still lacking by which to systematically evaluate the contribution of these regions to genetic variation implicated in diseases or quantitative traits. Here we propose a novel approach that leverages genome-wide association studies' findings with regulatory or functional annotations to classify features relevant to a phenotype of interest. Within our framework, we account for major sources of confounding not offered by current methods. We further assess enrichment of genome-wide association studies for 19 traits within Encyclopedia of DNA Elements- and Roadmap-derived regulatory regions. We characterize unique enrichment patterns for traits and annotations driving novel biological insights. The method is implemented in standalone software and an R package, to facilitate its application by the research community.


Subject(s)
Disease/genetics , Genome/genetics , Genome-Wide Association Study/methods , Genomics/methods , Humans , Molecular Sequence Annotation/methods , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Regulatory Sequences, Nucleic Acid/genetics , Software
4.
Hum Mol Genet ; 26(19): 3850-3858, 2017 10 01.
Article in English | MEDLINE | ID: mdl-28934396

ABSTRACT

Osteoarthritis (OA) is a common complex disease with high public health burden and no curative therapy. High bone mineral density (BMD) is associated with an increased risk of developing OA, suggesting a shared underlying biology. Here, we performed the first systematic overlap analysis of OA and BMD on a genome-wide scale. We used summary statistics from the GEFOS consortium for lumbar spine (n = 31,800) and femoral neck (n = 32,961) BMD, and from the arcOGEN consortium for three OA phenotypes (hip, n_cases = 3,498; knee, n_cases = 3,266; hip and/or knee, n_cases = 7,410; n_controls = 11,009). Performing LD score regression we found a significant genetic correlation between the combined OA phenotype (hip and/or knee) and lumbar spine BMD (r_g = 0.18, P = 2.23 × 10⁻²), which may be driven by the presence of spinal osteophytes. We identified 143 variants with evidence for cross-phenotype association which we took forward for replication in independent large-scale OA datasets, and subsequent meta-analysis with arcOGEN for a total sample size of up to 23,425 cases and 236,814 controls. We found robustly replicating evidence for association with OA at rs12901071 (OR 1.08, 95% CI 1.05-1.11, P_meta = 3.12 × 10⁻¹⁰), an intronic variant in the SMAD3 gene, which is known to play a role in bone remodeling and cartilage maintenance. We were able to confirm expression of SMAD3 in intact and degraded cartilage of the knee and hip. Our findings provide the first systematic evaluation of pleiotropy between OA and BMD, highlight genes with biological relevance to both traits, and establish a robust new OA genetic risk locus at SMAD3.


Subject(s)
Bone Density/genetics , Osteoarthritis/genetics , Smad3 Protein/genetics , Databases, Nucleic Acid , Femur Neck/chemistry , Femur Neck/physiology , Genetic Association Studies/methods , Genetic Pleiotropy/genetics , Humans , Lumbar Vertebrae/physiology , Osteoarthritis/etiology , Osteoarthritis, Hip/genetics , Osteoarthritis, Knee/genetics , Risk Factors , Smad3 Protein/metabolism
5.
Sci Rep ; 7(1): 8935, 2017 08 21.
Article in English | MEDLINE | ID: mdl-28827734

ABSTRACT

Osteoarthritis (OA) is a common disease characterized by cartilage degeneration and joint remodeling. The underlying molecular changes underpinning disease progression are incompletely understood. We investigated genes and pathways that mark OA progression in isolated primary chondrocytes taken from paired intact versus degraded articular cartilage samples across 38 patients undergoing joint replacement surgery (discovery cohort: 12 knee OA, replication cohorts: 17 knee OA, 9 hip OA patients). We combined genome-wide DNA methylation, RNA sequencing, and quantitative proteomics data. We identified 49 genes differentially regulated between intact and degraded cartilage in at least two -omics levels, 16 of which have not previously been implicated in OA progression. Integrated pathway analysis implicated the involvement of extracellular matrix degradation, collagen catabolism and angiogenesis in disease progression. Using independent replication datasets, we showed that the direction of change is consistent for over 90% of differentially expressed genes and differentially methylated CpG probes. AQP1, COL1A1 and CLEC3B were significantly differentially regulated across all three -omics levels, confirming their differential expression in human disease. Through integration of genome-wide methylation, gene and protein expression data in human primary chondrocytes, we identified consistent molecular players in OA progression that replicated across independent datasets and that have translational potential.


Subject(s)
Aquaporin 1/genetics , Chondrocytes/metabolism , Collagen Type I/genetics , DNA Methylation , Lectins, C-Type/genetics , Osteoarthritis, Hip/surgery , Osteoarthritis, Knee/surgery , Aquaporin 1/metabolism , Arthroplasty, Replacement, Hip , Arthroplasty, Replacement, Knee , Case-Control Studies , Chondrocytes/chemistry , Chromatography, Liquid , Collagen Type I/metabolism , Collagen Type I, alpha 1 Chain , Disease Progression , Epigenesis, Genetic , Epigenomics/methods , Gene Expression Profiling/methods , Gene Regulatory Networks , Humans , Lectins, C-Type/metabolism , Male , Mass Spectrometry , Osteoarthritis, Hip/genetics , Osteoarthritis, Hip/metabolism , Osteoarthritis, Knee/genetics , Osteoarthritis, Knee/metabolism , Proteomics/methods , Sequence Analysis, RNA
6.
Am J Hum Genet ; 100(6): 865-884, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28552196

ABSTRACT

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates. Of the 106 signals, 6 are in genomic regions that have not been implicated with related traits before, 28 are independent signals at previously reported regions, and 72 represent previously reported signals for a different anthropometric trait. 71% of signals reside within genes and fine mapping resolves 23 signals to one or two likely causal variants. We confirm genetic overlap between human monogenic and polygenic anthropometric traits and find signal enrichment in cis expression QTLs in relevant tissues. Our results highlight the potential of WGS strategies to enhance biologically relevant discoveries across the frequency spectrum.


Subject(s)
Anthropometry , Genome, Human , Genome-Wide Association Study , Quantitative Trait Loci/genetics , Sequence Analysis, DNA/methods , Body Height/genetics , Cohort Studies , DNA Methylation/genetics , Databases, Genetic , Female , Genetic Variation , Humans , Lipodystrophy/genetics , Male , Meta-Analysis as Topic , Obesity/genetics , Physical Chromosome Mapping , Sex Characteristics , Syndrome , United Kingdom
7.
Genome Biol ; 17(1): 122, 2016 06 06.
Article in English | MEDLINE | ID: mdl-27268795

ABSTRACT

The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.


Subject(s)
Genetic Variation , Molecular Sequence Annotation/methods , Software , Computational Biology , Databases, Nucleic Acid , Genomics , Humans , Internet
8.
Nat Genet ; 48(6): 593-9, 2016 06.
Article in English | MEDLINE | ID: mdl-27111036

ABSTRACT

We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.


Subject(s)
Chromosomes, Human, Y , Demography , Haplotypes , Humans , Male , Mutation , Phylogeny , Polymorphism, Single Nucleotide
9.
BMC Genomics ; 16 Suppl 8: S2, 2015.
Article in English | MEDLINE | ID: mdl-26110515

ABSTRACT

BACKGROUND: A vast amount of DNA variation is being identified by increasingly large-scale exome and genome sequencing projects. To be useful, variants require accurate functional annotation and a wide range of tools are available to this end. McCarthy et al. recently demonstrated the large differences in prediction of loss-of-function (LoF) variation when RefSeq and Ensembl transcripts are used for annotation, highlighting the importance of the reference transcripts on which variant functional annotation is based. RESULTS: We describe a detailed analysis of the similarities and differences between the gene and transcript annotation in the GENCODE and RefSeq genesets. We demonstrate that the GENCODE Comprehensive set is richer in alternative splicing, novel CDSs, novel exons and has higher genomic coverage than RefSeq, while the GENCODE Basic set is very similar to RefSeq. Using RNAseq data we show that exons and introns unique to one geneset are expressed at a similar level to those common to both. We present evidence that the differences in gene annotation lead to large differences in variant annotation where GENCODE and RefSeq are used as reference transcripts, although this is predominantly confined to non-coding transcripts and UTR sequence, with at most ~30% of LoF variants annotated discordantly. We also describe an investigation of dominant transcript expression, showing that it both supports the utility of the GENCODE Basic set in providing a smaller set of more highly expressed transcripts and provides a useful, biologically-relevant filter for further reducing the complexity of the transcriptome. CONCLUSIONS: The reference transcripts selected for variant functional annotation do have a large effect on the outcome. The GENCODE Comprehensive transcripts contain more exons, have greater genomic coverage and capture many more variants than RefSeq in both genome and exome datasets, while the GENCODE Basic set shows a higher degree of concordance with RefSeq and has fewer unique features. We propose that the GENCODE Comprehensive set has great utility for the discovery of new variants with functional potential, while the GENCODE Basic set is more suitable for applications demanding less complex interpretation of functional variants.


Subject(s)
Computational Biology , Genome, Human , Molecular Sequence Annotation , Protein Isoforms/metabolism , Software , Alternative Splicing , Databases, Genetic , Humans , Protein Isoforms/genetics , Transcriptome
10.
Nature ; 517(7534): 327-32, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25470054

ABSTRACT

Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.


Subject(s)
Genetic Variation/genetics , Genetics, Medical/trends , Genome, Human/genetics , Genomics/trends , Africa , Africa South of the Sahara , Asia/ethnology , Europe/ethnology , Humans , Risk Factors , Selection, Genetic/genetics
11.
Bioinformatics ; 31(1): 143-5, 2015 Jan 01.
Article in English | MEDLINE | ID: mdl-25236461

ABSTRACT

MOTIVATION: We present a Web service to access Ensembl data using Representational State Transfer (REST). The Ensembl REST server enables the easy retrieval of a wide range of Ensembl data by most programming languages, using standard formats such as JSON and FASTA while minimizing client work. We also introduce bindings to the popular Ensembl Variant Effect Predictor tool permitting large-scale programmatic variant analysis independent of any specific programming language. AVAILABILITY AND IMPLEMENTATION: The Ensembl REST API can be accessed at http://rest.ensembl.org and source code is freely available under an Apache 2.0 license from http://github.com/Ensembl/ensembl-rest.


Subject(s)
Computational Biology/methods , Databases, Factual , Programming Languages , Software , Genetic Variation , Genomics , Humans
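
The REST access pattern described in the abstract above can be sketched as follows. This is a minimal illustration rather than code from the article: the `lookup/symbol` endpoint path, the HTTPS form of the server address, and the helper names `build_lookup_url` and `fetch_json` are assumptions made for this example, and the current Ensembl REST documentation should be consulted before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed base address; the abstract gives http://rest.ensembl.org.
REST_SERVER = "https://rest.ensembl.org"


def build_lookup_url(symbol, species="homo_sapiens"):
    """Build the URL for a gene-symbol lookup (endpoint path is an assumption)."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(symbol)
    )
    return REST_SERVER + path


def fetch_json(url, timeout=30):
    """Request a resource as JSON and decode it into Python objects."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires network access; prints whatever record the server returns.
    record = fetch_json(build_lookup_url("BRCA2"))
    print(record)
```

Because the server returns standard JSON, the same request could be issued from any language with an HTTP client, which is the language independence the abstract emphasizes.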
12.
Nat Commun ; 5: 5345, 2014 Nov 06.
Article in English | MEDLINE | ID: mdl-25373335

ABSTRACT

Isolated populations are emerging as a powerful study design in the search for low-frequency and rare variant associations with complex phenotypes. Here we genotype 2,296 samples from two isolated Greek populations, the Pomak villages (HELIC-Pomak) in the North of Greece and the Mylopotamos villages (HELIC-MANOLIS) in Crete. We compare their genomic characteristics to the general Greek population and establish them as genetic isolates. In the MANOLIS cohort, we observe an enrichment of missense variants among the variants that have drifted up in frequency by more than fivefold. In the Pomak cohort, we find novel associations at variants on chr11p15.4 showing large allele frequency increases (from 0.2% in the general Greek population to 4.6% in the isolate) with haematological traits, for example, with mean corpuscular volume (rs7116019, P = 2.3 × 10⁻²⁶). We replicate this association in a second set of Pomak samples (combined P = 2.0 × 10⁻³⁶). We demonstrate significant power gains in detecting medical trait associations.


Subject(s)
Genetic Drift , Genetic Variation/genetics , Genetics, Population , Genotype , Mutation, Missense/genetics , Population/genetics , Adolescent , Blood Cells/cytology , Cell Size , Cohort Studies , Gene Frequency/genetics , Genome-Wide Association Study , Greece , Haplotypes/genetics , Humans , Phenotype , Social Isolation
13.
Nat Methods ; 11(3): 294-6, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24487584

ABSTRACT

Identifying functionally relevant variants against the background of ubiquitous genetic variation is a major challenge in human genetics. For variants in protein-coding regions, our understanding of the genetic code and splicing allows us to identify likely candidates, but interpreting variants outside genic regions is more difficult. Here we present genome-wide annotation of variants (GWAVA), a tool that supports prioritization of noncoding variants by integrating various genomic and epigenomic annotations.


Subject(s)
Molecular Sequence Annotation , Untranslated Regions/genetics , Algorithms , Computer Simulation , Genetic Variation , Humans
14.
Am J Hum Genet ; 94(2): 176-85, 2014 Feb 06.
Article in English | MEDLINE | ID: mdl-24412096

ABSTRACT

We have investigated the evidence for positive selection in samples of African, European, and East Asian ancestry at 65 loci associated with susceptibility to type 2 diabetes (T2D) previously identified through genome-wide association studies. Selection early in human evolutionary history is predicted to lead to ancestral risk alleles shared between populations, whereas late selection would result in population-specific signals at derived risk alleles. By using a wide variety of tests based on the site frequency spectrum, haplotype structure, and population differentiation, we found no global signal of enrichment for positive selection when we considered all T2D risk loci collectively. However, in a locus-by-locus analysis, we found nominal evidence for positive selection at 14 of the loci. Selection favored the protective and risk alleles in similar proportions, rather than the risk alleles specifically as predicted by the thrifty gene hypothesis, and may not be related to influence on diabetes. Overall, we conclude that past positive selection has not been a powerful influence driving the prevalence of T2D risk alleles.


Subject(s)
Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/genetics , Genetic Loci , Genetic Predisposition to Disease , Selection, Genetic , Alleles , Asian People/genetics , Black People/genetics , Gene Frequency , Genome-Wide Association Study , Haplotypes , Humans , Polymorphism, Single Nucleotide , Risk Factors , White People/genetics
15.
Nat Commun ; 4: 2872, 2013.
Article in English | MEDLINE | ID: mdl-24343240

ABSTRACT

Isolated populations can empower the identification of rare variation associated with complex traits through next-generation association studies, but the generalizability of such findings remains unknown. Here we genotype 1,267 individuals from a Greek population isolate on the Illumina HumanExome Beadchip, in search of functional coding variants associated with lipid traits. We find genome-wide significant evidence for association between R19X, a functional variant in APOC3, and increased high-density lipoprotein and decreased triglyceride levels. Approximately 3.8% of individuals are heterozygous for this cardioprotective variant, which was previously thought to be private to the Amish founder population. R19X is rare (<0.05% frequency) in outbred European populations. The increased frequency of R19X enables discovery of this lipid traits signal at genome-wide significance in a small sample size. This work exemplifies the value of isolated populations in successfully detecting transferable rare variant associations of high medical relevance.


Subject(s)
Apolipoprotein C-III/genetics , Cardiovascular Diseases/genetics , Cardiovascular Diseases/prevention & control , Genetic Variation , White People/genetics , Adult , Aged , Aged, 80 and over , Apolipoprotein C-III/metabolism , Cardiovascular Diseases/metabolism , Female , Gene Frequency , Genome-Wide Association Study , Greece , Haplotypes , Humans , Lipoproteins, HDL/metabolism , Male , Middle Aged , Polymorphism, Single Nucleotide , Triglycerides/metabolism
16.
Science ; 342(6154): 1235587, 2013 Oct 04.
Article in English | MEDLINE | ID: mdl-24092746

ABSTRACT

Interpreting variants, especially noncoding ones, in the increasing number of personal genomes is challenging. We used patterns of polymorphisms in functionally annotated regions in 1092 humans to identify deleterious variants; then we experimentally validated candidates. We analyzed both coding and noncoding regions, with the former corroborating the latter. We found regions particularly sensitive to mutations ("ultrasensitive") and variants that are disruptive because of mechanistic effects on transcription-factor binding (that is, "motif-breakers"). We also found variants in regions with higher network centrality tend to be deleterious. Insertions and deletions followed a similar pattern to single-nucleotide variants, with some notable exceptions (e.g., certain deletions and enhancers). On the basis of these patterns, we developed a computational tool (FunSeq), whose application to ~90 cancer genomes reveals nearly a hundred candidate noncoding drivers.


Subject(s)
Genetic Variation , Molecular Sequence Annotation/methods , Neoplasms/genetics , Binding Sites/genetics , Genome, Human , Genomics , Humans , Kruppel-Like Transcription Factors/metabolism , Mutation , Polymorphism, Single Nucleotide , Population/genetics , RNA, Untranslated/genetics , Selection, Genetic
17.
Nat Methods ; 10(8): 723-9, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23900255

ABSTRACT

The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.


Subject(s)
Computational Biology/methods , Genome, Human , Neoplasms/genetics , Genetic Variation , Humans , Mutation
18.
Nucleic Acids Res ; 41(Database issue): D48-55, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203987

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.


Subject(s)
Databases, Genetic , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Internet , Mice , Molecular Sequence Annotation , Rats , Software , Zebrafish/genetics
19.
Hum Hered ; 73(1): 47-51, 2012.
Article in English | MEDLINE | ID: mdl-22261837

ABSTRACT

AIMS: Next-generation sequencing has opened the possibility of large-scale sequence-based disease association studies. A major challenge in interpreting whole-exome data is predicting which of the discovered variants are deleterious or neutral. To address this question in silico, we have developed a score called Combined Annotation scoRing toOL (CAROL), which combines information from 2 bioinformatics tools: PolyPhen-2 and SIFT, in order to improve the prediction of the effect of non-synonymous coding variants. METHODS: We used a weighted Z method that combines the probabilistic scores of PolyPhen-2 and SIFT. We defined 2 dataset pairs to train and test CAROL using information from the dbSNP: 'HGMD-PUBLIC' and 1000 Genomes Project databases. The training pair comprises a total of 980 positive control (disease-causing) and 4,845 negative control (non-disease-causing) variants. The test pair consists of 1,959 positive and 9,691 negative controls. RESULTS: CAROL has higher predictive power and accuracy for the effect of non-synonymous variants than each individual annotation tool (PolyPhen-2 and SIFT) and benefits from higher coverage. CONCLUSION: The combination of annotation tools can help improve automated prediction of whole-genome/exome non-synonymous variant functional consequences.


Subject(s)
Genomics/methods , Molecular Sequence Annotation/methods , Software , Algorithms , Humans , Polymorphism, Single Nucleotide , ROC Curve
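
The abstract above mentions a weighted Z method for combining two probabilistic scores. The sketch below shows the generic weighted Z (Stouffer-style) combination of one-sided p-values, Z = Σ wᵢzᵢ / √(Σ wᵢ²). It is an illustration of that general technique only: the abstract does not say how CAROL converts PolyPhen-2 and SIFT scores into probabilities or how it chooses the weights, so the function name `weighted_z`, the inputs, and the weights used here are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z(p_values, weights):
    """Combine one-sided p-values with the generic weighted Z method.

    Each p-value must lie strictly between 0 and 1. Returns the combined
    Z score and its one-sided p-value. Generic sketch, not CAROL's formula.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("p_values and weights must be non-empty and equal in length")
    norm = NormalDist()
    # Convert each p-value to a standard normal deviate (small p -> large z).
    z_scores = [norm.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator
    combined_p = 1.0 - norm.cdf(combined_z)
    return combined_z, combined_p


if __name__ == "__main__":
    # Hypothetical inputs: two tools, each giving p = 0.05, equally weighted.
    z, p = weighted_z([0.05, 0.05], [1.0, 1.0])
    print(z, p)
```

With equal weights, two concordant moderate signals reinforce each other, giving a combined p-value smaller than either input; unequal weights would let a more reliable predictor dominate the result.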
20.
Nucleic Acids Res ; 40(Database issue): D84-90, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22086963

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.


Subject(s)
Databases, Genetic , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Mice , Molecular Sequence Annotation , Rats