Search | BVS Regional Portal

Evaluation of variant identification methods for whole genome sequencing data in dairy cattle.

Baes, Christine F; Dolezal, Marlies A; Koltes, James E; Bapst, Beat; Fritz-Waters, Eric; Jansen, Sandra; Flury, Christine; Signer-Hasler, Heidi; Stricker, Christian; Fernando, Rohan; Fries, Ruedi; Moll, Juerg; Garrick, Dorian J; Reecy, James M; Gredler, Birgit.

BMC Genomics ; 15: 948, 2014 Nov 01.

Article in English | MEDLINE | ID: mdl-25361890

ABSTRACT

BACKGROUND: Advances in human genomics have allowed unprecedented productivity in terms of algorithms, software, and literature available for translating raw next-generation sequence data into high-quality information. The challenges of variant identification in organisms with lower-quality reference genomes are less well documented. We explored the consequences of commonly recommended preparatory steps and the effects of single- and multi-sample variant identification methods using four publicly available software applications (Platypus, HaplotypeCaller, Samtools and UnifiedGenotyper) on whole genome sequence data of 65 key ancestors of Swiss dairy cattle populations. Accuracy of calling next-generation sequence variants was assessed by comparison to the same loci from medium- and high-density single nucleotide variant (SNV) arrays.

RESULTS: The total number of SNVs identified varied by software and method, ranging from 17.7 to 22.0 million variants for single-sample methods and from 16.9 to 22.0 million for multi-sample methods. Computing time varied considerably between software. Preparatory realignment of insertions and deletions and subsequent base quality score recalibration had only minor effects on the number and quality of SNVs identified by different software, but increased computing time considerably. Across software, average concordance with high-density chip data was 58.3% for single-sample results and 87.0% for multi-sample results, and average genotype concordance in correctly identified SNVs was 99.2% for both. The average quality of SNVs identified, measured as the ratio of transitions to transversions, was higher using single-sample methods than multi-sample methods. A consensus approach using results of different software generally provided the highest variant quality in terms of transition/transversion ratio.

CONCLUSIONS: Our findings serve as a reference for variant identification pipeline development in non-human organisms and help assess the implication of preparatory steps in next-generation sequencing pipelines for organisms with incomplete reference genomes (pipeline code is included). Benchmarking this information should prove particularly useful in processing next-generation sequencing data for use in genome-wide association studies and genomic selection.
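
The two quality measures named in the abstract, the transition/transversion ratio and concordance with array genotypes, can be illustrated with a minimal Python sketch. This is not the pipeline code the authors mention. The function names, the input layout, and the exact definitions of the two concordance rates are assumptions made for illustration, since the abstract does not spell them out.

```python
# Illustrative sketch only; names and definitions are assumptions,
# not taken from the published pipeline.
BASES = set("ACGT")
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def titv_ratio(snvs):
    """Return the transition/transversion ratio for (ref, alt) allele pairs."""
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip indels, ambiguous bases and non-variant records
        if frozenset((ref, alt)) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else float("nan")


def array_concordance(array_gt, seq_gt):
    """Compare sequence-derived genotypes with array genotypes.

    Both arguments map (chrom, pos) -> unphased genotype such as "A/G".
    Returns (site_rate, genotype_rate), under these assumed definitions:
      site_rate     = share of array loci also present in the sequence calls
      genotype_rate = share of those shared loci with identical genotypes
    """
    shared = [locus for locus in array_gt if locus in seq_gt]
    site_rate = len(shared) / len(array_gt) if array_gt else float("nan")
    same = sum(
        sorted(array_gt[locus].split("/")) == sorted(seq_gt[locus].split("/"))
        for locus in shared
    )
    genotype_rate = same / len(shared) if shared else float("nan")
    return site_rate, genotype_rate
```

For example, `titv_ratio([("A", "G"), ("C", "T"), ("A", "C")])` counts two transitions and one transversion. Genotypes are compared after sorting alleles, so `"A/G"` and `"G/A"` are treated as the same unphased call.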

Subjects

Cattle; Genetic Variation; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Algorithms; Animals; Genome; Software

Estimates of missing heritability for complex traits in Brown Swiss cattle.

Román-Ponce, Sergio-Iván; Samoré, Antonia B; Dolezal, Marlies A; Bagnato, Alessandro; Meuwissen, Theo H E.

Genet Sel Evol ; 46: 36, 2014 Jun 04.

Article in English | MEDLINE | ID: mdl-24898214

ABSTRACT

BACKGROUND: Genomic selection estimates genetic merit based on dense SNP (single nucleotide polymorphism) genotypes and phenotypes. This requires that SNPs explain a large fraction of the genetic variance. The objectives of this work were: (1) to estimate the fraction of genetic variance explained by dense genome-wide markers using 54 K SNP chip genotyping, and (2) to evaluate the effect of alternative marker-based relationship matrices and corrections for the base population on the fraction of the genetic variance explained by markers.

METHODS: Two alternative marker-based relationship matrices were estimated using 35 706 SNPs on 1086 dairy bulls. Pedigree- and marker-based relationship matrices were fitted either simultaneously or separately in an animal model to estimate the fraction of variance not explained by the markers, i.e. the fraction explained by the pedigree. The phenotypes considered in the analysis were the deregressed estimated breeding values (dEBV) for milk, fat and protein yield and for somatic cell score (SCS).

RESULTS: When dEBV were not sufficiently accurate (50 or 70%), the estimated fraction of the genetic variance explained by the markers was around 65% for yield traits and 45% for SCS. Scaling marker genotypes with locus-specific frequencies of heterozygotes slightly increased the variance explained by markers, compared with scaling with the average frequency of heterozygotes across loci. When the two relationship matrices were fitted separately, the estimated fraction of the genetic variance explained by the markers followed the same trends but was underestimated. With less accurate dEBV estimates, the fraction of the genetic variance explained by markers was underestimated, which is probably an artifact due to the dEBV being estimated by a pedigree-based animal model.

CONCLUSIONS: When using only highly accurate dEBV, the proportion of the genetic variance explained by the Illumina 54 K SNP chip was approximately 80% for Brown Swiss cattle. These results depend on the SNP chip used and the family structure of the population, i.e. higher SNP density and closer family relationships are expected to result in a higher fraction of the variance explained by the SNPs.
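
The two genotype scalings compared in the abstract can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation. It assumes genotypes coded as 0/1/2 copies of one allele, allele frequencies estimated from the genotyped animals themselves (the paper's corrections for the base population are not reproduced), and a hypothetical function name `relationship_matrix`.

```python
# Illustrative sketch only; coding, frequency estimates and names are assumptions.
def relationship_matrix(genotypes, locus_specific=False):
    """Build a marker-based relationship matrix from 0/1/2 genotype rows.

    genotypes: list of animals, each a list of allele counts per locus.
    locus_specific=False scales by the heterozygosity summed over loci
    (equivalent to using the average heterozygosity across loci);
    locus_specific=True scales each locus by its own heterozygosity.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Observed allele frequency and expected heterozygosity 2p(1-p) per locus.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    het = [2 * pj * (1 - pj) for pj in p]
    keep = [j for j in range(m) if het[j] > 0]  # drop monomorphic loci
    if not keep:
        raise ValueError("no polymorphic loci")
    # Centre genotypes on twice the allele frequency.
    z = [[row[j] - 2 * p[j] for j in keep] for row in genotypes]
    if locus_specific:
        weights = [1.0 / (len(keep) * het[j]) for j in keep]
    else:
        total_het = sum(het[j] for j in keep)
        weights = [1.0 / total_het] * len(keep)
    return [
        [sum(w * za * zb for w, za, zb in zip(weights, z[a], z[b]))
         for b in range(n)]
        for a in range(n)
    ]
```

When every locus has the same heterozygosity the two scalings give identical matrices. They diverge as allele frequencies vary across loci, because the locus-specific version gives more weight to loci with rare alleles.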

Subjects

Cattle/classification; Cattle/genetics; Genetic Variation; Polymorphism, Single Nucleotide; Alleles; Animals; Breeding; Gene Frequency; Genetic Markers; Genomics/methods; Genotype; Male; Models, Genetic; Oligonucleotide Array Sequence Analysis/veterinary; Pedigree; Phenotype; Quantitative Trait Loci; Quantitative Trait, Heritable
