Results 1 - 11 of 11
1.
Mol Ecol Resour ; 15(4): 723-36, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25388640

ABSTRACT

Single nucleotide polymorphisms (SNPs) have become the marker of choice for genetic studies in organisms of conservation, commercial or biological interest. Most SNP discovery projects in nonmodel organisms apply a strategy for identifying putative SNPs based on filtering rules that account for random sequencing errors. Here, we analyse data used to develop 4723 novel SNPs for the commercially important deep-sea fish, orange roughy (Hoplostethus atlanticus), to assess the impact of not accounting for systematic sequencing errors when filtering identified polymorphisms during SNP discovery. We used SAMtools to identify polymorphisms in a Velvet assembly of genomic DNA sequence data from seven individuals. The resulting set of polymorphisms was filtered to minimize 'bycatch', that is, polymorphisms caused by sequencing or assembly error. An Illumina Infinium SNP chip was used to genotype a final set of 7714 polymorphisms across 1734 individuals. Five predictors were examined for their effect on the probability of obtaining an assayable SNP: depth of coverage, number of reads that support a variant, polymorphism type (e.g. A/C), strand-bias and Illumina SNP probe design score. Our results indicate that filtering out systematic sequencing errors could substantially improve the efficiency of SNP discovery. We show that BLASTX can be used as an efficient tool to identify single-copy genomic regions in the absence of a reference genome. The results have implications for research aiming to identify assayable SNPs and build SNP genotyping assays for nonmodel organisms.


Subject(s)
Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Polymorphism, Single Nucleotide , Vertebrates/classification , Vertebrates/genetics , Animals , Computational Biology/methods
2.
Gene ; 482(1-2): 73-7, 2011 Aug 15.
Article in English | MEDLINE | ID: mdl-21620936

ABSTRACT

Copy number variation (CNV) is likely to be an important component of heritable variation in livestock. To characterise CNVs in cattle, we performed a genome wide survey to determine the number, location and gene content of these genomic features. A tiling oligonucleotide array with ~385,000 probes was used for comparative genomic hybridisation of both taurine and zebu cattle. Using a conservative set of calling criteria, a total of 51 CNV were detected that collectively spanned approximately half of one percent of the bovine genome. The size of the average CNV within each animal ranged from 213 kb up to 335 kb. Half of the CNV were detected in a single animal only, whilst the remainder was independently identified in multiple individuals. Analysis was performed to determine the gene content for each CNV region. This revealed that the majority of CNV (82%) spanned at least one gene, with a number of CNV containing genes which are known to control aspects of phenotypic variation in cattle. Whilst additional studies are required to determine the impact of individual CNV, this study confirmed them as an important class of genomic variation in cattle.


Subject(s)
Cattle/genetics , DNA Copy Number Variations/genetics , Genome/genetics , Animals , Comparative Genomic Hybridization , Female , Male , Pedigree , Reproducibility of Results
3.
Proc Natl Acad Sci U S A ; 107(31): 13642-7, 2010 Aug 03.
Article in English | MEDLINE | ID: mdl-20643938

ABSTRACT

We describe a systems biology approach for the genetic dissection of complex traits based on applying gene network theory to the results from genome-wide associations. The associations of single-nucleotide polymorphisms (SNP) that were individually associated with a primary phenotype of interest, age at puberty in our study, were explored across 22 related traits. Genomic regions were surveyed for genes harboring the selected SNP. As a result, an association weight matrix (AWM) was constructed with as many rows as genes and as many columns as traits. Each {i, j} cell value in the AWM corresponds to the z-score normalized additive effect of the ith gene (via its neighboring SNP) on the jth trait. Columnwise, the AWM recovered the genetic correlations estimated via pedigree-based restricted maximum-likelihood methods. Rowwise, a combination of hierarchical clustering, gene network, and pathway analyses identified genetic drivers that would have been missed by standard genome-wide association studies. Finally, the promoter regions of the AWM-predicted targets of three key transcription factors (TFs), estrogen-related receptor gamma (ESRRG), Pal3 motif, bound by a PPAR-gamma homodimer, IR3 sites (PPARG), and Prophet of Pit 1, PROP paired-like homeobox 1 (PROP1), were surveyed to identify binding sites corresponding to those TFs. Applied to our case, the AWM results recapitulate the known biology of puberty, captured experimentally validated binding sites, and identified candidate genes and gene-gene interactions for further investigation.
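The construction of the association weight matrix can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the z-score is taken within each trait column across genes, and the function name `build_awm`, the gene labels, and the effect values are all hypothetical.

```python
from statistics import mean, pstdev


def build_awm(effects):
    """Build a toy association weight matrix (AWM).

    effects maps a gene name to a list of additive SNP effects,
    one value per trait (rows = genes, columns = traits).
    Assumption: each trait column is z-scored across genes.
    """
    genes = sorted(effects)
    n_traits = len(effects[genes[0]])
    columns = []
    for j in range(n_traits):
        col = [effects[g][j] for g in genes]
        mu, sd = mean(col), pstdev(col)
        # Guard against a constant column, which has no spread to scale by.
        columns.append([(x - mu) / sd if sd > 0 else 0.0 for x in col])
    # Transpose back so each gene keeps one row of normalized effects.
    return {g: [columns[j][i] for j in range(n_traits)]
            for i, g in enumerate(genes)}


# Hypothetical effects for three genes on two traits (made-up numbers).
toy_effects = {
    "GENE_A": [1.0, 10.0],
    "GENE_B": [2.0, 20.0],
    "GENE_C": [3.0, 30.0],
}
awm = build_awm(toy_effects)
```

After normalization the two trait columns are on a common scale, so rows (genes) can be clustered and columns (traits) can be correlated, as the abstract describes.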


Subject(s)
Aging , Cattle/genetics , Polymorphism, Single Nucleotide , Animals , Gene Regulatory Networks , Genome-Wide Association Study , Systems Biology
4.
BMC Genomics ; 11: 370, 2010 Jun 11.
Article in English | MEDLINE | ID: mdl-20537189

ABSTRACT

BACKGROUND: Two types of horns are evident in cattle - fixed horns attached to the skull and a variation called scurs, which refers to small loosely attached horns. Cattle lacking horns are referred to as polled. Although the Poll and Scurs loci have been mapped to BTA1 and BTA19, respectively, the underlying genetic basis of these phenotypes is unknown, and so far, no candidate genes regulating these developmental processes have been described. This study is the first reported attempt at transcript profiling to identify genes and pathways contributing to horn and scurs development in Brahman cattle, relative to polled counterparts. RESULTS: Expression patterns in polled, horned and scurs tissues were obtained using the Agilent 44 k bovine array. The most notable feature when comparing transcriptional profiles of developing horn tissues against polled was the down regulation of genes coding for elements of the cadherin junction as well as those involved in epidermal development. We hypothesize this as a key event involved in keratinocyte migration and subsequent horn development. In the polled-scurs comparison, the most prevalent differentially expressed transcripts code for genes involved in extracellular matrix remodelling, which were up regulated in scurs tissues relative to polled. CONCLUSION: For the first time, we describe networks of genes involved in horn and scurs development. Interestingly, we did not observe differential expression in any of the genes present on the fine mapped region of BTA1 known to contain the Poll locus.


Subject(s)
Cattle/growth & development , Cattle/genetics , Gene Expression Profiling , Horns/growth & development , Animals , Cattle/anatomy & histology , Female , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Male , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Reproducibility of Results
5.
BMC Genomics ; 9: 187, 2008 Apr 24.
Article in English | MEDLINE | ID: mdl-18435834

ABSTRACT

BACKGROUND: The extent of linkage disequilibrium (LD) within a population determines the number of markers that will be required for successful association mapping and marker-assisted selection. Most studies on LD in cattle reported to date are based on microsatellite markers or small numbers of single nucleotide polymorphisms (SNPs) covering one or only a few chromosomes. This is the first comprehensive study on the extent of LD in cattle by analyzing data on 1,546 Holstein-Friesian bulls genotyped for 15,036 SNP markers covering all regions of all autosomes. Furthermore, most studies in cattle have used relatively small sample sizes and, consequently, may have had biased estimates of measures commonly used to describe LD. We examine minimum sample sizes required to estimate LD without bias and loss in accuracy. Finally, relatively little information is available on comparative LD structures including other mammalian species such as human and mouse, and we compare LD structure in cattle with public-domain data from both human and mouse. RESULTS: We computed three LD estimates, D', Dvol and r2, for 1,566,890 syntenic SNP pairs and a sample of 365,400 non-syntenic pairs. Mean D' is 0.189 among syntenic SNPs, and 0.105 among non-syntenic SNPs; mean r2 is 0.024 among syntenic SNPs and 0.0032 among non-syntenic SNPs. All three measures of LD for syntenic pairs decline with distance; the decline is much steeper for r2 than for D' and Dvol. The values of D' and Dvol are quite similar. Significant LD in cattle extends to 40 kb (when estimated as r2) and 8.2 Mb (when estimated as D'). The mean values for LD at large physical distances are close to those for non-syntenic SNPs. The minor allele frequency threshold affects the distribution and extent of LD. For unbiased and accurate estimates of LD across marker intervals spanning < 1 kb to > 50 Mb, minimum sample sizes of 400 (for D') and 75 (for r2) are required. The bias due to small sample sizes increases with inter-marker interval. LD in cattle is much less extensive than in a mouse population created from crossing inbred lines, and more extensive than in humans. CONCLUSION: For association mapping in Holstein-Friesian cattle, for a given design, at least one SNP is required for each 40 kb, giving a total requirement of at least 75,000 SNPs for a low power whole-genome scan (median r2 > 0.19) and up to 300,000 markers at 10 kb intervals for a high power genome scan (median r2 > 0.62). For estimation of LD by D' and Dvol with sufficient precision, a sample size of at least 400 is required, whereas for r2 a minimum sample of 75 is adequate.
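For readers unfamiliar with the two main LD measures, the sketch below computes D' and r2 for one pair of biallelic loci using the standard textbook definitions. It is an illustration only: Dvol is not covered, the function names and example frequencies are hypothetical, and the 3,000,000 kb span is simply back-calculated from the abstract's own figures (75,000 markers at 40 kb spacing), not a value the abstract states.

```python
from math import ceil


def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the
    first locus and allele B at the second; p_a and p_b are the
    corresponding allele frequencies.
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def markers_needed(span_kb, spacing_kb):
    """Number of evenly spaced markers needed to cover span_kb."""
    return ceil(span_kb / spacing_kb)


# Hypothetical pair with unequal allele frequencies: D' reaches its
# maximum while r2 stays low.
d, d_prime, r2 = ld_stats(p_ab=0.1, p_a=0.5, p_b=0.1)

# Assumed genome span, back-calculated as 75,000 markers x 40 kb.
ASSUMED_SPAN_KB = 3_000_000
low_power = markers_needed(ASSUMED_SPAN_KB, 40)
high_power = markers_needed(ASSUMED_SPAN_KB, 10)
```

The example pair shows how D' can sit at its ceiling while r2 remains small, which is one reason the two measures can report very different extents of LD.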


Subject(s)
Cattle/genetics , Genome , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Animals , Gene Frequency , Humans , Male , Mice , Synteny
6.
Genome Biol ; 8(7): R152, 2007.
Article in English | MEDLINE | ID: mdl-17663790

ABSTRACT

BACKGROUND: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? RESULTS: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the human, dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence hits to the cow and dog genomes were also converted to the equivalent human genome coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail and outsize BACs located close to the ends of many BAC-CGCs linked BAC-CGCs covering about 70% of the genome to at least one other BAC-CGC on the same chromosome. Using the BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information, human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed. To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668 sheep BACs were positioned on the sheep genome with lower confidence. A coordinate conversion process allowed us to transfer human genes and other genome features to the virtual sheep genome to display on a sheep genome browser. CONCLUSION: We demonstrate that limited sequencing of BACs combined with positioning on a well assembled genome and integrating locations from other less well assembled genomes can yield extensive, detailed subgene-level maps of mammalian genomes, for which genomic resources are currently limited.


Subject(s)
Genome , Genomics , Physical Chromosome Mapping , Sheep, Domestic/genetics , Animals , Base Sequence , Cattle , Chromosomes, Artificial, Bacterial , Dogs , Gene Library , Genome, Human , Humans , Molecular Sequence Data , Sequence Analysis, DNA
7.
Genetics ; 176(2): 763-72, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17435229

ABSTRACT

Analysis of data on 1000 Holstein-Friesian bulls genotyped for 15,036 single-nucleotide polymorphisms (SNPs) has enabled genomewide identification of haplotype blocks and tag SNPs. A final subset of 9195 SNPs in Hardy-Weinberg equilibrium and mapped on autosomes on the bovine sequence assembly (release Btau 3.1) was used in this study. The average intermarker spacing was 251.8 kb. The average minor allele frequency (MAF) was 0.29 (0.05-0.5). Following recent precedents in human HapMap studies, a haplotype block was defined where 95% of combinations of SNPs within a region are in very high linkage disequilibrium. A total of 727 haplotype blocks consisting of ≥3 SNPs were identified. The average block length was 69.7 ± 7.7 kb, which is approximately 5-10 times larger than in humans. These blocks comprised a total of 2964 SNPs and covered 50,638 kb of the sequence map, which constitutes 2.18% of the length of all autosomes. A set of tag SNPs, which will be useful for further fine-mapping studies, has been identified. Overall, the results suggest that as many as 75,000-100,000 tag SNPs would be needed to track all important haplotype blocks in the bovine genome. This would require approximately 250,000 SNPs in the discovery phase.
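The figures reported in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers stated above; the variable names are illustrative, and the derived quantities are rough consistency checks rather than values reported by the authors.

```python
# Figures taken directly from the abstract.
n_blocks = 727             # haplotype blocks with at least 3 SNPs
mean_block_kb = 69.7       # average block length in kb
covered_kb = 50_638        # total sequence covered by blocks, in kb
covered_fraction = 0.0218  # 2.18% of the length of all autosomes
n_snps = 9195              # SNPs retained for the analysis
mean_spacing_kb = 251.8    # average intermarker spacing in kb

# Blocks x mean length should land close to the reported coverage.
approx_covered_kb = n_blocks * mean_block_kb

# Autosome length implied by the coverage percentage.
implied_autosome_kb = covered_kb / covered_fraction

# Map length implied by marker count and spacing (a rough estimate,
# since spacing is averaged within chromosomes).
approx_map_kb = n_snps * mean_spacing_kb

print(f"blocks x mean length:  {approx_covered_kb:,.0f} kb")
print(f"implied autosome span: {implied_autosome_kb:,.0f} kb")
print(f"SNPs x mean spacing:   {approx_map_kb:,.0f} kb")
```

All three derived values agree with the reported ones to within about one percent, so the stated block count, block length, coverage, and marker spacing are mutually consistent.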


Subject(s)
Cattle/genetics , Polymorphism, Single Nucleotide , Animals , Cohort Studies , DNA/genetics , Genotype , Haplotypes , Male
8.
Physiol Genomics ; 28(1): 76-83, 2006 Dec 13.
Article in English | MEDLINE | ID: mdl-16985009

ABSTRACT

We present the application of large-scale multivariate mixed-model equations to the joint analysis of nine gene expression experiments in beef cattle muscle and fat tissues with a total of 147 hybridizations, and we explore 47 experimental conditions or treatments. Using a correlation-based method, we constructed a gene network for 822 genes. Modules of muscle structural proteins and enzymes, extracellular matrix, fat metabolism, and protein synthesis were clearly evident. Detailed analysis of the network identified groupings of proteins on the basis of physical association. For example, expression of three components of the z-disk, MYOZ1, TCAP, and PDLIM3, was significantly correlated. In contrast, expression of these z-disk proteins was not highly correlated with the expression of a cluster of thick (myosins) and thin (actin and tropomyosins) filament proteins or of titin, the third major filament system. However, expression of titin was itself not significantly correlated with the cluster of thick and thin filament proteins and enzymes. Correlation in expression of many fast-twitch muscle structural proteins and enzymes was observed, but slow-twitch-specific proteins were not correlated with the fast-twitch proteins or with each other. In addition, a number of significant associations between genes and transcription factors were also identified. Our results not only recapitulate the known biology of muscle but have also started to reveal some of the underlying associations between and within the structural components of skeletal muscle.
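A correlation-based gene network of the general kind mentioned here can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not say how significance was assessed, so a fixed absolute-correlation cutoff stands in for it, and the gene labels, expression values, and function names are all made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    # A flat profile has no variance, so its correlation is set to 0.
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_network(profiles, threshold):
    """Return edges (gene1, gene2, r) with |r| >= threshold.

    profiles maps a gene name to its expression values across
    conditions. The threshold is a hypothetical stand-in for a
    proper significance test.
    """
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy expression profiles across four conditions (made-up numbers).
toy_profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [1.0, -1.0, 1.0, -1.0],
}
edges = correlation_network(toy_profiles, threshold=0.9)
```

Genes joined by edges form the modules; in practice the expression values would first be normalized, for example with the mixed-model approach the abstract refers to.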


Subject(s)
Gene Expression Profiling , Gene Regulatory Networks , Muscle, Skeletal/metabolism , Oligonucleotide Array Sequence Analysis , Adipose Tissue/metabolism , Animals , Cattle , Magnesium/metabolism , Models, Biological , Protein Biosynthesis , Transcription Factors/genetics , Transcription Factors/metabolism
9.
Genetics ; 174(1): 79-85, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16816421

ABSTRACT

We constructed a metric linkage disequilibrium (LD) map of bovine chromosome 6 (BTA6) on the basis of data from 220 SNPs genotyped on 433 Australian dairy bulls. This metric LD map has distances in LD units (LDUs) that are analogous to centimorgans in linkage maps. The LD map of BTA6 has a total length of 8.9 LDUs. Within the LD map, regions of high LD (represented as blocks) and regions of low LD (steps) are observed, when plotted against the integrated map in kilobases. At the most stringent block definition, namely a set of loci with zero LDU increase over the span of these markers, BTA6 comprises 40 blocks, accounting for 41% of the chromosome. At a slightly lower stringency of block definition (a set of loci covering a maximum of 0.2 LDUs on the LD map), up to 81% of BTA6 is spanned by 46 blocks and with 13 steps that are likely to reflect recombination hot spots. The mean swept radius (the distance over which LD is likely to be useful for mapping) is 13.3 Mb, confirming extensive LD in Holstein-Friesian dairy cattle, which makes such populations ideal for whole-genome association studies.


Subject(s)
Cattle/genetics , Chromosome Mapping/methods , Linkage Disequilibrium , Animals , Chromosomes , Male , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Recombination, Genetic , Semen/chemistry
10.
Bioinformatics ; 21(7): 1112-20, 2005 Apr 01.
Article in English | MEDLINE | ID: mdl-15564293

ABSTRACT

MOTIVATION: Clusters of genes encoding proteins with related functions, or in the same regulatory network, often exhibit expression patterns that are correlated over a large number of conditions. Protein associations and gene regulatory networks can be modelled from expression data. We address the question of which of several normalization methods is optimal prior to computing the correlation of the expression profiles between every pair of genes. RESULTS: We use gene expression data from five experiments with a total of 78 hybridizations and 23 diverse conditions. Nine methods of data normalization are explored based on all possible combinations of normalization techniques according to between and within gene and experiment variation. We compare the resulting empirical distribution of gene x gene correlations with the expectations and apply cross-validation to test the performance of each method in predicting accurate functional annotation. We conclude that normalization methods based on mixed-model equations are optimal.


Subject(s)
Algorithms , Data Interpretation, Statistical , Gene Expression Profiling/methods , Gene Expression Regulation/physiology , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Benchmarking/methods , Computer Simulation , Gene Expression Profiling/standards , Models, Statistical , Numerical Analysis, Computer-Assisted , Oligonucleotide Array Sequence Analysis/standards , Software
11.
Genome ; 47(4): 639-49, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15284868

ABSTRACT

Basal gene expression levels across the bovine gastrointestinal tract (GI) were examined in an attempt to formulate genetic explanations for the differences in function that are known or thought to exist between the various regions. Gene expression along the tract was studied through the random sequencing of a total of 16 412 clones from seven tissue-specific cDNA libraries spanning its length. The expressed sequence tags (ESTs) within each library were clustered to reduce clone redundancy and obtain longer consensus sequences. BLASTN and BLASTX searches against the NCBI human RefSeq databases were used to find putative matches for the bovine sequences and gene ontology assignments were made. Notable similarities and differences in gene expression were observed among the various compartments of the GI tract of the bovine. Many of the prominent transcripts have yet to be reliably identified and the prominence of others may be worthy of further examination. This collection of ESTs represents an important resource for the future construction of a GI tract specific microarray for further gene expression studies.


Subject(s)
Gastrointestinal Tract/metabolism , Gene Expression , Animals , Base Sequence , Cattle , DNA, Complementary/genetics , Expressed Sequence Tags , Gastrointestinal Tract/anatomy & histology , Gene Expression Profiling , Gene Library , Humans , Male