1.
Gigascience ; 4: 18, 2015.
Article in English | MEDLINE | ID: mdl-25897398

ABSTRACT

BACKGROUND: Common chimpanzees (Pan troglodytes) and bonobos (Pan paniscus) are the species most closely related to humans. For this reason, it is especially important to have complete and accurate chimpanzee nucleotide and protein sequences to understand how humans evolved their unique capabilities. We provide transcriptome data from four untransformed cell types derived from the reference Pan troglodytes, "Clint", to better annotate the chimpanzee genome and provide empirical validation for proposed gene models of this important species. FINDINGS: RNA was extracted from primary cells cultured from four tissues: skin, adipose stroma, vascular smooth muscle and skeletal muscle. These four RNA samples were sequenced on the Illumina HiSeq 2000 platform. Sequences were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). Transcripts were assembled, annotated and deposited in the NCBI Transcriptome Shotgun Assembly (TSA) database. CONCLUSIONS: We have provided a high quality annotation of 44,275 transcripts with full-length coding sequence (CDS). This set represented a total of 10,110 unique genes, thus providing empirical support for their existence. This dataset can be used to improve the annotation of the Pan troglodytes genome.


Subject(s)
Pan troglodytes/genetics , Transcriptome , Animals , Databases, Genetic , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , RNA, Messenger/chemistry
2.
Biol Direct ; 9(1): 20, 2014 Oct 14.
Article in English | MEDLINE | ID: mdl-25319552

ABSTRACT

BACKGROUND: The rhesus macaque (Macaca mulatta) is a key species for advancing biomedical research. Like all draft mammalian genomes, the draft rhesus assembly (rheMac2) has gaps, sequencing errors and misassemblies that have prevented automated annotation pipelines from functioning correctly. Another rhesus macaque assembly, CR_1.0, is also available but is substantially more fragmented than rheMac2 with smaller contigs and scaffolds. Annotations for these two assemblies are limited in completeness and accuracy. High quality assembly and annotation files are required for a wide range of studies including expression, genetic and evolutionary analyses. RESULTS: We report a new de novo assembly of the rhesus macaque genome (MacaM) that incorporates both the original Sanger sequences used to assemble rheMac2 and new Illumina sequences from the same animal. MacaM has a weighted average (N50) contig size of 64 kilobases, more than twice the size of the rheMac2 assembly and almost five times the size of the CR_1.0 assembly. The MacaM chromosome assembly incorporates information from previously unutilized mapping data and preliminary annotation of scaffolds. Independent assessment of the assemblies using Ion Torrent read alignments indicates that MacaM is more complete and accurate than rheMac2 and CR_1.0. We assembled messenger RNA sequences from several rhesus tissues into transcripts which allowed us to identify a total of 11,712 complete proteins representing 9,524 distinct genes. Using a combination of our assembled rhesus macaque transcripts and human transcripts, we annotated 18,757 transcripts and 16,050 genes with complete coding sequences in the MacaM assembly. Further, we demonstrate that the new annotations provide greatly improved accuracy as compared to the current annotations of rheMac2. Finally, we show that the MacaM genome provides an accurate resource for alignment of reads produced by RNA sequence expression studies. 
CONCLUSIONS: The MacaM assembly and annotation files provide a substantially more complete and accurate representation of the rhesus macaque genome than rheMac2 or CR_1.0 and will serve as an important resource for investigators conducting next-generation sequencing studies with nonhuman primates. REVIEWERS: This article was reviewed by Dr. Lutz Walter, Dr. Soojin Yi and Dr. Kateryna Makova.


Subject(s)
Genome , Macaca mulatta/genetics , Amino Acid Sequence , Animals , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Molecular Sequence Data , RNA, Messenger/metabolism , Sequence Alignment
3.
Gigascience ; 3: 14, 2014.
Article in English | MEDLINE | ID: mdl-25243066

ABSTRACT

BACKGROUND: Nonhuman primates are important for both biomedical studies and understanding human evolution. Although research in these areas has mostly focused on Old World primates, such as the rhesus macaque, the common marmoset (Callithrix jacchus), a New World primate, offers important advantages in comparison to other primates, such as an accelerated lifespan. To conduct Next Generation expression studies or to study primate evolution, a high quality annotation of the marmoset genome is required. The availability of marmoset transcriptome data from five tissues, including both raw sequences and assembled transcripts, will aid in the annotation of the newly released marmoset assembly. FINDINGS: RNA was extracted from five tissues: skeletal muscle, bladder and hippocampus from a male common marmoset, and cerebral cortex and cerebellum from a female common marmoset. All five RNA samples were sequenced on the Illumina HiSeq 2000 platform. Sequences were deposited in the NCBI Sequence Read Archive. Transcripts were assembled, annotated and deposited in the NCBI Transcriptome Shotgun Assembly database. CONCLUSIONS: We have provided a high quality annotation of 51,163 transcripts with full-length coding sequence. This set represented a total of 10,833 unique genes. In addition to providing empirical support for the existence of these 10,833 genes, we also provide sequence information for 2,422 genes that were not previously identified in the Ensembl annotation of the marmoset genome.
