Results 1 - 4 of 4
1.
Bioinformatics ; 37(6): 750-758, 2021 05 05.
Article in English | MEDLINE | ID: mdl-33063094

ABSTRACT

MOTIVATION: Infection with strains of different subtypes and the subsequent crossover reading between the two strands of genomic RNAs by host cells' reverse transcriptase are the main causes of the vast HIV-1 sequence diversity. Such inter-subtype genomic recombinants can become circulating recombinant forms (CRFs) after widespread transmissions in a population. Complete prediction of all the subtype sources of a CRF strain is a complicated machine learning problem. It is also difficult to understand whether a strain is an emerging new subtype and if so, how to accurately identify the new components of the genetic source. RESULTS: We introduce a multi-label learning algorithm for the complete prediction of multiple sources of a CRF sequence as well as the prediction of its chronological number. The prediction is strengthened by a voting of various multi-label learning methods to avoid biased decisions. In our steps, frequency and position features of the sequences are both extracted to capture signature patterns of pure subtypes and CRFs. The method was applied to 7185 HIV-1 sequences, comprising 5530 pure subtype sequences and 1655 CRF sequences. Results have demonstrated that the method can achieve very high accuracy (reaching 99%) in the prediction of the complete set of labels of HIV-1 recombinant forms. A few wrong predictions are actually incomplete predictions, very close to the complete set of genuine labels. AVAILABILITY AND IMPLEMENTATION: https://github.com/Runbin-tang/The-source-of-HIV-CRFs-prediction. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
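
The abstract names two ingredients: frequency features extracted from sequences, and a vote across several multi-label learners. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names, the choice of k-mer relative frequencies, and the majority threshold of 0.5 are all assumptions made for illustration; the position features mentioned in the abstract are not covered.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Relative frequency of every k-mer over the alphabet ACGT.

    Hypothetical frequency feature; the abstract does not specify
    which frequency features the method uses.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


def vote_labels(predictions, threshold=0.5):
    """Combine label sets from several multi-label classifiers.

    A label is kept when more than `threshold` of the classifiers
    predicted it (assumed voting rule, for illustration only).
    """
    n = len(predictions)
    tally = Counter(label for labels in predictions for label in set(labels))
    return {label for label, c in tally.items() if c / n > threshold}


# Three hypothetical classifiers predict the subtype sources of one sequence.
combined = vote_labels([{"A", "G"}, {"A", "G"}, {"A"}])
```

In this toy run, subtype A is predicted by all three classifiers and G by two of three, so both survive the vote. Voting over whole label sets in this way is one simple route to the "complete set of labels" the abstract refers to.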


Subjects
HIV Infections, HIV-1, Genetic Variation, HIV Infections/genetics, HIV-1/genetics, Humans, Molecular Epidemiology, Phylogeny
2.
RNA ; 19(12): 1693-702, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24152549

ABSTRACT

Adenosine-to-inosine (A-to-I) substitutions are the most common type of RNA editing in mammals. A-to-I RNA editing is particularly widespread in the brain and is known to play important roles in neuronal functions. In this study we investigated RNA-editing changes during human brain development and maturation, as well as evolutionary conservation of RNA-editing patterns across primates. We used high-throughput transcriptome sequencing (RNA-seq) to quantify the RNA-editing levels and assess ontogenetic dynamics of RNA editing at more than 8000 previously annotated exonic A-to-I RNA-editing sites in two brain regions (prefrontal cortex and cerebellum) of humans, chimpanzees, and rhesus macaques. We observed substantial conservation of RNA-editing levels between the brain regions, as well as among the three primate species. Evolutionary changes in RNA editing were nonetheless evident, with 40% of the annotated editing sites studied showing divergent editing levels among the three species and 16.5% of sites displaying statistically significant human-specific editing patterns. Across lifespan, we observed an increase of the RNA-editing level with advanced age in both brain regions of all three primate species.
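
The abstract does not say how the RNA-editing level at a site was computed. A common convention, assumed here, relies on inosine being read as guanosine by the sequencer: the level is the fraction of reads covering the site that show G rather than A. The function name and read counts below are hypothetical.

```python
def editing_level(a_reads, g_reads):
    """Fraction of reads showing G (edited) at an A-to-I site.

    Assumed convention: level = G / (A + G). Returns None when the
    site has no coverage, so it can be excluded downstream.
    """
    total = a_reads + g_reads
    if total == 0:
        return None
    return g_reads / total


# Hypothetical site covered by 30 unedited (A) and 10 edited (G) reads.
level = editing_level(30, 10)
```

Under this convention, comparing levels across ages or species amounts to comparing these per-site fractions, typically after filtering out sites with low coverage.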


Subjects
Cerebellum/metabolism, Macaca mulatta/genetics, Pan troglodytes/genetics, Prefrontal Cortex/metabolism, RNA Editing, Age Factors, Animals, Molecular Evolution, Humans, Single Nucleotide Polymorphism, RNA Sequence Analysis, Species Specificity, Transcriptome
3.
Mol Syst Biol ; 9: 633, 2013.
Article in English | MEDLINE | ID: mdl-23340839

ABSTRACT

While splicing differences between tissues, sexes and species are well documented, little is known about the extent and the nature of splicing changes that take place during human or mammalian development and aging. Here, using high-throughput transcriptome sequencing, we have characterized splicing changes that take place during whole human lifespan in two brain regions: prefrontal cortex and cerebellum. Identified changes were confirmed using independent human and rhesus macaque RNA-seq data sets, exon arrays and PCR, and were detected at the protein level using mass spectrometry. Splicing changes across lifespan were abundant in both of the brain regions studied, affecting more than a third of the genes expressed in the human brain. Approximately 15% of these changes differed between the two brain regions. Across lifespan, splicing changes followed discrete patterns that could be linked to neural functions, and associated with the expression profiles of the corresponding splicing factors. More than 60% of all splicing changes represented a single splicing pattern reflecting preferential inclusion of gene segments potentially targeting transcripts for nonsense-mediated decay in infants and elderly.


Subjects
Aging/genetics, Cerebellum/physiology, Prefrontal Cortex/physiology, RNA Splicing, Adolescent, Adult, Aged, Aged 80 and over, Alternative Splicing, Animals, Cerebellum/growth & development, Child, Preschool Child, Exons, Gene Expression Profiling, Humans, Infant, Newborn Infant, Macaca mulatta/genetics, Middle Aged, Prefrontal Cortex/growth & development, Proteins/genetics, Proteins/metabolism, RNA Sequence Analysis/methods, Young Adult
4.
J Theor Biol ; 269(1): 280-6, 2011 Jan 21.
Article in English | MEDLINE | ID: mdl-21056578

ABSTRACT

Many clustering approaches have been developed for biological data analysis; however, applying traditional clustering algorithms to RNA structure data remains challenging. The difficulty arises from the complex secondary structures that must be handled during clustering. One of the most critical issues in cluster analysis is the development of appropriate distance measures in high-dimensional space. Traditional distance measures focus on scale issues but ignore the correlation between two values. This article develops a novel interval-based distance (Hausdorff) measure for computing the similarity between characterized structures. Three relationships are considered: perfect match, partially overlapped and non-overlapped. Finally, we demonstrate the method by analyzing a data set of RNA secondary structures from the Rfam database.
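
For two closed intervals [a1, b1] and [a2, b2], the Hausdorff distance reduces to max(|a1 - a2|, |b1 - b2|). The sketch below illustrates that textbook formula together with the three relationships named in the abstract. It is not the article's measure: how structures are encoded as intervals, the function names, and the choice to treat containment as "partially overlapped" are assumptions.

```python
def hausdorff_interval(x, y):
    """Hausdorff distance between two closed intervals (start, end)."""
    (a1, b1), (a2, b2) = x, y
    return max(abs(a1 - a2), abs(b1 - b2))


def relationship(x, y):
    """Classify two closed intervals into the three cases of the abstract.

    Assumption: any shared point that is not a perfect match, including
    full containment, counts as "partially overlapped".
    """
    (a1, b1), (a2, b2) = x, y
    if (a1, b1) == (a2, b2):
        return "perfect match"
    if max(a1, a2) <= min(b1, b2):
        return "partially overlapped"
    return "non-overlapped"


# Hypothetical intervals, e.g. start/end positions of structural elements.
pairs = [((1, 5), (1, 5)), ((1, 5), (3, 8)), ((1, 2), (5, 9))]
results = [(relationship(x, y), hausdorff_interval(x, y)) for x, y in pairs]
```

The three example pairs give distances 0, 3 and 7 respectively, so the distance grows as the intervals move from a perfect match through partial overlap to no overlap.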


Subjects
Computational Biology/methods, Nucleic Acid Conformation, RNA/chemistry, Molecular Models