1.
J Comput Biol ; 29(9): 1001-1021, 2022 09.
Article in English | MEDLINE | ID: mdl-35593919

ABSTRACT

Comparing DNA sequences is of great significance in genomic analysis. Although traditional multiple sequence alignment (MSA) is widely used for evolutionary analysis, optimally aligning k sequences becomes computationally intractable as k increases, owing to the intrinsic computational complexity of MSA. Numerous k-mer alignment-free methods have been proposed, but they may not truly capture the contextual structure of the sequences. In this study, we present a novel contextual k-mer alignment-free method, kmer2vec, in which sequence k-mers are semantically embedded as word2vec vectors, an essential technique in natural language processing. The method thus converts each DNA/RNA sequence into a point in the high-dimensional word2vec space and compares sequences in that space. Because the word2vec vectors are trained on the contextual relationships of k-mers in the genomes, the method may extract valuable structural information from the sequences and properly reflect the relationships among them. The proposed method is optimized over the word2vec training parameters and verified in the phylogenetic analysis of large whole genomes, including coronavirus and bacterial genomes. The results demonstrate the effectiveness of the method for phylogenetic tree construction and species clustering. The method runs much faster than MSA, and the phylogenetic relationships it constructs are more accurate than those from the conventional k-mer alignment-free method. This approach can therefore provide new perspectives on phylogeny and evolution and makes it possible to analyze large genomes. In addition, we discuss special parameterization in the construction of the k-mer word2vec embedding. An effective tool for rapid SARS-CoV-2 typing can also be derived by combining kmer2vec with clustering methods.
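
The idea of mapping a sequence to a point in an embedding space can be illustrated with a minimal sketch. The abstract does not say how k-mer vectors are combined into one sequence vector, so the averaging step below is an assumption, as are the function names and the toy two-dimensional embedding table. In practice the table would come from word2vec training on k-mer "sentences", which is not shown here.

```python
import math


def kmers(seq, k):
    """Split a sequence into its overlapping k-mers (the 'words')."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sequence_vector(seq, k, embedding):
    """Average the vectors of all k-mers found in the embedding table.

    Averaging is an assumption made for illustration; the abstract only
    states that each sequence becomes a point in the word2vec space.
    """
    vectors = [embedding[m] for m in kmers(seq, k) if m in embedding]
    if not vectors:
        raise ValueError("no k-mer of the sequence is in the embedding table")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity; small values mean similar sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy embedding for 2-mers; real vectors would be learned.
toy_embedding = {"AC": [1.0, 0.0], "CG": [0.0, 1.0], "GT": [1.0, 1.0]}

point_a = sequence_vector("ACGT", 2, toy_embedding)
point_b = sequence_vector("ACAC", 2, toy_embedding)
distance = cosine_distance(point_a, point_b)
```

A matrix of such pairwise distances could then be passed to a standard tree-building or clustering routine, which is the step where the abstract's phylogenetic analysis and species clustering would take place.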


Subject(s)
Algorithms , COVID-19 , Base Sequence , Humans , Phylogeny , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods
2.
Front Cell Infect Microbiol ; 12: 1085397, 2022.
Article in English | MEDLINE | ID: mdl-36760235

ABSTRACT

Comprehensive identification of possible target cells for viruses is crucial for understanding the pathological mechanisms of viral disease. The susceptibility of cells to viruses depends on many factors. Besides the presence of receptors on the cell surface, effective expression of viral genes is also pivotal for viral infection. The regulation of viral gene expression is a multilevel process that includes transcription, translational initiation and translational elongation. At the translational elongation level, the translational efficiency of viral mRNAs depends mainly on the match between their codon composition and the cellular translational machinery (usually referred to as codon adaptation). Thus, the codon adaptation of viral ORFs in different cell types may be related to the susceptibility of those cells to viruses. In this study, we selected the codon adaptation index (CAI), a common codon adaptation-based indicator of translational efficiency at the translational elongation level, to evaluate the susceptibility of different human cell types to two pandemic viruses (HIV-1 and SARS-CoV-2). Compared with previous studies that evaluated viral infectivity based on codon adaptation, the main advantage of our study is that the analysis is refined to the cell-type level. First, we verified the positive correlation between CAI and translational efficiency, which supports the rationale of our research method. We then calculated CAI for the ORFs of the two viruses in various human cell types. We found that the CAIs of viral ORFs are relatively low compared with those of highly expressed endogenous genes. This implies that the two viruses are not well adapted to the translational regulatory machinery of human cells. We also showed that the susceptibility to viruses presumed from CAI is usually consistent with the results of experimental research, although there are still some exceptions. Finally, we found that the two viruses have different effects on cellular translational mechanisms: HIV-1 decouples CAI from the translational efficiency of endogenous genes in host cells, whereas SARS-CoV-2 ORFs exhibit increased CAI in infected cells. Our results imply that, at least for HIV-1 and SARS-CoV-2, CAI can be regarded as an auxiliary index for assessing the susceptibility of cells to viruses but cannot be used as the only evidence for identifying viral target cells.
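
The abstract does not give the CAI formula, so the sketch below uses the standard definition (Sharp and Li, 1987): each codon gets a weight equal to its count in a reference gene set divided by the count of the most frequent synonymous codon, and CAI is the geometric mean of the weights over an ORF. The choice of reference set (for example, highly expressed genes of one cell type), the function names, and the handling of codons that are absent from the reference set are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(orf):
    """Split an ORF into codons, dropping any trailing partial codon."""
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_orfs):
    """Weight of each codon = its count / count of the best synonymous codon.

    Met, Trp and stop codons are skipped, as is usual for CAI. Codons
    never seen in the reference set get no weight (an assumption; some
    implementations use a small pseudocount instead).
    """
    counts = Counter(c for orf in reference_orfs for c in split_codons(orf))
    weights = {}
    for aa in set(AMINO_ACIDS) - set("MW*"):
        synonymous = [c for c, a in CODON_TABLE.items() if a == aa]
        top = max(counts[c] for c in synonymous)
        if top == 0:
            continue
        for c in synonymous:
            if counts[c] > 0:
                weights[c] = counts[c] / top
    return weights


def cai(orf, weights):
    """Geometric mean of the weights of the ORF's weighted codons."""
    logs = [math.log(weights[c]) for c in split_codons(orf) if c in weights]
    if not logs:
        raise ValueError("ORF contains no codon with a defined weight")
    return math.exp(sum(logs) / len(logs))


# Hypothetical toy reference set: leucine is encoded by CTG three times
# and by CTA once, so CTG is the preferred codon.
toy_weights = relative_adaptiveness(["CTGCTGCTGCTA"])
viral_orf_cai = cai("ATGCTGCTA", toy_weights)
```

Under this definition, comparing cell types amounts to recomputing the weights from each cell type's own reference genes and scoring the same viral ORFs against each weight table.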


Subject(s)
COVID-19 , HIV-1 , Humans , SARS-CoV-2/genetics , HIV-1/genetics , COVID-19/genetics , Codon/genetics , Adaptation, Physiological/genetics