Results 1 - 20 of 58
1.
Beilstein J Org Chem ; 20: 852-858, 2024.
Article in English | MEDLINE | ID: mdl-38655555

ABSTRACT

We confirm the previously revised stereochemistry of spiroviolene by X-ray crystallographic characterization of a hydrazone derivative of 9-oxospiroviolane, which is synthesized by hydroboration/oxidation of spiroviolene followed by oxidation of the resultant hydroxy group. An unexpected thermal boron migration occurred during the hydroboration of spiroviolene, which resulted in a mixture of 1α-hydroxyspiroviolane, 9α-hydroxyspiroviolane and 9β-hydroxyspiroviolane after oxidation. The assertion of the cis-orientation of the 19- and 20-methyl groups provided further support for the revised cyclization mechanism of spiroviolene.

2.
BMC Bioinformatics ; 25(1): 133, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539106

ABSTRACT

Cancer is one of the leading causes of death worldwide. Survival analysis and prediction for cancer patients are of great significance for precision medicine. The robustness and interpretability of survival prediction models are important: robustness indicates whether a model has learned the knowledge, and interpretability indicates whether a model can show humans what it has learned. In this paper, we propose a robust and interpretable model, SurvConvMixer, which uses pathway-customized gene expression images and ConvMixer for short-term, mid-term and long-term overall survival prediction in cancer. With ConvMixer, the representation of each pathway can be learned separately. We show the robustness of our model by testing the trained model on external datasets that were never used in training. The interpretability of SurvConvMixer depends on gradient-weighted class activation mapping (Grad-CAM), by which we can obtain the pathway-level activation heat map. Wilcoxon rank-sum tests are then conducted to obtain the statistically significant pathways, thereby revealing which pathways the model focuses on more. SurvConvMixer achieves remarkable performance on the short-term, mid-term and long-term overall survival of lung adenocarcinoma, lung squamous cell carcinoma and skin cutaneous melanoma, and the external validation tests show that SurvConvMixer can generalize to external datasets, indicating that it is robust. Finally, we investigate the activation maps generated by Grad-CAM; after Wilcoxon rank-sum tests and Kaplan-Meier estimation, we find that some survival-related pathways play an important role in SurvConvMixer.


Subjects
Adenocarcinoma of Lung , Lung Neoplasms , Melanoma , Skin Neoplasms , Humans , Gene Expression
3.
Int J Biol Macromol ; 254(Pt 1): 127721, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37913883

ABSTRACT

Glycosylation at C3-OH is the favorable modification for pharmaceutical activities and diversity expansion of 20(R)-dammarane ginsenosides. The 3-O-glycosylation, exclusively occurring in 20(R)-PPD ginsenosides, has never been achieved in 20(R)-PPT ginsenosides. Herein, 3-O-glycosylation of 20(R)-PPT enabled by a glycosyltransferase (GT) OsSGT2 was achieved with the combined assistance of AlphaFold 2 and molecular docking. Firstly, we combined AlphaFold2 algorithm and molecular docking to predict interactions between 20(R)-PPT and candidate GTs. A catalytically favorable binding geometry was thus identified in the OsSGT2-20(R)-PPT complex, suggesting OsSGT2 might act on 20(R)-PPT. The enzymatic assays demonstrated that OsSGT2 reacted with varied sugar donors to form 20(R)-PPT 3-O-glycosides, exhibiting donor promiscuity. Additionally, OsSGT2 displayed acceptor promiscuity, catalyzing 3-O-glucosylation of 20(R/S)-PPT, 20(R/S)-PPD and 20(R/S)-Rh1, respectively. Protein engineering on OsSGT2 was thus performed to probe its catalytic mechanism underlying its stereoselectivity. The W207A mutant preferred 20(S)-dammarane aglycons, while F395Q/A396G(QG) displayed a conversion enhancement towards both 20(R/S)-dammarane aglycons. The QG mutant was then used to synthesize 20(R)-PPT 3-O-glucoside, which displayed a moderate angiotensin-converting enzyme inhibitory effect with an IC50 of 27.5 ± 4.7 µM, superior to that of its 20(S)-epimer, with the combined assistance of target fishing and reverse docking. The water solubility of 20(R)-PPT 3-O-glucoside increased as well.


Subjects
Ginsenosides , Glycosylation , Ginsenosides/pharmacology , Molecular Docking Simulation , Dammaranes , Glucosides
4.
BMC Bioinformatics ; 24(1): 353, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37730567

ABSTRACT

OBJECTIVE: Breast cancer is a significant health issue for women, and human epidermal growth factor receptor-2 (HER2) plays a crucial role as a vital prognostic and predictive factor. The HER2 status is essential for formulating effective treatment plans for breast cancer. However, the assessment of HER2 status using immunohistochemistry (IHC) is time-consuming and costly. Existing computational methods for evaluating HER2 status have limitations and lack sufficient accuracy. Therefore, there is an urgent need for an improved computational method to better assess HER2 status, which holds significant importance in saving lives and alleviating the burden on pathologists. RESULTS: This paper analyzes the characteristics of histological images of breast cancer and proposes a neural network model named HAHNet that combines multi-scale features with attention mechanisms for HER2 status classification. HAHNet directly classifies the HER2 status from hematoxylin and eosin (H&E) stained histological images, reducing additional costs. It achieves superior performance compared to other computational methods. CONCLUSIONS: According to our experimental results, the proposed HAHNet achieved high performance in classifying the HER2 status of breast cancer using only H&E stained samples. It can be applied in case classification, benefiting the work of pathologists and potentially helping more breast cancer patients.


Subjects
Breast Neoplasms , Humans , Female , Eosine Yellowish-(YS) , Neural Networks, Computer , Staining and Labeling
5.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37615358

ABSTRACT

Non-coding RNA (ncRNA) plays a critical role in biology. ncRNAs from the same family usually have similar functions, as a result, it is essential to predict ncRNA families before identifying their functions. There are two primary methods for predicting ncRNA families, namely, traditional biological methods and computational methods. In traditional biological methods, a lot of manpower and resources are required to predict ncRNA families. Therefore, this paper proposed a new ncRNA family prediction method called MFPred based on computational methods. MFPred identified ncRNA families by extracting sequence features of ncRNAs, and it possessed three primary modules, including (1) four ncRNA sequences encoding and feature extraction module, which encoded ncRNA sequences and extracted four different features of ncRNA sequences, (2) dynamic Bi_GRU and feature fusion module, which extracted contextual information features of the ncRNA sequence and (3) ResNet_SE module that extracted local information features of the ncRNA sequence. In this study, MFPred was compared with the previously proposed ncRNA family prediction methods using two frequently used public ncRNA datasets, NCY and nRC. The results showed that MFPred outperformed other prediction methods in the two datasets.


Subjects
Computational Biology , RNA, Untranslated , Humans , Computational Biology/methods , RNA, Untranslated/genetics
6.
Comput Biol Med ; 164: 107246, 2023 09.
Article in English | MEDLINE | ID: mdl-37487383

ABSTRACT

RNA secondary structure is essential for predicting the tertiary structure and understanding RNA function. Recent research tends to stack numerous modules to design large deep-learning models. This can increase the accuracy to more than 70%, but it also incurs significant training costs and lowers prediction efficiency. We propose a model with three feature extractors called GCNfold. The Structure Extractor utilizes a three-layer Graph Convolutional Network (GCN) to mine the structural information of RNA, such as stems, hairpins, and internal loops. Structure and Sequence Fusion embeds structural information into sequences with Transformer Encoders. The Long-distance Dependency Extractor captures long-range pairwise relationships with a UNet. The experiments indicate that GCNfold has a small number of parameters, a fast inference speed, and a high accuracy among all models with over 80% accuracy. Additionally, GCNfold-Small takes only 90 ms to infer an RNA secondary structure and can achieve close to 90% accuracy on average. The GCNfold code is available on GitHub: https://github.com/EnbinYang/GCNfold.


Subjects
RNA , Protein Structure, Secondary , RNA/genetics
7.
BMC Bioinformatics ; 24(1): 68, 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36849908

ABSTRACT

BACKGROUND: Although research on non-coding RNAs (ncRNAs) is a hot topic in life sciences, the functions of numerous ncRNAs remain unclear. In recent years, researchers have found that ncRNAs of the same family have similar functions, therefore, it is important to accurately predict ncRNAs families to identify their functions. There are several methods available to solve the prediction problem of ncRNAs family, whose main ideas can be divided into two categories, including prediction based on the secondary structure features of ncRNAs, and prediction according to sequence features of ncRNAs. The first type of prediction method requires a complicated process and has a low accuracy in obtaining the secondary structure of ncRNAs, while the second type of method has a simple prediction process and a high accuracy, but there is still room for improvement. The existing methods for ncRNAs family prediction are associated with problems such as complicated prediction processes and low accuracy, in this regard, it is necessary to propose a new method to predict the ncRNAs family more perfectly. RESULTS: A deep learning model-based method, ncDENSE, was proposed in this study, which predicted ncRNAs families by extracting ncRNAs sequence features. The bases in ncRNAs sequences were encoded by one-hot coding and later fed into an ensemble deep learning model, which contained the dynamic bi-directional gated recurrent unit (Bi-GRU), the dense convolutional network (DenseNet), and the Attention Mechanism (AM). To be specific, dynamic Bi-GRU was used to extract contextual feature information and capture long-term dependencies of ncRNAs sequences. AM was employed to assign different weights to features extracted by Bi-GRU and focused the attention on information with greater weights. Whereas DenseNet was adopted to extract local feature information of ncRNAs sequences and classify them by the full connection layer. 
According to our results, the ncDENSE method improved the Accuracy, Sensitivity, Precision, F-score, and MCC by 2.08%, 2.33%, 2.14%, 2.16%, and 2.39%, respectively, compared with the second-best method. CONCLUSIONS: Overall, the ncDENSE method proposed in this paper extracts sequence features of ncRNAs by dynamic Bi-GRU and DenseNet and improves the accuracy and other performance measures in predicting ncRNA families.


Subjects
Biological Science Disciplines , Deep Learning , Humans , RNA, Untranslated/genetics
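
The abstract above states that the bases of each ncRNA sequence are encoded by one-hot coding before being fed to the model. A minimal sketch of such an encoding follows. It is not the ncDENSE code: the function name `one_hot_encode`, the `ACGU` channel order, the mapping of T to U, and the all-zero row for unknown symbols are all assumptions made for illustration.

```python
# Hypothetical sketch of one-hot encoding for RNA bases (not the ncDENSE implementation).
BASES = "ACGU"  # assumed alphabet and channel order


def one_hot_encode(sequence):
    """Return one 4-element row per base.

    Assumptions: input is case-insensitive, DNA-style 'T' is treated as 'U',
    and any symbol outside ACGU (for example 'N') becomes an all-zero row.
    """
    rows = []
    for base in sequence.upper().replace("T", "U"):
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


if __name__ == "__main__":
    for base, row in zip("AUGN", one_hot_encode("AUGN")):
        print(base, row)
```

Each sequence becomes a matrix with one row per position and one column per base, which is the shape a recurrent or convolutional layer would typically consume after padding to a common length.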
8.
Front Genet ; 13: 1062928, 2022.
Article in English | MEDLINE | ID: mdl-36353108

ABSTRACT

[This corrects the article DOI: 10.3389/fgene.2021.634635.].

9.
Article in English | MEDLINE | ID: mdl-36083953

ABSTRACT

Manual wheelchair users are exposed to whole-body vibrations as a direct result of using their wheelchair. Wheels, tires, and caster forks have been developed to reduce or attenuate the vibration that transmits through the frame and reaches the user. Five of these components with energy-absorbing characteristics were compared to standard pneumatic drive wheels and casters. This study used a robotic wheelchair propulsion system to repeatedly drive an ultra-lightweight wheelchair over four common indoor and outdoor surfaces: linoleum tile, decorative brick, poured concrete sidewalk, and expanded aluminum grates. Data from the propulsion system and a seat-mounted accelerometer were used to evaluate the energetic efficiency and vibration exposure of each configuration. Equivalence test results identified meaningful differences in both propulsion cost and seat vibration. LoopWheels and SoftWheels both increased propulsion costs by 12-16% over the default configuration without reducing vibration at the seat. Frog Legs suspension caster forks increased vibration exposure by 16-97% across all four surfaces. Softroll casters reduced vibration by 11% over metal grates. Wide pneumatic 'mountain' tires showed no difference from the default configuration. All vibration measurements were within acceptable ranges compared to health guidance standards. Out of the component options, softroll casters show the most promising results for ease of efficiency and effectiveness at reducing vibrations through the wheelchair frame and seat cushion. These results suggest some components with built-in suspension systems are ineffective at reducing vibration exposure beyond standard components, and often introduce mechanical inefficiencies that the user would have to overcome with every propulsion stroke.


Subjects
Wheelchairs , Aluminum , Equipment Design , Humans , Vibration
10.
Entropy (Basel) ; 24(9)2022 Sep 10.
Article in English | MEDLINE | ID: mdl-36141162

ABSTRACT

Precise iris segmentation is a very important part of accurate iris recognition. Traditional iris segmentation methods require complex prior knowledge and pre- and post-processing and have limited accuracy under non-ideal conditions. Deep learning approaches outperform traditional methods. However, the limitation of a small number of labeled datasets degrades their performance drastically because of the difficulty in collecting and labeling irises. Furthermore, previous approaches ignore the large distribution gap within the non-ideal iris dataset due to illumination, motion blur, squinting eyes, etc. To address these issues, we propose a three-stage training strategy. Firstly, supervised contrastive pretraining is proposed to increase intra-class compactness and inter-class separability to obtain a good pixel classifier under a limited amount of data. Secondly, the entire network is fine-tuned using cross-entropy loss. Thirdly, an intra-dataset adversarial adaptation is proposed, which reduces the intra-dataset gap in the non-ideal situation by aligning the distribution of the hard and easy samples at the pixel class level. Our experiments show that our method improved the segmentation performance and achieved the following encouraging results: 0.44%, 1.03%, 0.66%, 0.41%, and 0.37% in the Nice1 and 96.66%, 98.72%, 93.21%, 94.28%, and 97.41% in the F1 for UBIRIS.V2, IITD, MICHE-I, CASIA-D, and CASIA-T.
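
The results above are reported as a Nice1 error and an F1 score per dataset. As a rough illustration of how such numbers are commonly computed for one binary mask, the sketch below assumes that Nice1 is the fraction of pixels on which the predicted and ground-truth masks disagree, and that F1 is computed over iris pixels as the positive class. The function names and the nested-list mask format are hypothetical, and a dataset-level score would be the mean over images.

```python
# Hypothetical sketch of two segmentation metrics for binary masks (0 = background, 1 = iris).
# Assumption: "Nice1" is read here as the per-image pixel disagreement rate.


def pixel_error_rate(pred, truth):
    """Fraction of pixels where the predicted and ground-truth masks differ."""
    total = 0
    wrong = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            wrong += int(p != t)
    return wrong / total


def f1_score(pred, truth):
    """F1 over the positive (iris) class: 2*TP / (2*TP + FP + FN)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    denominator = 2 * tp + fp + fn
    # Convention assumed here: two empty masks agree perfectly.
    return 2 * tp / denominator if denominator else 1.0


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(pixel_error_rate(predicted, ground_truth))
    print(f1_score(predicted, ground_truth))
```

Under this reading, a lower Nice1 value and a higher F1 value both indicate better agreement with the ground truth.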

11.
J Comput Sci Technol ; 37(4): 991-1002, 2022.
Article in English | MEDLINE | ID: mdl-35992496

ABSTRACT

First discovered in Wuhan, China, SARS-CoV-2 is a highly pathogenic novel coronavirus, which rapidly spread globally and became a pandemic with no vaccine and limited distinctive clinical drugs available till March 13th, 2020. Ribonucleic Acid interference (RNAi) technology, a gene-silencing technology that targets mRNA, can cause damage to RNA viruses effectively. Here, we report a new efficient small interfering RNA (siRNA) design method named Simple Multiple Rules Intelligent Method (SMRI) to propose a new solution of the treatment of COVID-19. To be specific, this study proposes a new model named Base Preference and Thermodynamic Characteristic model (BPTC model) indicating the siRNA silencing efficiency and a new index named siRNA Extended Rules index (SER index) based on the BPTC model to screen high-efficiency siRNAs and filter out the siRNAs that are difficult to take effect or synthesize as a part of the SMRI method, which is more robust and efficient than the traditional statistical indicators under the same circumstances. Besides, to silence the spike protein of SARS-CoV-2 to invade cells, this study further puts forward the SMRI method to search candidate high-efficiency siRNAs on SARS-CoV-2's S gene. This study is one of the early studies applying RNAi therapy to the COVID-19 treatment. According to the analysis, the average value of predicted interference efficiency of the candidate siRNAs designed by the SMRI method is comparable to that of the mainstream siRNA design algorithms. Moreover, the SMRI method ensures that the designed siRNAs have more than three base mismatches with human genes, thus avoiding silencing normal human genes. This is not considered by other mainstream methods, thereby the five candidate high-efficiency siRNAs which are easy to take effect or synthesize and much safer for human body are obtained by our SMRI method, which provide a new safer, small dosage and long efficacy solution for the treatment of COVID-19. 
Supplementary Information: The online version contains supplementary material available at 10.1007/s11390-021-0826-x.
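
The abstract above says the SMRI method keeps only siRNAs with more than three base mismatches against human genes. The sketch below illustrates one simple way such a filter could work, using an ungapped sliding-window Hamming distance. It is not the SMRI implementation: the function names, the default threshold parameter, and the assumption that the siRNA and the transcripts are written on the same strand are all hypothetical.

```python
# Hypothetical off-target filter based on ungapped mismatch counting (not the SMRI code).


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def min_mismatches(sirna, transcript):
    """Smallest mismatch count between the siRNA and any window of the transcript."""
    n = len(sirna)
    if len(transcript) < n:
        # Assumption: a transcript shorter than the siRNA offers no full-length alignment.
        return n
    return min(
        hamming(sirna, transcript[i:i + n])
        for i in range(len(transcript) - n + 1)
    )


def is_off_target_safe(sirna, transcripts, threshold=3):
    """True if every transcript differs from the siRNA by more than `threshold` bases."""
    return all(min_mismatches(sirna, t) > threshold for t in transcripts)


if __name__ == "__main__":
    candidate = "AUGC"
    human_transcripts = ["CCCCUUUU"]
    print(min_mismatches(candidate, human_transcripts[0]))
    print(is_off_target_safe(candidate, human_transcripts))
```

A production pipeline would more likely use an alignment tool that handles gaps and strand orientation; the sliding window only illustrates the mismatch criterion itself.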

12.
Bioorg Chem ; 128: 106060, 2022 11.
Article in English | MEDLINE | ID: mdl-35926428

ABSTRACT

Fourteen phenolic constituents, notopheninetols A-E (1-5), notoflavinols A and B (6 and 7), and (2R)-5,4'-dihydroxy-7-O-[(E)-3,7-dimethyl-2,6-octadienyl]flavanone (8a), along with 12 known analogues (8b and 9-19) were isolated from the roots and rhizomes of Notopterygium incisum. Compounds 1-4 and 6-8 were seven pairs of enantiomers, and they were separated by chiral HPLC to obtain the optically pure compounds. The structures of the new compounds were elucidated based on detailed analyses of 1D and 2D NMR and HRESIMS data, and the absolute configurations were determined by quantum chemical calculations of the electronic circular dichroism (ECD) spectra, comparison of the experimental ECD data with those reported, and chemical methods. Compounds 1 and 2 possessed a 1-benzyl-2-methyl-indane skeleton, which was unprecedented in natural source. All of the isolated compounds were evaluated for their nitric oxide (NO) inhibitory effects on RAW264.7 cells induced by LPS, and compounds 6a/6b, 7a, 8a/8b, and the hydrogenated products 6'a and 7'a showed moderate inhibitory activities with IC50 values in the range of 6.2-20.6 µM. Moreover, the interactions of these bioactive compounds with inducible nitric oxide synthase (iNOS) were explored by employing molecular docking simulation.


Subjects
Apiaceae , Rhizome , Apiaceae/chemistry , Molecular Docking Simulation , Molecular Structure , Nitric Oxide/analysis , Plant Roots/chemistry , Rhizome/chemistry
13.
BMC Bioinformatics ; 23(1): 354, 2022 Aug 23.
Article in English | MEDLINE | ID: mdl-35999499

ABSTRACT

BACKGROUND: RNA secondary structure is very important for deciphering cell activity and disease occurrence. The first method used to predict this structure was biological experiment, but this method is too expensive to be widely adopted. Computational methods then emerged, which offer good efficiency and low cost. However, the accuracy of computational methods is not satisfactory. Many machine learning methods have also been applied to this area, but the accuracy has not improved significantly. Deep learning has matured and has achieved great success in many areas such as computer vision and natural language processing. It uses neural networks, a kind of structure with good functionality and versatility, but its effectiveness is highly correlated with the quantity and quality of the data. At present, there is no model with high accuracy, low data dependence and high convenience for predicting RNA secondary structure. RESULTS: This paper designs a neural network called LTPConstraint to predict RNA secondary structure. The network is based on several network structures such as Bidirectional LSTM, Transformer and generator. It also uses transfer learning to train the model so that the data dependence can be reduced. CONCLUSIONS: LTPConstraint has achieved high accuracy in RNA secondary structure prediction. Compared with previous methods, the accuracy improves markedly in predicting both structures with pseudoknots and structures without pseudoknots. At the same time, LTPConstraint is easy to operate and can produce results very quickly.


Subjects
Neural Networks, Computer , RNA , Machine Learning , Protein Structure, Secondary , RNA/chemistry
14.
BMC Infect Dis ; 22(1): 490, 2022 May 23.
Article in English | MEDLINE | ID: mdl-35606725

ABSTRACT

BACKGROUND: Tuberculosis (TB) is the respiratory infectious disease with the highest incidence in China. We aim to design a series of forecasting models and find the factors that affect the incidence of TB, thereby improving the accuracy of the incidence prediction. RESULTS: In this paper, we developed a new interpretable prediction system based on the multivariate multi-step Long Short-Term Memory (LSTM) model and SHapley Additive exPlanation (SHAP) method. Four accuracy measures are introduced into the system: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. The Autoregressive Integrated Moving Average (ARIMA) model and seasonal ARIMA model are established. The multi-step ARIMA-LSTM model is proposed for the first time to examine the performance of each model in the short, medium, and long term, respectively. Compared with the ARIMA model, each error of the multivariate 2-step LSTM model is reduced by 12.92%, 15.94%, 15.97%, and 14.81% in the short term. The 3-step ARIMA-LSTM model achieved excellent performance, with each error decreased to 15.19%, 33.14%, 36.79%, and 29.76% in the medium and long term. We provide the local and global explanation of the multivariate single-step LSTM model in the field of incidence prediction, pioneering. CONCLUSIONS: The multivariate 2-step LSTM model is suitable for short-term prediction and obtained a similar performance as previous studies. The 3-step ARIMA-LSTM model is appropriate for medium-to-long-term prediction and outperforms these models. The SHAP results indicate that the five most crucial features are maximum temperature, average relative humidity, local financial budget, monthly sunshine percentage, and sunshine hours.


Subjects
Tuberculosis , China/epidemiology , Forecasting , Humans , Incidence , Models, Statistical , Temperature , Tuberculosis/epidemiology
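
The abstract above names four accuracy measures: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. A minimal sketch of these measures under their textbook definitions follows. The function names are hypothetical, and the sMAPE variant shown (with the mean of absolute actual and predicted values in the denominator) is one of several in use, so it may differ from the one the authors applied.

```python
# Hypothetical sketch of four forecast error measures, using textbook definitions.
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE, in percent. Assumes no pair has actual and predicted both zero."""
    return 100.0 * sum(
        2 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted)
    ) / len(actual)


if __name__ == "__main__":
    observed = [100, 200]   # made-up monthly case counts, for illustration only
    forecast = [110, 180]
    for name, metric in (("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("sMAPE", smape)):
        print(name, round(metric(observed, forecast), 4))
```

RMSE and MAE are in the units of the series, while MAPE and sMAPE are scale-free, which is why forecasting studies often report all four together.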
15.
Methods ; 204: 368-375, 2022 08.
Article in English | MEDLINE | ID: mdl-35490852

ABSTRACT

Access to RNA secondary structure is a prerequisite for understanding and mastering RNA function. RNA secondary structures play an important role in cells; they can cause or contribute to neurological disorders and can be applied in the medical field. However, the experimental method to obtain RNA secondary structure is costly, laborious and not universal. Although computational methods can predict RNA secondary structure more accurately for short-sequence RNAs, they cannot predict long-sequence RNAs and pseudoknots, which is the bottleneck of RNA secondary structure prediction at present. In recent years, researchers have attempted to use deep learning algorithms to predict RNA secondary structure and have achieved results. However, the small amount of data on the secondary structure of long-sequence RNAs leads to the low accuracy of deep learning methods to predict the secondary structure of RNAs across races. Similarly, RNA structure with pseudoknots is very complex, and insufficient data causes deep learning algorithms to struggle to predict the secondary structure of RNA containing pseudoknots. The RNA data are encoded into grayscale images by a unique encoding method based on the real RNA secondary structure and sequence information. Then, this paper reasonably expands the image data to increase the amount of RNA data, solves the problem of insufficient data for predicting long sequences and RNA secondary structure with pseudoknots in current deep learning methods, and provides a good data foundation for deep learning. The article proposes a multi-scale feature fusion Conditional Deep Convolutional Generative Adversarial Network prediction model (MSFF-CDCGAN) based on the improved Conditional Deep Convolutional Generative Adversarial Network (CDCGAN) model to predict RNA secondary structure. The experimental results showed that the MSFF-CDCGAN model could predict long-sequence RNAs and pseudoknots more accurately than traditional prediction methods.
This paper introduces Generative Adversarial Network (GAN) to RNA secondary structure prediction for the first time. It uses a unique image encoding approach to expand the original RNA data set, thus transforming the structure prediction problem into an image analysis problem and effectively solving the bottleneck in RNA secondary structure prediction.


Subjects
Algorithms , RNA , Image Processing, Computer-Assisted , Protein Structure, Secondary , RNA/chemistry , RNA/genetics , Sequence Analysis, RNA/methods
16.
Front Plant Sci ; 13: 860791, 2022.
Article in English | MEDLINE | ID: mdl-35463453

ABSTRACT

Although growing evidence shows that microRNA (miRNA) regulates plant growth and development, miRNA regulatory networks in plants are not well understood. Current experimental studies cannot characterize miRNA regulatory networks on a large scale. This information gap provides an excellent opportunity to employ computational methods for global analysis and generate valuable models and hypotheses. To address this opportunity, we collected miRNA-target interactions (MTIs) and used MTIs from Arabidopsis thaliana and Medicago truncatula to predict homologous MTIs in soybeans, resulting in 80,235 soybean MTIs in total. A multi-level iterative bi-clustering method was developed to identify 483 soybean miRNA-target regulatory modules (MTRMs). Furthermore, we collected soybean miRNA expression data and corresponding gene expression data in response to abiotic stresses. By clustering these data, 37 MTRMs related to abiotic stresses were identified, including stress-specific MTRMs and shared MTRMs. These MTRMs have gene ontology (GO) enrichment in resistance response, iron transport, positive growth regulation, etc. Our study predicts soybean MTRMs and miRNA-GO networks under different stresses, and provides miRNA targeting hypotheses for experimental analyses. The method can be applied to other biological processes and other plants to elucidate miRNA co-regulation mechanisms.

17.
Front Genet ; 13: 800853, 2022.
Article in English | MEDLINE | ID: mdl-35368657

ABSTRACT

Lung cancer is the leading cause of cancer deaths. Therefore, predicting the survival status of lung cancer patients is of great value. However, the existing methods mainly depend on statistical machine learning (ML) algorithms, which are not appropriate for high-dimensional genomics data, whereas deep learning (DL), with its strong capability for learning from high-dimensional data, can be used to predict lung cancer survival using genomics data. The Cancer Genome Atlas (TCGA) is a database that contains many kinds of genomics data for 33 cancer types. With this enormous amount of data, researchers can analyze key factors related to cancer therapy. This paper proposes a novel method to predict lung cancer long-term survival using gene expression data from TCGA. Firstly, we select the genes most relevant to the target problem using a supervised feature selection method called mutual information selector. Secondly, we propose a method to convert gene expression data into two kinds of images with KEGG BRITE and KEGG Pathway data incorporated, so that we can make good use of a convolutional neural network (CNN) model to learn high-level features. Afterwards, we design a CNN-based DL model and add two kinds of clinical data to improve the performance, yielding a multimodal DL model. The results of the generalized experiments indicate that our method performs much better than the ML models and unimodal DL models. Furthermore, we conduct survival analysis and observe that our model can better divide the samples into high-risk and low-risk groups.

19.
ACS Synth Biol ; 10(12): 3583-3594, 2021 12 17.
Article in English | MEDLINE | ID: mdl-34846134

ABSTRACT

The diversity expansion of testosterone 17-O-β-glycosides (TGs) will increase the probability of screening more active molecules from their acetylated derivatives with anticancer activities. Glycosyltransferases (GTs) responsible for the increased diversity of TGs, however, have seldom been documented. Herein, a glycosyltransferase OsSGT2 with testosterone glycodiversification capacity was identified from Ornithogalum saundersiae through transcriptome-wide mining. Specifically, OsSGT2 was demonstrated to be reactive with testosterone and eight donors. OsSGT2 displayed both sugar-aglycon and sugar-sugar GT activities. OsSGT2-catalyzed testosterone glycodiversification could be achieved, generating testosterone monoglycosides and disglycosides with varied percentage conversions. Among the eight donors, the conversion of UDP-Glc was the highest, approaching 90%, while the percentage conversions of UDP-GlcNAc, UDP-Gal, helicin, and UDP-Rha were less than 10%. Protein engineering toward F395 was thus performed to improve the conversion of UDP-GlcNAc. Eight variants displayed increased conversions, and the mutant F395C achieved the highest conversion of 72.11 ± 7.82%, eight times more than that of the wild-type. This study provides a promising alternative for diversity expansion of TGs, as well as significant insights into the molecular basis for the conversion improvement of sugar donors.


Subjects
Ornithogalum , Glycosides/metabolism , Glycosyltransferases/genetics , Glycosyltransferases/metabolism , Ornithogalum/genetics , Ornithogalum/metabolism , Protein Engineering , Testosterone
20.
BMC Bioinformatics ; 22(1): 447, 2021 Sep 20.
Article in English | MEDLINE | ID: mdl-34544356

ABSTRACT

BACKGROUND: Studies have proven that non-coding RNAs (ncRNAs) of the same family have similar functions, so predicting ncRNA families is helpful for research on ncRNA functions. The existing computational methods mainly fall into two categories: the first type predicts ncRNA families by learning the features of sequence or secondary structure, and the other type predicts ncRNA families by alignment among homologous sequences. In the first type, some methods predict ncRNA families by learning predicted secondary structure features. The inaccuracy of the predicted secondary structure may cause the low accuracy of those methods. In contrast, ncRFP directly learns the features of ncRNA sequences to predict ncRNA families. Although ncRFP simplifies the prediction process and improves the performance, there is room for improvement in ncRFP performance due to the incomplete features of its input data. In the second type, the homologous sequence alignment method can achieve the highest performance at present. However, because it requires consensus secondary structure annotation of ncRNA sequences and cannot model pseudoknots, the use of the method is limited. RESULTS: In this paper, a novel method, "ncDLRES", which learns sequence features, is proposed to predict the family of ncRNAs based on Dynamic LSTM (Long Short-Term Memory) and ResNet (Residual Neural Network). CONCLUSIONS: ncDLRES extracts the features of ncRNA sequences based on Dynamic LSTM and then classifies them by ResNet. Compared with the homologous sequence alignment method, ncDLRES reduces the data requirement and expands the application scope. Compared with the first type of methods, the performance of ncDLRES is greatly improved.


Subjects
Computational Biology , RNA, Untranslated , Neural Networks, Computer , Nucleic Acid Conformation , RNA, Untranslated/genetics , Sequence Alignment