1.
Article in English | MEDLINE | ID: mdl-37831572

ABSTRACT

As a highly contagious disease, COVID-19 has not only had a great impact on the life, study and work of hundreds of millions of people around the world, but has also had a huge impact on the global health care system. Therefore, any technical tool that allows for rapid screening and high-precision diagnosis of COVID-19 infections can be of vital help. To reduce the burden on the health care system, computer-aided diagnosis of COVID-19 has become a research hotspot. X-ray imaging is a common and low-cost tool that can help with COVID-19 diagnosis. The data used for this study comprise 15,153 CXR images, containing 10,192 normal lungs, 3,631 COVID-19 positive cases and 1,345 images of viral pneumonia. For this computer-aided task, we propose the dual-ended multiple attention learning model (DMAL). The model incorporates multiple attention learning into both networks, and the two networks are linked using an integration module. Specifically, in both networks, the backbone network is used to extract global features and the branch network captures local area information; the integration module combines multi-stage features; and the attention module, containing element, channel and spatial attention, prompts the model to focus on multi-scale information relevant to the disease. We evaluate the proposed DMAL network against relevant competitive methods as well as ten advanced deep learning models in the image domain and obtain the best performance, with 99.67%, 99.53%, 99.66%, 99.60% and 99.76% in terms of Accuracy, Precision, Sensitivity, F1 Score and Specificity, respectively. The proposed method will help in the rapid screening and high-precision diagnosis of COVID-19, given the general trend of such severe global infections. Our code and model are available at https://github.com/Graziagh/DMALNet.
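The five metrics named in this abstract all derive from confusion-matrix counts. The following is a minimal sketch for the binary (one-vs-rest) case, not the authors' evaluation code; the function name `binary_metrics` and the counts are hypothetical and chosen only for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the five metrics from confusion-matrix counts (binary case)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)      # also called recall or true-positive rate
    specificity = tn / (tn + fp)      # true-negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, not taken from the paper.
m = binary_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

For a three-class problem such as normal / COVID-19 / viral pneumonia, one common convention is to compute these per class in a one-vs-rest fashion and then average them; the abstract does not say which averaging was used.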

2.
PLoS One ; 18(9): e0291961, 2023.
Article in English | MEDLINE | ID: mdl-37733828

ABSTRACT

Coronaviruses have affected the lives of people around the world. Increasingly, studies have indicated that the virus is mutating and becoming more contagious. Hence, the pressing priority is to swiftly and accurately predict patient outcomes. In addition, physicians and patients increasingly need interpretability when building machine learning models in healthcare. We propose an interpretable machine learning framework (KISM) that can diagnose and prognose patients based on blood test datasets. First, we use k-nearest neighbors, isolation forests, and SMOTE to pre-process the original blood test datasets. Seven machine learning tools (Support Vector Machine, Extra Tree, Random Forest, Gradient Boosting Decision Tree, eXtreme Gradient Boosting, Logistic Regression, and ensemble learning) were then used to diagnose and predict COVID-19. In addition, we used SHAP and scikit-learn post-hoc interpretability to report feature importance, allowing healthcare professionals and artificial intelligence models to interact and suggest biomarkers that some doctors may have missed. The 10-fold cross-validation on two public datasets shows that the performance of KISM is better than that of the current state-of-the-art methods. In the COVID-19 diagnostic task, an AUC value of 0.9869 and an accuracy of 0.9787 were obtained, and Leukocytes, platelets, and Proteina C reativa mg/dL (C-reactive protein) were found to be the most indicative biomarkers for the diagnosis of COVID-19. In the COVID-19 prognostic task, an AUC value of 0.9949 and an accuracy of 0.9677 were obtained, and Age, LYMPH, and WBC were found to be the most indicative biomarkers for identifying the severity of the patient's condition.
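Of the three pre-processing steps mentioned, SMOTE is the one that creates new data: it oversamples the minority class by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates only that core idea in plain Python. It is not the KISM implementation (which presumably relies on an existing library), and the name `smote_sketch`, the parameter defaults, and the sample points are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    Assumes `minority` holds at least two points of equal dimension.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (Euclidean distance), excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nn = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic


# Hypothetical two-feature minority-class samples.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=5)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, it stays inside the region the minority class already occupies, which is why the technique is usually applied to the training folds only.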


Subject(s)
COVID-19 , Humans , COVID-19/diagnosis , Artificial Intelligence , Prognosis , Machine Learning , Blood Platelets , COVID-19 Testing
3.
BMC Bioinformatics ; 24(1): 333, 2023 Sep 06.
Article in English | MEDLINE | ID: mdl-37674125

ABSTRACT

BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to identify hepatitis C patients. Although existing hepatitis prediction models have achieved good results in terms of accuracy, most of them are black-box models and cannot gain the trust of doctors and patients in clinical practice. As a result, this study aims to use various Machine Learning (ML) models to predict whether a patient has hepatitis C, while also using explainable models to elucidate the prediction process of the ML models, thus making the prediction process more transparent. RESULTS: We conducted a study on the prediction of hepatitis C based on serological testing and provided comprehensive explanations for the prediction process. Throughout the experiment, we modeled the benchmark dataset and evaluated model performance using fivefold cross-validation and independent testing experiments. After evaluating three types of black-box machine learning models, Random Forest (RF), Support Vector Machine (SVM), and AdaBoost, we adopted Bayesian-optimized RF as the classification algorithm. In terms of model interpretation, in addition to using the common SHapley Additive exPlanations (SHAP) to provide global explanations for the model, we also utilized Local Interpretable Model-Agnostic Explanations with stability (LIME_stability) to provide local explanations for the model. CONCLUSION: Both the fivefold cross-validation and independent testing show that our proposed method significantly outperforms the state-of-the-art method. IHCP maintains excellent model interpretability while obtaining excellent predictive performance. This helps uncover potential predictive patterns of the model and enables clinicians to better understand the model's decision-making process.


Subject(s)
Hepatitis C , Humans , Bayes Theorem , Hepatitis C/diagnosis , Hepacivirus , Machine Learning
4.
BMC Bioinformatics ; 24(1): 261, 2023 Jun 22.
Article in English | MEDLINE | ID: mdl-37349705

ABSTRACT

BACKGROUND: Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders characterized by difficulty communicating with society and others, behavioral difficulties, and a brain that processes information differently than normal. Genetics has a strong impact on ASD associated with early onset and distinctive signs. Currently, all known ASD risk genes are able to encode proteins, and some de novo mutations disrupting protein-coding genes have been demonstrated to cause ASD. Next-generation sequencing technology enables high-throughput identification of ASD risk RNAs. However, these efforts are time-consuming and expensive, so an efficient computational model for ASD risk gene prediction is necessary. RESULTS: In this study, we propose DeepASDPred, a predictor for ASD risk RNA based on deep learning. First, we use K-mer to encode the RNA transcript sequences as features, and then fuse them with the corresponding gene expression values to construct a feature matrix. After combining the chi-square test and logistic regression to select the best feature subset, we input it into a binary classification prediction model built from a convolutional neural network and long short-term memory for training and classification. The results of tenfold cross-validation showed that our method outperformed the state-of-the-art methods. The dataset and source code are freely available at https://github.com/Onebear-X/DeepASDPred. CONCLUSIONS: Our experimental results show that DeepASDPred has outstanding performance in identifying ASD risk RNA genes.
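K-mer encoding, as used in the first step of this abstract, turns a variable-length sequence into a fixed-length vector of k-mer frequencies. The following is a minimal sketch of that general technique, assuming normalised frequencies over the four-letter RNA alphabet; the function name `kmer_frequencies`, the choice of k, and the example sequence are hypothetical, and the exact encoding used by DeepASDPred may differ.

```python
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGU"):
    """Encode a sequence as normalised frequencies of all len(alphabet)**k k-mers."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


# Hypothetical transcript fragment; k=2 gives a 16-dimensional vector.
vec = kmer_frequencies("AUGGCAUGC", k=2)
print(len(vec), vec)
```

Fusing with gene expression, as the abstract describes, could then be as simple as appending the expression values to this vector before feature selection, though the abstract does not specify the fusion scheme.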


Subject(s)
Autism Spectrum Disorder , Deep Learning , Humans , Autism Spectrum Disorder/genetics , RNA/genetics , Neural Networks, Computer , Software
5.
Pest Manag Sci ; 79(5): 1922-1930, 2023 May.
Article in English | MEDLINE | ID: mdl-36658467

ABSTRACT

BACKGROUND: Succinate dehydrogenase inhibitor (SDHI) fungicides are an important class of agricultural fungicides with the advantages of high efficiency and a broad bactericidal spectrum. To pursue novel SDHIs, a series of N-substituted dithiin tetracarboximide derivatives were designed, synthesized, and characterized by 1H NMR, 13C NMR, and high-resolution mass spectrometry (HRMS). RESULTS: These engineered compounds displayed potent fungicidal activity against phytopathogens, including Sclerotinia sclerotiorum, Botrytis cinerea, and Rhizoctonia solani, comparable with that of the commercial SDHI fungicide boscalid. In particular, compound 18 stood out with prominent activity against S. sclerotiorum, with a half-maximal effective concentration (EC50) value of 1.37 µg ml-1. Compound 1 exhibited the most potent antifungal activity against B. cinerea, with an EC50 value of 5.02 µg ml-1. As for R. solani, compounds 12 and 13 exhibited remarkable inhibitory activity, with EC50 values of 4.26 and 5.76 µg ml-1, respectively. In the succinate dehydrogenase (SDH) inhibition assay, compound 13 presented significant inhibitory activity with a half-maximal inhibitory concentration (IC50) value of 15.3 µM, which was approximately equivalent to that of boscalid (14.2 µM). Furthermore, molecular docking studies revealed that compound 13 could anchor in the binding site of SDH. CONCLUSION: Taken together, the results suggest that the dithiin tetracarboximide scaffold has great potential for development into novel fungicides and SDHIs. © 2023 Society of Chemical Industry.


Subject(s)
Antifungal Agents , Fungicides, Industrial , Antifungal Agents/chemistry , Fungicides, Industrial/chemistry , Structure-Activity Relationship , Molecular Docking Simulation , Succinate Dehydrogenase
6.
Article in English | MEDLINE | ID: mdl-35536814

ABSTRACT

N6-methyladenosine (m6A) is a universal post-transcriptional modification of RNAs, and it is widely involved in various biological processes. Identifying m6A modification sites accurately is indispensable for further investigating m6A-mediated biological functions. How to better represent RNA sequences is crucial for building effective computational methods for detecting m6A modification sites. However, traditional encoding methods require complex biological prior knowledge and are time-consuming. Furthermore, most of the existing m6A site prediction methods are limited to a single species, and few methods are able to predict m6A sites across different species and tissues. Thus, it is necessary to design a more efficient computational method to predict m6A sites across multiple species and tissues. In this paper, we propose ELMo4m6A, a contextual language embedding-based method for predicting m6A sites from RNA sequences without any prior knowledge. ELMo4m6A first learns embeddings of RNA sequences using the language model ELMo, then uses a hybrid convolutional neural network (CNN) and long short-term memory (LSTM) to identify m6A sites. The results of 5-fold cross-validation and independent testing demonstrate that ELMo4m6A is superior to state-of-the-art methods. Moreover, we applied integrated gradients to find potential sequence patterns contributing to m6A sites.


Subject(s)
Adenosine , RNA , RNA/genetics , Adenosine/genetics , Neural Networks, Computer , Sequence Analysis, RNA/methods
7.
BMC Bioinformatics ; 23(1): 272, 2022 Jul 11.
Article in English | MEDLINE | ID: mdl-35820811

ABSTRACT

BACKGROUND: Understanding the regulatory role of enhancer-promoter interactions (EPIs) on specific gene expression in cells contributes to the understanding of gene regulation, cell differentiation, etc., and identifying EPIs has been a challenging task. On the one hand, using traditional wet-lab experimental methods to identify EPIs often entails substantial human labor and time costs. On the other hand, although the currently proposed computational methods recognize EPIs well, they generally require a long training time. RESULTS: In this study, we studied the EPIs of six human cell lines and designed StackEPI, a cell line-specific EPI prediction method based on a stacking ensemble learning strategy, which has better prediction performance and faster training speed. Specifically, by combining different encoding schemes and machine learning methods, our prediction method can comprehensively extract the cell line-specific effective information of enhancer and promoter gene sequences from multiple perspectives, and accurately recognize cell line-specific EPIs. The source code to implement StackEPI and the experimental data involved in the experiment are available at https://github.com/20032303092/StackEPI.git . CONCLUSIONS: The comparison results show that our model delivers better performance on the problem of identifying cell line-specific EPIs and outperforms other state-of-the-art models. In addition, our model is also more computationally efficient.


Subject(s)
Cell Communication , Regulatory Sequences, Nucleic Acid , Cell Line , Humans , Machine Learning , Promoter Regions, Genetic
8.
Int J Health Plann Manage ; 37(1): 242-257, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34536240

ABSTRACT

This study investigates the nexus between tourism, CO2 emissions and health spending in Mexico. We applied a nonlinear ARDL approach for the empirical analysis over the period 1996-2018. Mexico receives a large number of tourists each year; tourism improves foreign exchange earnings and contributes positively to economic growth. However, tourist activities impose a serious environmental cost in terms of CO2 emissions, which increase health spending. The empirical findings suggest that tourism leads to CO2 emissions, which in turn cause a high level of health spending in Mexico. Both short-run and long-run findings show a significant positive association between tourism, CO2 emissions, and health expenditures. Therefore, the government needs legislation to reduce CO2 emissions; in addition, the use of renewable energy could help to reduce CO2 emissions and health expenditures in society. This study does not support reducing health expenditure; rather, it suggests optimal utilization of the funds allocated to the health sector.


Subject(s)
Carbon Dioxide , Tourism , Carbon Dioxide/analysis , Economic Development , Mexico , Renewable Energy
9.
Brief Bioinform ; 23(1), 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34486019

ABSTRACT

Long noncoding RNAs (lncRNAs) play important roles in various biological regulatory processes, and are closely related to the occurrence and development of diseases. Identifying lncRNA-disease associations is valuable for revealing the molecular mechanism of diseases and exploring treatment strategies. Thus, it is necessary to computationally predict lncRNA-disease associations as a complementary method for biological experiments. In this study, we proposed a novel prediction method GCRFLDA based on the graph convolutional matrix completion. GCRFLDA first constructed a graph using the available lncRNA-disease association information. Then, it constructed an encoder consisting of conditional random field and attention mechanism to learn efficient embeddings of nodes, and a decoder layer to score lncRNA-disease associations. In GCRFLDA, the Gaussian interaction profile kernels similarity and cosine similarity were fused as side information of lncRNA and disease nodes. Experimental results on four benchmark datasets show that GCRFLDA is superior to other existing methods. Moreover, we conducted case studies on four diseases and observed that 70 of 80 predicted associated lncRNAs were confirmed by the literature.
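The two similarity measures named in this abstract can both be computed directly from a binary association matrix. The sketch below uses the commonly cited formulation of the Gaussian interaction profile (GIP) kernel, K(i, j) = exp(-γ ||p_i - p_j||²) with γ set to γ' divided by the mean squared norm of the profiles. It assumes γ' = 1 and a toy matrix, and is not taken from the GCRFLDA code; the function names and the matrix `A` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel matrix over the rows of `profiles`.

    Assumes at least one non-zero entry, so the bandwidth is well defined.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q))) for q in profiles]
        for p in profiles
    ]


# Hypothetical association matrix: rows are lncRNAs, columns are diseases.
A = [[1, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]
K = gip_kernel(A)
print(K[0][1], cosine_similarity(A[0], A[1]))
```

Transposing `A` and calling the same functions gives the disease-side similarities. How the two measures are fused into side information is not specified in the abstract, so no fusion step is shown.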


Subject(s)
RNA, Long Noncoding , Algorithms , Computational Biology/methods , RNA, Long Noncoding/genetics , Research Design
10.
BMC Bioinformatics ; 22(1): 516, 2021 Oct 23.
Article in English | MEDLINE | ID: mdl-34688247

ABSTRACT

BACKGROUND: The origin is the starting site of DNA replication, an extremely vital part of the informational inheritance between parents and children. More importantly, accurately identifying the origin of replication has great application value in the diagnosis and treatment of diseases related to genetic information errors, while the traditional biological experimental methods are time-consuming and laborious. RESULTS: We carried out research on the origin of replication in a variety of eukaryotes and proposed a unique prediction method for each species. Throughout the experiment, we collected data from 7 species, including Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana, Kluyveromyces lactis, Pichia pastoris and Schizosaccharomyces pombe. In addition to the commonly used sequence feature extraction methods PseKNC-II and Base-content, we designed a feature extraction method based on TF-IDF. Then the two-step method was utilized for feature selection. After comparing a variety of traditional machine learning classification models, the multi-layer perceptron was employed as the classification algorithm. The data and code involved in the experiment are available at https://github.com/Sarahyouzi/EukOriginPredict . CONCLUSIONS: The prediction accuracies on the training sets of the above-mentioned seven species after 100 rounds of fivefold cross-validation reached 92.60%, 90.80%, 91.22%, 96.15%, 96.72%, 99.86% and 96.72%, respectively. This indicates that, compared with other methods, the methods we designed achieve superior performance. In addition, our experiments reveal that the models of multiple species can predict each other with high accuracy, and the results of STREME show that they share a certain common motif.
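TF-IDF can be applied to DNA by treating each sequence as a document and its overlapping k-mers as terms. The sketch below shows one textbook variant (term frequency normalised by document length, inverse document frequency as log(N / df)). It is an assumption about how such a feature could be built, not the authors' exact formulation, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(sequences, k=2):
    """Return (vocabulary, matrix) of TF-IDF weights with one row per sequence."""
    docs = [Counter(kmers(s, k)) for s in sequences]
    n_docs = len(docs)
    df = Counter()  # number of sequences in which each k-mer occurs
    for d in docs:
        df.update(d.keys())
    vocab = sorted(df)
    rows = []
    for d in docs:
        total = sum(d.values())
        rows.append([
            (d[t] / total) * math.log(n_docs / df[t]) if total else 0.0
            for t in vocab
        ])
    return vocab, rows


# Hypothetical toy sequences.
vocab, rows = tfidf(["ACGT", "ACGA", "TTTT"], k=2)
print(vocab)
print(rows)
```

K-mers that occur in every sequence receive a weight of zero under this variant, so the resulting features emphasise k-mers that distinguish one sequence from the rest.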


Subject(s)
Drosophila melanogaster , Eukaryota , Animals , Drosophila melanogaster/genetics , Kluyveromyces , Mice , Neural Networks, Computer , Saccharomycetales
11.
Pest Manag Sci ; 77(11): 5109-5119, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34240541

ABSTRACT

BACKGROUND: The worldwide reduction in food production due to pests and diseases remains an important challenge today. Validoxylamine A (VAA) is a natural polyhydroxyl compound derived from validamycin, acting as an efficient trehalase inhibitor with insecticidal and antifungal activities. To extend its application and discover green pesticides, a series of ester derivatives were prepared with VAA as the lead compound. Their biological activities were investigated against three typical agricultural pathogens and pests: Rhizoctonia solani, Sclerotinia sclerotiorum and Aphis craccivora. RESULTS: This study involved 30 novel validoxylamine A fatty acid esters (VAFAEs) synthesized using Novozym 435, which were characterized by high-resolution electrospray ionization mass spectrometry (HR-ESI-MS) and proton nuclear magnetic resonance (1H-NMR). Of these 30 derivatives, most compounds showed improved antifungal activity, and 12 novel compounds showed improved insecticidal activity. Compound 14, obtained by reaction with pentadecanoic acid, showed the highest inhibitory activity against R. solani [median effective concentration (EC50) 0.01 µmol L-1], while the EC50 value of VAA was 34.99 µmol L-1. Furthermore, 21 novel VAFAEs showed higher inhibitory activity against S. sclerotiorum. Validoxylamine A oleic acid ester, compound 21, exhibited the highest insecticidal activity against A. craccivora [median lethal concentration (LC50) 39.63 µmol L-1], while the LC50 value of pymetrozine, a commercialized pesticide against A. craccivora, was 50.45 µmol L-1. CONCLUSION: Taken together, our results show that esterification of VAA with different acyl donors is beneficial for the development of new eco-friendly agents in the field of pesticides.


Subject(s)
Esters , Ascomycota , Inositol/analogs & derivatives , Rhizoctonia , Structure-Activity Relationship
12.
BMC Bioinformatics ; 22(1): 14, 2021 Jan 07.
Article in English | MEDLINE | ID: mdl-33413088

ABSTRACT

BACKGROUND: With the development of deep learning (DL), more and more DL-based methods have been proposed and achieve state-of-the-art performance in biomedical image segmentation. However, these methods are usually complex and require the support of powerful computing resources. In practice, it is unrealistic to rely on huge computing resources in clinical settings. Thus, it is important to develop accurate DL-based biomedical image segmentation methods that can run on resource-constrained computing. RESULTS: A lightweight and multiscale network called PyConvU-Net is proposed to potentially work with low-resource computing. In strictly controlled experiments, PyConvU-Net performs well on three biomedical image segmentation tasks with the fewest parameters. CONCLUSIONS: Our experimental results preliminarily demonstrate the potential of the proposed PyConvU-Net in biomedical image segmentation with resource-constrained computing.


Subject(s)
Deep Learning , Image Interpretation, Computer-Assisted , Software
13.
Article in English | MEDLINE | ID: mdl-32850711

ABSTRACT

The many microbes in the human body play a vital role in cell physiology. In recent years, accumulating evidence has indicated that microbes are closely related to many complex human diseases. In-depth investigation of disease-associated microbes can contribute to understanding the pathogenesis of diseases and thus provide novel strategies for the treatment, diagnosis, and prevention of diseases. To date, many computational models have been proposed for predicting microbe-disease associations using available similarity networks. However, these similarity networks are not effectively fused. In this study, we proposed a novel computational model based on multi-data integration and network consistency projection for Human Microbe-Disease Associations Prediction (HMDA-Pred), which fuses multiple similarity networks by a linear network fusion method. HMDA-Pred yielded AUC values of 0.9589 and 0.9361 ± 0.0037 in leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. Furthermore, in case studies, 10, 8, and 10 out of the top 10 predicted microbes for asthma, colon cancer, and inflammatory bowel disease, respectively, were confirmed by the literature.

14.
PLoS One ; 15(5): e0228479, 2020.
Article in English | MEDLINE | ID: mdl-32413030

ABSTRACT

A terminator is a DNA sequence that gives the RNA polymerase the transcriptional termination signal. Identifying terminators correctly can improve genome annotation; more importantly, it has considerable application value in disease diagnosis and therapies. However, accurate prediction methods are lacking and urgently needed. Therefore, we proposed a prediction method for terminators, "iterb-PPse", which incorporates 47 nucleotide properties into PseKNC-Ⅰ and PseKNC-Ⅱ and utilizes Extreme Gradient Boosting to predict terminators based on Escherichia coli and Bacillus subtilis. In combination with the preceding methods, we employed three new feature extraction methods, K-pwm, Base-content and Nucleotidepro, to formulate raw samples. The two-step method was applied to select features. When identifying terminators based on optimized features, we compared five single models as well as 16 ensemble models. As a result, the accuracy of our method on the benchmark dataset reached 99.88%, higher than that of the existing state-of-the-art predictor iTerm-PseKNC, in 100 rounds of five-fold cross-validation. Its prediction accuracy on two independent datasets reached 94.24% and 99.45%, respectively. For the convenience of users, we developed a software tool based on "iterb-PPse" with the same name. The open software and source code of "iterb-PPse" are available at https://github.com/Sarahyouzi/iterb-PPse.


Subject(s)
Sequence Analysis, DNA/methods , Software , Terminator Regions, Genetic , Bacillus subtilis , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Escherichia coli , RNA, Bacterial/chemistry , RNA, Bacterial/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Rho Factor/metabolism , Transcription Termination, Genetic
15.
RSC Adv ; 10(20): 11634-11642, 2020 Mar 19.
Article in English | MEDLINE | ID: mdl-35496629

ABSTRACT

LncRNA and miRNA are two non-coding RNA types that are popular in current research. LncRNA interacts with miRNA to regulate gene transcription, further affecting human health and disease. Accurate identification of lncRNA-miRNA interactions contributes to the in-depth study of the biological functions and mechanisms of non-coding RNA. However, relying on biological experiments to obtain interaction information is time-consuming and expensive. Considering the rapid accumulation of gene information and the scarcity of computational methods, effective computational models for predicting lncRNA-miRNA interactions are urgently needed. In this work, we propose a heterogeneous graph inference method based on similarity network fusion (SNFHGILMI) to predict potential lncRNA-miRNA interactions. First, we calculated multiple similarity data, including lncRNA sequence similarity, miRNA sequence similarity, lncRNA Gaussian kernel similarity, and miRNA Gaussian kernel similarity. Second, the similarity network fusion method was employed to integrate the data and obtain the similarity networks of lncRNA and miRNA. Then, we constructed a bipartite network by combining the known interaction network with the similarity networks of lncRNA and miRNA. Finally, the heterogeneous graph inference method was introduced to construct a prediction model. On the real dataset, SNFHGILMI achieved AUCs of 0.9501 and 0.9426 ± 0.0035 based on LOOCV and 5-fold cross validation, respectively. Furthermore, case studies also demonstrate that SNFHGILMI is a high-performance prediction method that can accurately predict new lncRNA-miRNA interactions. The Matlab code and readme file of SNFHGILMI can be downloaded from https://github.com/cj-DaSE/SNFHGILMI.

16.
AMB Express ; 9(1): 94, 2019 Jun 28.
Article in English | MEDLINE | ID: mdl-31254161

ABSTRACT

α-Arbutin is an effective skin-whitening cosmetic ingredient and hyperpigmentation therapy agent. It can be synthesized by one-step enzymatic glycosylation of hydroquinone (HQ), but this route is limited by low yield. Amylosucrase (Amy-1) from Xanthomonas campestris pv. campestris 8004 was recently identified as having high HQ glycosylation activity. In this study, whole-cell transformation by Amy-1 was optimized and process scale-up was evaluated in a 5000-L reactor. In comparison with purified Amy-1, the whole-cell catalyst of recombinant E. coli displays better tolerance to inhibitors (oxidized products of HQ) and requires a lower molar ratio of sucrose to HQ to reach a high conversion rate (> 99%). Excess accumulation of glucose (0.6-1.0 M) derived from sucrose hydrolysis reduces the HQ glycosylation rate by 46-60%, which suggests the importance of balancing the HQ glycosylation rate and the sucrose hydrolysis rate by adjusting the activity of the whole-cell catalyst and the HQ feed rate. Under optimal conditions, a final concentration of 540 mM and a molar conversion rate of 95% were obtained within 13-18 h at laboratory scale. For industrial scale-up production, final concentrations of 398 mM and 375 mM with high conversion rates (~ 95%) were obtained in reaction volumes of 3500 L and 4000 L, respectively. To the best of our knowledge, these yields and productivities (4.5-4.9 kg kL-1 h-1) are the highest reported. Hence, high-yield production of α-arbutin by batch-feeding whole-cell biotransformation was successfully achieved at the 5000-L reaction scale.

17.
J Ind Microbiol Biotechnol ; 46(6): 759-767, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30820723

ABSTRACT

α-Arbutin is an effective skin-whitening cosmetic ingredient and can be synthesized through hydroquinone glycosylation. In this study, amylosucrase (Amy-1) from Xanthomonas campestris pv. campestris 8004 was newly identified as a sucrose-utilizing hydroquinone-glycosylating enzyme. Its kinetic parameters showed a seven-fold higher affinity for hydroquinone (HQ) than maltose-utilizing α-glycosidase. The glycosylation of HQ can be quickly achieved with over 99% conversion when a high molar ratio of glycoside donor to acceptor (80:1) is used. A batch-feeding catalysis method was designed to eliminate HQ inhibition while maintaining high productivity (> 36.4 mM h-1). In addition, to eliminate the serious inhibition caused by accumulated hydroquinone oxidation products, whole-cell catalysis was further proposed. A final α-arbutin concentration of 306 mM was achieved with a 95% molar conversion rate within 15 h. Hence, batch-feeding whole-cell biocatalysis by Amy-1 is a promising technology for α-arbutin production with enhanced yield and molar conversion rate.


Subject(s)
Arbutin/biosynthesis , Glucosyltransferases/metabolism , Hydroquinones/metabolism , Xanthomonas campestris/metabolism , Biocatalysis , Cosmetics , Glycosylation , Oxidation-Reduction
18.
Methods Mol Biol ; 1915: 111-120, 2019.
Article in English | MEDLINE | ID: mdl-30617800

ABSTRACT

Calpains are a family of Ca2+-dependent cysteine proteases involved in many important biological processes, where they selectively cleave relevant substrates at specific cleavage sites to regulate the function of the substrate proteins. Presently, our knowledge about the function of calpains and the mechanism of substrate cleavage is still limited because the experimental determination and validation of calpain binding are usually laborious and expensive. This chapter describes LabCaS, an algorithm designed for predicting calpain substrate cleavage sites from amino acid sequences. LabCaS is built on a conditional random field (CRF) statistical model, which trains the cleavage site prediction on multiple features: amino acid residue preference, solvent accessibility information, pair-wise alignment similarity score, secondary structure propensity, and physicochemical properties. Large-scale benchmark tests have shown that LabCaS can achieve reliable recognition of the cleavage sites for most calpain proteins, with an average AUC score of 0.862. Owing to its speed and ease of use, the protocol should prove useful in large-scale calpain-based function annotation of newly sequenced proteins. The online web server of LabCaS is freely available at http://www.csbio.sjtu.edu.cn/bioinf/LabCaS .


Subject(s)
Amino Acid Sequence/genetics , Calpain/chemistry , Models, Statistical , Molecular Biology/methods , Algorithms , Binding Sites , Calpain/genetics , Proteolysis , Substrate Specificity
19.
Molecules ; 23(11), 2018 Nov 05.
Article in English | MEDLINE | ID: mdl-30400596

ABSTRACT

In the present study, 45 maleimides were synthesized and evaluated for anti-leishmanial activity against L. donovani in vitro and for cytotoxicity toward THP1 cells. All compounds exhibited obvious anti-leishmanial activity. Among the tested compounds, 10 maleimides had anti-leishmanial activity superior to that of the standard drug amphotericin B, and 32 maleimides had anti-leishmanial activity superior to that of the standard drug pentamidine. In particular, compounds 16 (IC50 < 0.0128 µg/mL) and 42 (IC50 < 0.0128 µg/mL) showed extraordinary efficacy in an in vitro test and low cytotoxicity (CC50 > 10 µg/mL). The anti-leishmanial activities of 16 and 42 were 10 times better than that of amphotericin B. The structure-activity relationship (SAR) studies revealed that 3,4-non-substituted maleimides displayed the strongest anti-leishmanial activity compared to 3-methyl-maleimides and 3,4-dichloro-maleimides, while 3,4-dichloro-maleimides were the least cytotoxic compared to 3-methyl-maleimides and 3,4-non-substituted maleimides. The results show that several of the reported compounds are promising leads for anti-leishmanial drug development.


Subject(s)
Antiprotozoal Agents/pharmacology , Leishmania/drug effects , Maleimides/pharmacology , Antiprotozoal Agents/chemical synthesis , Antiprotozoal Agents/chemistry , Dose-Response Relationship, Drug , Leishmania donovani/drug effects , Maleimides/chemical synthesis , Maleimides/chemistry , Molecular Structure , Parasitic Sensitivity Tests , Structure-Activity Relationship
20.
BMC Genomics ; 17: 582, 2016 Aug 09.
Article in English | MEDLINE | ID: mdl-27506469

ABSTRACT

BACKGROUND: Non-coding RNAs (ncRNAs) play crucial roles in many biological processes, such as post-transcriptional gene regulation. ncRNAs mainly function through interaction with RNA binding proteins (RBPs). To understand the function of an ncRNA, a fundamental step is to identify which proteins are involved in its interactions. Computational prediction of RBPs is therefore promising, but the major challenge is that the interaction pattern or motif is difficult to find. RESULTS: In this study, we propose a computational method, IPMiner (Interaction Pattern Miner), to predict ncRNA-protein interactions from sequences; it makes use of deep learning and further improves its performance using stacked ensembling. One of IPMiner's main merits is that it is able to mine the hidden sequential interaction patterns from sequence composition features of protein and RNA sequences using a stacked autoencoder, and the learned hidden features are then fed into random forest models. Finally, stacked ensembling is used to integrate different predictors to further improve the prediction performance. The experimental results indicate that IPMiner achieves superior performance on the tested lncRNA-protein interaction dataset, with an accuracy of 0.891, sensitivity of 0.939, specificity of 0.831, precision of 0.945 and Matthews correlation coefficient of 0.784. We further comprehensively investigate IPMiner on other RNA-protein interaction datasets, where it yields better performance than the state-of-the-art methods, with an increase of over 20 % on some of the tested benchmark datasets. In addition, we apply IPMiner to large-scale prediction of the ncRNA-protein network, where it achieves promising prediction performance. CONCLUSION: By integrating a deep neural network and stacked ensembling, IPMiner can automatically learn high-level abstraction features from simple sequence composition features, and these features have strong discriminant ability for RNA-protein detection. IPMiner achieved high performance on our constructed lncRNA-protein benchmark dataset and other RNA-protein datasets. The IPMiner tool is available at http://www.csbio.sjtu.edu.cn/bioinf/IPMiner .
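Among the metrics reported in this abstract, the Matthews correlation coefficient (MCC) is the least familiar. It is computed from the four confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The following is a minimal sketch with hypothetical counts; the function name `matthews_corrcoef_counts` is an illustration and not part of IPMiner.

```python
import math


def matthews_corrcoef_counts(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, not taken from the paper.
perfect = matthews_corrcoef_counts(tp=50, fp=0, tn=50, fn=0)     # all predictions correct
chance = matthews_corrcoef_counts(tp=25, fp=25, tn=25, fn=25)    # no better than random
print(perfect, chance)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the positive and negative classes are unbalanced.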


Subject(s)
Computational Biology/methods , RNA, Untranslated , RNA-Binding Proteins , Software , Area Under Curve , Cluster Analysis , Protein Binding , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , RNA-Binding Proteins/metabolism , Reproducibility of Results