Results 1 - 20 of 70
1.
bioRxiv ; 2024 May 19.
Article in English | MEDLINE | ID: mdl-38798479

ABSTRACT

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.

2.
Angew Chem Int Ed Engl ; : e202400441, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38587149

ABSTRACT

Nickel-catalyzed transannulation reactions triggered by the extrusion of small gaseous molecules have emerged as a powerful strategy for the efficient construction of heterocyclic compounds. However, their use in asymmetric synthesis remains challenging because of the difficulty in controlling stereo- and regioselectivity. Herein, we report the first nickel-catalyzed asymmetric synthesis of N-N atropisomers by the denitrogenative transannulation of benzotriazones with alkynes. A broad range of N-N atropisomers was obtained with excellent regio- and enantioselectivity under mild conditions. Moreover, density functional theory (DFT) calculations provided insights into the nickel-catalyzed reaction mechanism and enantioselectivity control.

3.
Comput Biol Med ; 172: 108227, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38460308

ABSTRACT

Accurately predicting protein-ATP binding residues is critical for protein function annotation and drug discovery. Computational methods that predict binding residues from protein sequence information have made notable advances in predictive accuracy. Nevertheless, these methods still face several challenges, including limited means of extracting more discriminative features and inadequate algorithms for integrating protein and residue information. To address these problems, we propose ATP-Deep, a novel protein-ATP binding residue predictor. ATP-Deep harnesses unsupervised pre-trained language models and incorporates domain-specific evolutionary context information from homologous sequences. It further refines the residue-level embedding by integrating the corresponding protein-level information and employs a context-based co-attention mechanism to fuse multiple sources of features. Performance evaluation on the benchmark datasets shows that ATP-Deep achieves AUCs of 0.954 and 0.951, respectively, surpassing the state-of-the-art model. These findings underscore the effectiveness of incorporating protein-level information and deploying a context-based co-attention mechanism to improve the prediction of protein-ATP binding residues.


Subject(s)
Algorithms , Proteins , Protein Binding , Proteins/chemistry , Amino Acid Sequence , Adenosine Triphosphate
4.
J Chem Inf Model ; 64(4): 1407-1418, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38334115

ABSTRACT

Studying the effect of single amino acid variations (SAVs) on protein structure and function is integral to advancing our understanding of molecular processes, evolutionary biology, and disease mechanisms. Screening for deleterious variants is one of the crucial issues in precision medicine. Here, we propose a novel computational approach, TransEFVP, based on large-scale protein language model embeddings and a transformer-based neural network to predict disease-associated SAVs. The model adopts a two-stage architecture. In the first stage, a transformer encoder fuses the different feature embeddings. In the second stage, after dimensionality reduction, a support vector machine model quantifies the pathogenicity of SAVs. On blind test data, TransEFVP achieves a Matthews correlation coefficient of 0.751, an F1-score of 0.846, and an area under the receiver operating characteristic curve of 0.871, higher than those of existing state-of-the-art methods. The benchmark results demonstrate that TransEFVP can be explored as an accurate and effective SAV pathogenicity prediction method. The data and codes for TransEFVP are available at https://github.com/yzh9607/TransEFVP/tree/master for academic use.
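For readers unfamiliar with the metrics quoted in this abstract, the following minimal sketch shows how a Matthews correlation coefficient and an F1-score are computed from binary predictions. It is not taken from the TransEFVP code; the function names and the toy labels are illustrative assumptions (1 = disease-associated, 0 = neutral).

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f1_score(tp, fp, fn), matthews_corrcoef(tp, tn, fp, fn))
```

Unlike the F1-score, the MCC also uses the true-negative count, which is one reason it is often reported alongside F1 on imbalanced variant data sets.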


Subject(s)
Algorithms , Proteins , Humans , Proteins/chemistry , Amino Acid Sequence , Neural Networks, Computer , Amino Acids
5.
J Chem Inf Model ; 64(4): 1394-1406, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38349747

ABSTRACT

Nonsynonymous single-nucleotide polymorphisms (nsSNPs), implicated in over 6000 diseases, necessitate accurate prediction for expedited drug discovery and improved disease diagnosis. In this study, we propose FCMSTrans, a novel nsSNP predictor that combines the transformer framework and multiscale modules for comprehensive feature extraction. The distinctive attribute of FCMSTrans is a deep feature combination strategy. This strategy combines evolutionary-scale modeling (ESM) and ProtTrans (PT) features, which provide an understanding of protein biochemical properties, with position-specific scoring matrix, secondary structure, predicted relative solvent accessibility, and predicted disorder (PSPP) features, which are derived from four protein sequence- and structure-oriented characteristics. This feature combination offers a comprehensive view of the molecular dynamics involving nsSNPs. Our model employs the transformer's self-attention mechanisms across multiple layers, extracting higher-level and abstract representations. Simultaneously, features at different levels are captured by multiscale convolutions, enriching feature abstraction at multiple scales. Our comparative analyses with existing methodologies highlight significant improvements made possible by the integrated feature fusion approach adopted in FCMSTrans. This is further substantiated by performance assessments on diverse data sets, such as PredictSNP, MMP, and PMD, with areas under the curve (AUCs) of 0.869, 0.819, and 0.693, respectively. Furthermore, FCMSTrans outperforms the current best predictor, PROVEAN, in a blind test conducted on a third-party data set, achieving an AUC score of 0.7838. The Python code of FCMSTrans is available at https://github.com/gc212/FCMSTrans for academic usage.


Subject(s)
Drug Discovery , Electric Power Supplies , Amino Acid Sequence , Area Under Curve , Polymorphism, Single Nucleotide
6.
ACS Omega ; 9(2): 2032-2047, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38250421

ABSTRACT

Genetic variations (including substitutions, insertions, and deletions) exert a profound influence on DNA sequences. These variations are systematically classified as synonymous, nonsynonymous, and nonsense, each manifesting distinct effects on proteins. High-throughput sequencing has significantly improved our understanding of the interplay between gene variations and protein structure and function, as well as their ramifications in disease. Frameshift variations, particularly small insertions and deletions (indels), disrupt protein coding and are instrumental in disease pathogenesis. This article presents a succinct review of computational methods, databases, current challenges, and future directions in predicting the consequences of coding frameshift small indel variations. We analyzed the predictive efficacy, reliability, and utilization of the computational methods, and the variant counts, reliability, and utilization of the databases. We also compared the prediction methodologies on GOF/LOF pathogenic variation data. In addressing the challenges of prediction accuracy and cross-species generalizability, nascent technologies such as AI and deep learning have great potential to enhance predictive capabilities. Interdisciplinary research and collaboration are essential for devising effective diagnosis, treatment, and prevention strategies for diseases associated with coding frameshift indel variations.

7.
Int J Biol Macromol ; 260(Pt 1): 129245, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38191109

ABSTRACT

Aerogels with low thermal conductivity and high adsorption capacity present a promising solution to curb water pollution caused by organic reagents as well as mitigate heat loss. Although aerogels exhibiting good adsorption capacity and thermal insulation have been reported, producing materials with mechanical integrity, high flexibility and shear resistance remains a formidable task. Here, we produced bacterial cellulose-based ultralight multifunctional hybrid aerogels by freeze-drying followed by chemical vapor deposition silylation. The hybrid aerogels displayed a low density of 10-15 mg/cm³, high porosity exceeding 99.1%, low thermal conductivity (27.3-29.2 mW/(m·K)) and superior hydrophobicity (water contact angle >120°). They also exhibited excellent mechanical properties including superelasticity, high flexibility and shear resistance. The hybrid aerogels demonstrated high heat shielding efficiency when used as an insulating material. As a selective oil absorbent, the hybrid aerogels exhibit a maximum adsorption capacity of up to approximately 156 times their own weight and excellent recoverability. In particular, the aerogel's highly accessible porous microstructure results in a flux rate of up to 162 L/(h·g) when used as a filter in a continuous oil-water separator to isolate n-hexane-water mixtures. This work presents a novel endeavor to create high-performance, sustainable, reusable, and adaptable multifunctional aerogels.


Subject(s)
Cellulose , Gases , Adsorption , Freeze Drying , Hot Temperature
8.
Am J Obstet Gynecol ; 230(4): 390-402, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38072372

ABSTRACT

OBJECTIVE: This study aimed to provide procedure-specific estimates of the risk for symptomatic venous thromboembolism and major bleeding in noncancer gynecologic surgeries. DATA SOURCES: We conducted comprehensive searches on Embase, MEDLINE, Web of Science, and Google Scholar. Furthermore, we performed separate searches for randomized trials that addressed the effects of thromboprophylaxis. STUDY ELIGIBILITY CRITERIA: Eligible studies were observational studies that enrolled ≥50 adult patients who underwent noncancer gynecologic surgery procedures and that reported the absolute incidence of at least 1 of the following: symptomatic pulmonary embolism, symptomatic deep vein thrombosis, symptomatic venous thromboembolism, bleeding that required reintervention (including re-exploration and angioembolization), bleeding that led to transfusion, or postoperative hemoglobin level <70 g/L. METHODS: Teams of 2 reviewers independently assessed eligibility, performed data extraction, and evaluated the risk of bias of the eligible articles. We adjusted the reported estimates for thromboprophylaxis and length of follow-up and used the median value from studies to determine the cumulative incidence at 4 weeks postsurgery stratified by patient venous thromboembolism risk factors and used the Grading of Recommendations Assessment, Development and Evaluation approach to rate the evidence certainty. RESULTS: We included 131 studies (1,741,519 patients) that reported venous thromboembolism risk estimates for 50 gynecologic noncancer procedures and bleeding requiring reintervention estimates for 35 procedures. The evidence certainty was generally moderate or low for venous thromboembolism and low or very low for bleeding requiring reintervention. The risk for symptomatic venous thromboembolism varied from a median of <0.1% for several procedures (eg, transvaginal oocyte retrieval) to 1.5% for others (eg, minimally invasive sacrocolpopexy with hysterectomy, 1.2%-4.6% across patient venous thromboembolism risk groups). Venous thromboembolism risk was <0.5% for 30 (60%) of the procedures; 0.5% to 1.0% for 10 (20%) procedures; and >1.0% for 10 (20%) procedures. The risk for bleeding requiring reintervention varied from <0.1% (transvaginal oocyte retrieval) to 4.0% (open myomectomy). The bleeding requiring reintervention risk was <0.5% in 17 (49%) procedures, 0.5% to 1.0% for 12 (34%) procedures, and >1.0% in 6 (17%) procedures. CONCLUSION: The risk for venous thromboembolism in gynecologic noncancer surgery varied between procedures and patients. Venous thromboembolism risks exceeded the bleeding risks only among selected patients and procedures. Although most of the evidence is of low certainty, the results nevertheless provide a compelling rationale for restricting pharmacologic thromboprophylaxis to a minority of patients who undergo gynecologic noncancer procedures.


Subject(s)
Thrombosis , Venous Thromboembolism , Adult , Humans , Female , Anticoagulants/therapeutic use , Venous Thromboembolism/prevention & control , Postoperative Complications/prevention & control , Hemorrhage/chemically induced , Gynecologic Surgical Procedures/adverse effects
9.
Ann Surg ; 279(2): 213-225, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37551583

ABSTRACT

OBJECTIVE: To provide procedure-specific estimates of symptomatic venous thromboembolism (VTE) and major bleeding after abdominal surgery. BACKGROUND: The use of pharmacological thromboprophylaxis represents a trade-off that depends on VTE and bleeding risks that vary between procedures; their magnitude remains uncertain. METHODS: We identified observational studies reporting procedure-specific risks of symptomatic VTE or major bleeding after abdominal surgery, adjusted the reported estimates for thromboprophylaxis and length of follow-up, and estimated cumulative incidence at 4 weeks postsurgery, stratified by VTE risk groups, and rated evidence certainty. RESULTS: After eligibility screening, 285 studies (8,048,635 patients) reporting on 40 general abdominal, 36 colorectal, 15 upper gastrointestinal, and 24 hepatopancreatobiliary surgery procedures proved eligible. Evidence certainty proved generally moderate or low for VTE and low or very low for bleeding requiring reintervention. The risk of VTE varied substantially among procedures: in general abdominal surgery from a median of <0.1% in laparoscopic cholecystectomy to a median of 3.7% in open small bowel resection, in colorectal from 0.3% in minimally invasive sigmoid colectomy to 10.0% in emergency open total proctocolectomy, and in upper gastrointestinal/hepatopancreatobiliary from 0.2% in laparoscopic sleeve gastrectomy to 6.8% in open distal pancreatectomy for cancer. CONCLUSIONS: VTE thromboprophylaxis provides net benefit through VTE reduction with a small increase in bleeding in some procedures (eg, open colectomy and open pancreaticoduodenectomy), whereas the opposite is true in others (eg, laparoscopic cholecystectomy and elective groin hernia repairs). In many procedures, thromboembolism and bleeding risks are similar, and decisions depend on individual risk prediction and values and preferences regarding VTE and bleeding.


Subject(s)
Colorectal Neoplasms , Thrombosis , Venous Thromboembolism , Humans , Anticoagulants/therapeutic use , Colorectal Neoplasms/drug therapy , Hemorrhage , Postoperative Complications/epidemiology , Postoperative Complications/prevention & control , Postoperative Complications/drug therapy , Venous Thromboembolism/epidemiology , Venous Thromboembolism/etiology , Venous Thromboembolism/prevention & control
10.
Am J Obstet Gynecol ; 230(4): 403-416, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37827272

ABSTRACT

OBJECTIVE: This study aimed to provide procedure-specific estimates of the risk of symptomatic venous thromboembolism and major bleeding in the absence of thromboprophylaxis, following gynecologic cancer surgery. DATA SOURCES: We conducted comprehensive searches on Embase, MEDLINE, Web of Science, and Google Scholar for observational studies. We also reviewed reference lists of eligible studies and review articles. We performed separate searches for randomized trials addressing effects of thromboprophylaxis and conducted a web-based survey on thromboprophylaxis practice. STUDY ELIGIBILITY CRITERIA: Observational studies enrolling ≥50 adult patients undergoing gynecologic cancer surgery procedures reporting absolute incidence for at least 1 of the following were included: symptomatic pulmonary embolism, symptomatic deep vein thrombosis, symptomatic venous thromboembolism, bleeding requiring reintervention (including reexploration and angioembolization), bleeding leading to transfusion, or postoperative hemoglobin <70 g/L. METHODS: Two reviewers independently assessed eligibility, performed data extraction, and evaluated risk of bias of eligible articles. We adjusted the reported estimates for thromboprophylaxis and length of follow-up and used the median value from studies to determine cumulative incidence at 4 weeks postsurgery stratified by patient venous thromboembolism risk factors. The GRADE approach was applied to rate evidence certainty. RESULTS: We included 188 studies (398,167 patients) reporting on 37 gynecologic cancer surgery procedures. The evidence certainty was generally low to very low. Median symptomatic venous thromboembolism risk (in the absence of prophylaxis) was <1% in 13 of 37 (35%) procedures, 1% to 2% in 11 of 37 (30%), and >2.0% in 13 of 37 (35%). The risks of venous thromboembolism varied from 0.1% in low venous thromboembolism risk patients undergoing cervical conization to 33.5% in high venous thromboembolism risk patients undergoing pelvic exenteration. Estimates of bleeding requiring reintervention varied from <0.1% to 1.3%. Median risks of bleeding requiring reintervention were <1% in 22 of 29 (76%) and 1% to 2% in 7 of 29 (24%) procedures. CONCLUSION: Venous thromboembolism reduction with thromboprophylaxis likely outweighs the increase in bleeding requiring reintervention in many gynecologic cancer procedures (eg, open surgery for ovarian cancer and pelvic exenteration). In some procedures (eg, laparoscopic total hysterectomy without lymphadenectomy), thromboembolism and bleeding risks are similar, and decisions depend on individual risk prediction and values and preferences regarding venous thromboembolism and bleeding.


Subject(s)
Neoplasms , Thrombosis , Venous Thromboembolism , Adult , Humans , Female , Anticoagulants/therapeutic use , Venous Thromboembolism/epidemiology , Venous Thromboembolism/prevention & control , Postoperative Complications/prevention & control , Hemorrhage
11.
J Chem Inf Model ; 63(22): 7239-7257, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-37947586

ABSTRACT

Understanding the pathogenicity of missense mutations (MMs) is essential for shedding light on genetic diseases, gene functions, and individual variations. In this study, we propose a novel computational approach, called MMPatho, for enhancing missense mutation pathogenicity prediction. First, we established a large-scale nonredundant MM benchmark data set based on the entire Ensembl database, complemented by a focused blind test set specifically for pathogenic GOF/LOF MMs. Based on this data set, for each mutation, we utilized Ensembl VEP v104 and dbNSFP v4.1a to extract variant-level, amino acid-level, individuals' outputs, and genome-level features. Additionally, protein sequences were generated using ENSP identifiers with the Ensembl API, and then encoded. The mutant sites' ESM-1b and ProtTrans-T5 embeddings were subsequently extracted. Building on these efforts, we then developed our model group (MMPatho), which comprises ConsMM and EvoIndMM. Specifically, ConsMM employs individuals' outputs and XGBoost with SHAP explanation analysis, while EvoIndMM investigates the potential enhancement of predictive capability by incorporating evolutionary information from the embeddings of the large protein language models ESM-1b and ProtT5-XL-U50. In rigorous comparative experiments, ConsMM and EvoIndMM achieved AUROC values of 0.9836 and 0.9854 and AUPR values of 0.9852 and 0.9902 on the blind test set, which shares no variations or proteins with the training data, thus highlighting the superiority of our computational approach in the prediction of MM pathogenicity. Our Web server, available at http://csbio.njust.edu.cn/bioinf/mmpatho/, allows researchers to predict the pathogenicity (alongside the reliability index score) of MMs using the ConsMM and EvoIndMM models and provides extensive annotations for user input. Additionally, the newly constructed benchmark data set and blind test set can be accessed via the data page of our web server.


Subject(s)
Computational Biology , Mutation, Missense , Humans , Reproducibility of Results , Consensus , Proteins
12.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 3205-3214, 2023.
Article in English | MEDLINE | ID: mdl-37289599

ABSTRACT

It has been demonstrated that RNA modifications play essential roles in multiple biological processes. Accurate identification of RNA modifications in the transcriptome is critical for providing insights into the biological functions and mechanisms. Many tools have been developed for predicting RNA modifications at single-base resolution. These tools employ conventional feature engineering methods that focus on feature design and feature selection, processes that require extensive biological expertise and may introduce redundant information. With the rapid development of artificial intelligence technologies, end-to-end methods are favorably received by researchers. Nevertheless, for nearly all of these approaches, each trained model is suitable for only one specific type of RNA methylation modification. In this study, we present MRM-BERT, built by feeding task-specific sequences into the powerful BERT (Bidirectional Encoder Representations from Transformers) model and fine-tuning it, which exhibits performance competitive with the state-of-the-art methods. MRM-BERT avoids repeated de novo training of the model and can predict multiple RNA modifications such as pseudouridine, m6A, m5C, and m1A in Mus musculus, Arabidopsis thaliana, and Saccharomyces cerevisiae. In addition, we analyse the attention heads to provide high attention regions for the prediction, and conduct saturated in silico mutagenesis of the input sequences to discover potential changes of RNA modifications, which can better assist researchers in their follow-up research.
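The saturated in silico mutagenesis mentioned in this abstract can be illustrated with a short sketch: every position of an input RNA sequence is replaced by each of the three alternative bases, and a predictor is re-run on each mutant. The code below is a generic illustration, not the MRM-BERT implementation. `toy_score` is a hypothetical stand-in for a trained model's prediction function.

```python
RNA_BASES = "ACGU"

def saturation_mutants(seq):
    """Yield (position, ref_base, alt_base, mutant_sequence) for every
    single-base substitution of an RNA sequence (0-based positions)."""
    seq = seq.upper()
    for i, ref in enumerate(seq):
        for alt in RNA_BASES:
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]

def score_changes(seq, score_fn):
    """Map each substitution to the change in score relative to the
    unmutated sequence. score_fn is any callable: sequence -> float."""
    baseline = score_fn(seq.upper())
    return {
        (i, ref, alt): score_fn(mutant) - baseline
        for i, ref, alt, mutant in saturation_mutants(seq)
    }

def toy_score(seq):
    """Hypothetical placeholder for a model: the fraction of 'A' bases."""
    return seq.count("A") / len(seq)

changes = score_changes("GGACU", toy_score)
print(len(changes))            # 5 positions x 3 alternative bases
print(changes[(2, "A", "C")])  # score change when the central A is replaced
```

With a real predictor in place of `toy_score`, large negative changes would flag positions whose substitution is predicted to abolish a modification site.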


Subject(s)
Arabidopsis , Artificial Intelligence , Mice , Animals , Pseudouridine , Arabidopsis/genetics , Transcriptome , Saccharomyces cerevisiae/genetics , RNA/genetics
13.
Materials (Basel) ; 16(12)2023 Jun 14.
Article in English | MEDLINE | ID: mdl-37374571

ABSTRACT

Ductility-based structural design is currently the mainstream method. In order to analyze the ductility performance of concrete columns with high-strength steel reinforcements under eccentric compression, corresponding experimental studies have been performed. Numerical models were established, and their reliability was verified. Based on the numerical models, the parameter analysis was carried out, where eccentricity, concrete strength, and reinforcement ratio were considered to systematically discuss the ductility of the concrete column section with high-strength steel reinforcement. The results show that the ductility of the section under eccentric compression increases with the strength of the concrete and eccentricity, and decreases with the reinforcement ratio. Finally, a simplified calculation formula capable of quantitatively evaluating the section ductility was proposed.

14.
Huan Jing Ke Xue ; 44(4): 2093-2102, 2023 Apr 08.
Article in Chinese | MEDLINE | ID: mdl-37040959

ABSTRACT

To reveal the characteristics and key impact factors of phytoplankton communities in different types of lakes, sampling surveys for phytoplankton and water quality parameters were conducted at 174 sampling sites in a total of 24 lakes in Wuhan, comprising urban lakes (UL), countryside lakes (CL), and ecological conservation lakes (EL), in spring, summer, autumn, and winter 2018. The results showed that a total of 365 species of phytoplankton from nine phyla and 159 genera were identified in the three types of lakes. The main species were green algae, cyanobacteria, and diatoms, accounting for 55.34%, 15.89%, and 15.07% of the total number of species, respectively. The phytoplankton cell density varied from 3.60×10⁶ to 421.99×10⁶ cells·L⁻¹, chlorophyll-a content varied from 15.60 to 240.50 µg·L⁻¹, biomass varied from 27.71 to 379.79 mg·L⁻¹, and the Shannon-Wiener diversity index varied from 0.29 to 2.86. Among the three lake types, cell density, chlorophyll-a content, and biomass were lower in EL and UL, whereas the opposite was true for the Shannon-Wiener diversity index. NMDS and ANOSIM analysis showed differences in phytoplankton community structure (Stress=0.13, R=0.048, P=0.2298). In addition, the phytoplankton community structure of the three lake types had significant seasonal characteristics, with chlorophyll-a content and biomass being significantly higher in summer than in winter (P<0.05). Spearman correlation analysis showed that phytoplankton biomass decreased with increasing N:P in UL and CL, whereas the opposite was true for EL. Redundancy analysis (RDA) showed that water temperature (WT), pH, NO₃⁻, electrical conductivity (EC), and N:P were the key factors that significantly affected the variability in phytoplankton community structure in the three types of lakes in Wuhan (P<0.05).
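The Shannon-Wiener diversity index reported in this abstract is conventionally defined as H' = -Σ pᵢ ln pᵢ, where pᵢ is the proportion of individuals (or cells) belonging to species i. The sketch below computes it from a list of per-species counts. The use of the natural logarithm is an assumption, since the abstract does not state the base, and the sample counts are invented for illustration.

```python
import math

def shannon_wiener(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i)) over species with
    non-zero abundance. counts: iterable of non-negative abundances."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one species must have a positive count")
    return -sum((c / total) * math.log(c / total) for c in counts)

# Invented example: one dominant species versus an even community.
dominated = [970, 10, 10, 10]
even = [250, 250, 250, 250]
print(shannon_wiener(dominated))  # low diversity
print(shannon_wiener(even))       # equals ln(4), the maximum for 4 species
```

A community dominated by one bloom-forming species therefore scores low even when its total cell density is high, which is consistent with the opposite trends of biomass and diversity described in the abstract.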


Subject(s)
Cyanobacteria , Diatoms , Phytoplankton , Lakes/analysis , Chlorophyll/analysis , Chlorophyll A
16.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36528806

ABSTRACT

Determining the pathogenicity and functional impact (i.e., gain-of-function, GOF, or loss-of-function, LOF) of a variant is vital for unraveling the genetic-level mechanisms of human diseases. To provide a 'one-stop' framework for the accurate identification of pathogenicity and functional impact of variants, we developed a two-stage deep-learning-based computational solution, termed VPatho, which was trained using a total of 9619 pathogenic GOF/LOF and 138 026 neutral variants curated from various databases. A total of 138 variant-level, 262 protein-level and 103 genome-level features were extracted for constructing the models of VPatho. The development of VPatho consists of two stages: (i) a random under-sampling multi-scale residual neural network (ResNet) with a newly defined weighted-loss function (RUS-Wg-MSResNet) was proposed to predict variants' pathogenicity on the gnomAD_NV + GOF/LOF dataset; and (ii) an XGBOD model was constructed to predict the functional impact of the given variants. Benchmarking experiments demonstrated that RUS-Wg-MSResNet achieved the highest prediction performance with the weights calculated based on the ratios of neutral versus pathogenic variants. Independent tests showed that both RUS-Wg-MSResNet and XGBOD achieved outstanding performance. Moreover, assessed using variants from the CAGI6 competition, RUS-Wg-MSResNet achieved superior performance compared to state-of-the-art predictors. The fine-trained XGBOD models were further used to blind test the whole LOF data downloaded from gnomAD and accordingly, we identified 31 nonLOF variants that were previously labeled as LOF/uncertain variants. As an implementation of the developed approach, a webserver of VPatho is made publicly available at http://csbio.njust.edu.cn/bioinf/vpatho/ to facilitate community-wide efforts for profiling and prioritizing the query variants with respect to their pathogenicity and functional impact.
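The two imbalance-handling ideas named in this abstract, random under-sampling and loss weights derived from the neutral-to-pathogenic ratio, can be sketched generically as follows. The abstract does not give the exact sampling ratio or weighting formula, so the functions below are assumptions showing one common choice rather than the VPatho implementation. Only the variant counts (9619 and 138 026) come from the abstract.

```python
import random

def random_undersample(labels, ratio=1.0, seed=0):
    """Return sorted indices that keep every minority-class item and a random
    subset of the majority class of size ratio * n_minority.
    labels: sequence of 0 (neutral) / 1 (pathogenic)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    k = min(len(majority), int(round(ratio * len(minority))))
    return sorted(minority + rng.sample(majority, k))

def ratio_class_weights(n_neutral, n_pathogenic):
    """Assumed weighting scheme: weight the pathogenic class by the
    neutral:pathogenic ratio and leave the neutral class at 1.0."""
    return {0: 1.0, 1: n_neutral / n_pathogenic}

# Counts quoted in the abstract.
weights = ratio_class_weights(n_neutral=138026, n_pathogenic=9619)
print(round(weights[1], 2))  # roughly 14 neutral variants per pathogenic one

# Toy label vector, invented for illustration.
labels = [1] * 5 + [0] * 50
kept = random_undersample(labels, ratio=1.0)
print(len(kept))  # 5 pathogenic + 5 sampled neutral
```

Under-sampling balances each training batch at the cost of discarding data, whereas class weights keep all samples but rescale their contribution to the loss. The abstract indicates that VPatho combines the two.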


Subject(s)
Deep Learning , Humans , Gain of Function Mutation , Genome
17.
Article in Chinese | WPRIM (Western Pacific) | ID: wpr-1008624

ABSTRACT

Bufonis Venenum, an animal medicinal material, is widely used for treating cardiovascular diseases and pain induced by rheumatism or malignant tumors. In view of its high activity and high toxicity, it is of great significance to pay attention to the quality control of Bufonis Venenum to ensure the safety and effectiveness of its preparations. China's drug standards involve 102 preparations (474 batch numbers) containing Bufonis Venenum approved for sale, including 14 preparations in the Chinese Pharmacopoeia (2020 edition) and 68 preparations in the drug standards issued by the Ministry of Health of the People's Republic of China. Bufonis Venenum is mostly used in pill and powder preparations in the form of raw powder, with the main functions of clearing heat, removing toxin, relieving swelling and pain, replenishing qi, activating blood, opening orifice, and awakening brain. Except for the high level of quality control for Bufonis Venenum in the preparations in the Chinese Pharmacopoeia (2020 edition), the quality control standards of Bufonis Venenum in other preparations are low or even absent. Therefore, it is urgent to conduct research on the improvement of quality standards for the preparations containing Bufonis Venenum. This study retrieved the reports focusing on the quality evaluation and quality control of the preparations containing Bufonis Venenum from CNKI, PubMed, and Web of Science. Qualitative and quantitative analysis methods for 64 preparations containing Bufonis Venenum have been reported, mainly including thin-layer chromatography, HPLC fingerprint, and multi-component content determination. The index components mainly involved bufadienolides, such as gamabufalin, arenobufagin, bufotalin, bufalin, cinobufagin, and resibufogenin. According to the literature information, this paper suggests that attention should be paid to the correlations between the analysis methods and detection indexes of medicinal materials, decoction pieces and preparations, the monitoring of indole alkaloids, and the content uniformity inspection for further improving the quality standards for the preparations containing Bufonis Venenum.


Subject(s)
Animals , Humans , Bufonidae , Powders , Bufanolides/pharmacology , Quality Control , Chromatography, High Pressure Liquid , Pain/drug therapy
18.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36094083

ABSTRACT

Short open reading frames (sORFs) refer to small nucleotide fragments no longer than 303 nt in length that probably encode small peptides. To date, translatable sORFs have been found in both untranslated regions of messenger ribonucleic acids (RNAs; mRNAs) and long non-coding RNAs (lncRNAs), playing vital roles in a myriad of biological processes. As not all sORFs are translated or essentially translatable, it is important to develop a highly accurate computational tool for characterizing the coding potential of sORFs, thereby facilitating discovery of novel functional peptides. In light of this, we designed a series of ensemble models by integrating Efficient-CapsNet and LightGBM, collectively termed csORF-finder, to differentiate the coding sORFs (csORFs) from non-coding sORFs in Homo sapiens, Mus musculus and Drosophila melanogaster, respectively. To improve the performance of csORF-finder, we introduced a novel feature encoding scheme named trinucleotide deviation from expected mean (TDE) and computed all types of in-frame sequence-based features, such as i-framed-3mer, i-framed-CKSNAP and i-framed-TDE. Benchmarking results showed that these features could significantly boost the performance compared to the original 3-mer, CKSNAP and TDE features. Our performance comparisons showed that csORF-finder achieved performance superior to that of the state-of-the-art methods for csORF prediction on multi-species and non-ATG initiation independent test datasets. Furthermore, we applied csORF-finder to screen the lncRNA datasets for identifying potential csORFs. The resulting data serve as an important computational repository for further experimental validation. We hope that csORF-finder can be exploited as a powerful platform for high-throughput identification of csORFs and functional characterization of csORF-encoded peptides.
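To make the distinction between an ordinary 3-mer feature and an in-frame one concrete, the sketch below counts only the trinucleotides that fall on codon boundaries of an sORF and returns a 64-dimensional frequency vector. This is one plausible reading of the "i-framed-3mer" feature named in the abstract. The authors' exact definition may differ, and the function name is an assumption.

```python
from collections import Counter
from itertools import product

# All 64 trinucleotides in a fixed order, used as the feature layout.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def in_frame_3mer_freqs(orf):
    """Return 64 in-frame trinucleotide frequencies for an ORF sequence.
    Only non-overlapping triplets starting at positions 0, 3, 6, ... are
    counted; a trailing partial codon and ambiguous triplets are ignored."""
    orf = orf.upper().replace("U", "T")
    usable = len(orf) - len(orf) % 3
    triplets = [orf[i:i + 3] for i in range(0, usable, 3)]
    valid = set(CODONS)
    counts = Counter(t for t in triplets if t in valid)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]

# Toy sORF, invented for illustration: ATG AAA TAA
features = in_frame_3mer_freqs("ATGAAATAA")
print(features[CODONS.index("ATG")])  # one of three codons
print(features[CODONS.index("TGA")])  # 0.0: TGA occurs only out of frame
```

A sliding-window 3-mer count would also register out-of-frame triplets such as TGA in this example. Restricting the count to the reading frame keeps the codon-level signal that distinguishes coding from non-coding sORFs.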


Subject(s)
Open Reading Frames , RNA, Long Noncoding , Animals , Mice , Drosophila melanogaster/genetics , Machine Learning , Peptides/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Humans
19.
J Chem Inf Model ; 62(17): 4270-4282, 2022 09 12.
Article in English | MEDLINE | ID: mdl-35973091

ABSTRACT

An essential step in engineering proteins and understanding disease-causing missense mutations is to accurately model protein stability changes when such mutations occur. Here, we developed a new sequence-based predictor for the protein stability (PROST) change (Gibbs free energy change, ΔΔG) upon a single-point missense mutation. PROST extracts multiple descriptors from the most promising sequence-based predictors, such as BoostDDG, SAAFEC-SEQ, and DDGun. PROST also extracts descriptors from iFeature and AlphaFold2. The extracted descriptors include sequence-based features, physicochemical properties, evolutionary information, evolutionary-based physicochemical properties, and predicted structural features. The PROST predictor is a weighted average ensemble model based on extreme gradient boosting (XGBoost) decision trees and an extra-trees regressor; PROST is trained on both direct and hypothetical reverse mutations using the S5294 data set (S2647 direct mutations + S2647 inverse mutations). The parameters for the PROST model are optimized using grid searching with 5-fold cross-validation, and feature importance analysis unveils the most relevant features. The performance of PROST is evaluated in a blinded manner, employing nine distinct data sets and existing state-of-the-art sequence-based and structure-based predictors. This method consistently performs well on frataxin, S217, S349, Ssym, S669, Myoglobin, and CAGI5 data sets in blind tests and similarly to the state-of-the-art predictors for p53 and S276 data sets. When the performance of PROST is compared with the latest predictors such as BoostDDG, SAAFEC-SEQ, ACDC-NN-seq, and DDGun, PROST dominates these predictors. A case study of mutation scanning of the frataxin protein for nine wild-type residues demonstrates the utility of PROST. Taken together, these findings indicate that PROST is a well-suited predictor when no protein structural information is available.
The source code of PROST, data sets, examples, and pretrained models along with how to use PROST are available at https://github.com/ShahidIqb/PROST and https://prost.erc.monash.edu/seq.
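The two ideas in this abstract that lend themselves to a small example are training on direct plus hypothetical reverse mutations and combining two regressors by weighted average. The sketch below assumes the usual thermodynamic antisymmetry convention, ΔΔG(B→A) = −ΔΔG(A→B), to build the reverse records; the `Mutation` type, the function names, and the weight value are hypothetical and are not taken from the PROST code.

```python
from typing import NamedTuple


class Mutation(NamedTuple):
    wild: str       # wild-type residue (one-letter code)
    position: int   # 1-based position in the sequence
    mutant: str     # mutant residue (one-letter code)
    ddg: float      # stability change for wild -> mutant


def add_reverse_mutations(direct):
    """Double a data set by appending a reverse record for each mutation.

    Assumption: the reverse label is the negated direct label. In a real
    pipeline the background sequence would also be the mutant sequence;
    that detail is omitted here.
    """
    reverse = [Mutation(m.mutant, m.position, m.wild, -m.ddg) for m in direct]
    return list(direct) + reverse


def weighted_average(pred_a, pred_b, weight_a=0.5):
    """Blend two models' predictions; weight_a applies to pred_a."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have equal length")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


if __name__ == "__main__":
    data = add_reverse_mutations([Mutation("A", 10, "G", 1.5)])
    print(data)
    # Stand-ins for the outputs of two regressors on the same inputs.
    print(weighted_average([1.0, 2.0], [3.0, 4.0], weight_a=0.25))
```

In practice the blend weight would be one of the quantities chosen by cross-validated search rather than fixed by hand.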


Subject(s)
Mutation, Missense , Zygote Intrafallopian Transfer , Protein Stability , Proteins/chemistry , Software
20.
Org Lett ; 24(17): 3138-3143, 2022 May 06.
Article in English | MEDLINE | ID: mdl-35452582

ABSTRACT

We report herein that copper(I) catalysis using a bis(phosphine) dioxide ligand enables the desymmetric C-H arylation of prochiral bipyrroles. More than 50 nitrogen-nitrogen atropisomers were obtained in good to excellent yields with excellent enantioselectivities (≤97% yield, ≤98% ee). The reaction proceeds under mild conditions with good functional group compatibility on arenes and diaryliodonium salts. Moreover, this principle enables iterative arylation of the bipyrroles, enantioselectively arylating different positions under copper catalysis.
