Results 1 - 20 of 84
1.
Genome Biol ; 25(1): 41, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38303023

ABSTRACT

Protein function annotation has been one of the longstanding issues in biological sciences, and various computational methods have been developed. However, the existing methods suffer from a serious long-tail problem, with a large number of GO families containing few annotated proteins. Herein, an innovative strategy named AnnoPRO was therefore constructed by enabling sequence-based multi-scale protein representation, dual-path protein encoding using pre-training, and function annotation by long short-term memory-based decoding. A variety of case studies based on different benchmarks were conducted, which confirmed the superior performance of AnnoPRO among available methods. Source code and models have been made freely available at: https://github.com/idrblab/AnnoPRO and https://zenodo.org/records/10012272.


Subject(s)
Deep Learning , Humans , Computational Biology/methods , Proteins/metabolism , Software , Molecular Sequence Annotation
2.
Nucleic Acids Res ; 52(D1): D1450-D1464, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37850638

ABSTRACT

Distinct from the traditional diagnostic/prognostic biomarker (adopted as the indicator of disease state/process), the therapeutic biomarker (ThMAR) has become crucial in the clinical development and clinical practice of all therapies. Five types of ThMAR have been found to play indispensable roles in various stages of drug discovery: the Pharmacodynamic Biomarker, essential for guaranteeing the pharmacological effects of a therapy; the Safety Biomarker, critical for assessing the extent or likelihood of therapy-induced toxicity; the Monitoring Biomarker, indispensable for guiding clinical management by serially measuring patients' status; the Predictive Biomarker, crucial for maximizing the clinical outcome of a therapy for specific individuals; and the Surrogate Endpoint, fundamental for accelerating the approval of a therapy. However, ThMAR data have not been comprehensively described by any of the existing databases. Herein, a database named 'TheMarker' was therefore constructed to (a) systematically offer all five types of ThMAR used at different stages of drug development, (b) comprehensively describe ThMAR information for the largest number of drugs among available databases, and (c) extensively cover the widest range of disease classes by not focusing solely on anticancer therapies. The data in TheMarker are expected to have significant implications for drug discovery and clinical practice, and the database is freely accessible without any login requirement at: https://idrblab.org/themarker.


Subject(s)
Biomarkers , Databases, Factual , Humans , Drug Discovery , Therapeutics , Prognosis , Disease
3.
Nucleic Acids Res ; 52(D1): D1465-D1477, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37713619

ABSTRACT

Target discovery is one of the essential steps in modern drug development, and the identification of promising targets is fundamental for developing first-in-class drugs. A variety of methods have emerged for target assessment based on druggability analysis, which refers to the likelihood of a target being effectively modulated by drug-like agents. In the Therapeutic Target Database (TTD), nine categories of established druggability characteristics were thus collected for 426 successful, 1014 clinical trial, 212 preclinical/patented, and 1479 literature-reported targets via systematic review. These characteristic categories were classified into three distinct perspectives: molecular interaction/regulation, human system profile and cell-based expression variation. With the rapid progression of technology and concerted effort in drug discovery, TTD and other databases are expected to facilitate the exploration of druggability characteristics for the discovery and validation of innovative drug targets. TTD is now freely accessible at: https://idrblab.org/ttd/.


Subject(s)
Databases, Pharmaceutical , Humans , Drug Delivery Systems , Drug Discovery , Molecular Targeted Therapy
4.
Nucleic Acids Res ; 51(D1): D1288-D1299, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36243961

ABSTRACT

The efficacy and safety of drugs are widely known to be determined by their interactions with multiple molecules of pharmacological importance, and it is therefore essential to systematically depict the molecular atlas and pharma-information of studied drugs. However, our understanding of such information is neither comprehensive nor precise, which necessitates the construction of a new database providing a network containing a large number of drugs and their interacting molecules. Here, a new database describing the molecular atlas and pharma-information of drugs (DrugMAP) was therefore constructed. It provides a comprehensive list of interacting molecules for >30 000 drugs/drug candidates, gives the differential expression patterns for >5000 interacting molecules among different disease sites, ADME (absorption, distribution, metabolism and excretion)-relevant organs and physiological tissues, and weaves a comprehensive and precise network containing >200 000 interactions among drugs and molecules. With the great efforts made to clarify the complex mechanism underlying drug pharmacokinetics and pharmacodynamics and rapidly emerging interests in artificial intelligence (AI)-based network analyses, DrugMAP is expected to become an indispensable supplement to existing databases to facilitate drug discovery. It is now fully and freely accessible at: https://idrblab.org/drugmap/.


Subject(s)
Artificial Intelligence , Drug Discovery , Databases, Factual , Pharmaceutical Preparations , Atlases as Topic
5.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35758241

ABSTRACT

The discovery of proper molecular signatures from OMIC data is indispensable for determining biological state, physiological condition, disease etiology, and therapeutic response. However, the identified signatures are reported to be highly inconsistent, and there is little overlap among the signatures identified from different biological datasets. Such inconsistency raises doubts about the reliability of reported signatures and significantly hampers their biological and clinical applications. Herein, an online tool, ConSIG, was constructed to realize consistent discovery of gene/protein signatures from any uploaded transcriptomic/proteomic data. This tool is unique in (a) integrating a novel strategy capable of significantly enhancing the consistency of signature discovery, (b) determining the optimal signature by collective assessment, and (c) confirming the biological relevance by enriching the disease/gene ontology. With the growing concerns about signature consistency and biological relevance, this online tool is expected to be used as an essential complement to other existing tools for OMIC-based signature discovery. ConSIG is freely accessible to all users without login requirement at https://idrblab.org/consig/.


Subject(s)
Proteomics , Transcriptome , Gene Ontology , Reproducibility of Results
6.
Chin J Integr Med ; 28(7): 627-635, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35583580

ABSTRACT

OBJECTIVE: To investigate how the National Health Commission of China (NHCC)-recommended Chinese medicines (CMs) modulate the major maladjustments of coronavirus disease 2019 (COVID-19), particularly the clinically observed complications and comorbidities. METHODS: By focusing on the potent targets in common with the conventional medicines, we investigated the mechanisms of 11 NHCC-recommended CMs in the modulation of the major COVID-19 pathophysiology (hyperinflammation, viral replication), complications (pain, headache) and comorbidities (hypertension, obesity, diabetes). The constituent herbs of these CMs and their chemical ingredients were from the Traditional Chinese Medicine Information Database. The experimentally determined targets and the activity values of the chemical ingredients of these CMs were from the Natural Product Activity and Species Source Database. The approved and clinical trial drugs against these targets were searched from the Therapeutic Target Database and DrugBank Database. Pathways of the targets were obtained from the Kyoto Encyclopedia of Genes and Genomes and an additional literature search. RESULTS: Overall, 9 CMs modulated 6 targets discovered by the COVID-19 target discovery studies, and 8 and 11 CMs modulated 8 and 6 targets of the approved or clinical trial drugs for the treatment of the major COVID-19 complications and comorbidities, respectively. CONCLUSION: The coordinated actions of each NHCC-recommended CM against a few targets of the major COVID-19 pathophysiology, complications and comorbidities partly share common mechanisms with the conventional medicines.


Subject(s)
COVID-19 Drug Treatment , COVID-19 , Medicine, Chinese Traditional , COVID-19/complications , COVID-19/epidemiology , COVID-19/physiopathology , Comorbidity , Drugs, Chinese Herbal/therapeutic use , Humans , Medicine , SARS-CoV-2
7.
Nucleic Acids Res ; 50(D1): D1398-D1407, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718717

ABSTRACT

Drug discovery relies on the knowledge of not only drugs and targets, but also the comparative agents and targets. These include poor binders and non-binders for developing discovery tools, prodrugs for improved therapeutics, co-targets of therapeutic targets for multi-target strategies and off-target investigations, and the collective structure-activity and drug-likeness landscapes of enhanced drug features. However, such valuable data are inadequately covered by the available databases. In this study, a major update of the Therapeutic Target Database, previously featured in NAR, was therefore introduced. This update includes (a) 34 861 poor binders and 12 683 non-binders of 1308 targets; (b) 534 prodrug-drug pairs for 121 targets; (c) 1127 co-targets of 672 targets regulated by 642 approved and 624 clinical trial drugs; (d) the collective structure-activity landscapes of 427 262 active agents of 1565 targets; and (e) the profiles of drug-like properties of 33 598 agents of 1102 targets. Moreover, a variety of additional data and functions are provided, which include the cross-links to the target structures in PDB and AlphaFold, 159 and 1658 newly emerged targets and drugs, and the advanced search function for multi-entry target sequences or drug structures. The database is accessible without login requirement at: https://idrblab.org/ttd/.


Subject(s)
Databases, Factual , Drug Discovery/trends , Prodrugs/classification , Humans , Molecular Targeted Therapy , Prodrugs/chemistry , Prodrugs/therapeutic use , Structure-Activity Relationship
8.
Comput Med Imaging Graph ; 88: 101861, 2021 03.
Article in English | MEDLINE | ID: mdl-33497891

ABSTRACT

Colorectal cancer (CRC) is the second leading cause of cancer-related mortality worldwide. Histopathology image analysis (HIA) provides key information for the clinical diagnosis of CRC. Deep learning methods are now widely used to improve cancer classification and the localization of tumor regions in HIA. However, these efforts are both time-consuming and labor-intensive due to the manual annotation of tumor regions in the whole slide images (WSIs). Furthermore, classical deep learning methods that analyze thousands of patches extracted from WSIs may lose the integrated information of the image. Herein, a novel method was developed, which used only global labels to achieve WSI classification and localization of carcinoma by combining features from different magnifications of WSIs. The model was trained and tested using 1346 colorectal cancer WSIs from the Cancer Genome Atlas (TCGA). Our method classified colorectal cancer with an accuracy of 94.6%, which slightly outperformed most of the existing methods. Its cancerous-location probability maps were in good agreement with annotations from three individual expert pathologists. Independent tests on 50 newly collected colorectal cancer WSIs from hospitals produced 92.0% accuracy, and the cancerous-location probability maps were in good agreement with the three pathologists. The results thereby demonstrated that the method sufficiently achieved WSI classification and localization utilizing only global labels. This weakly supervised deep learning method is effective in time and cost, as it delivered a better performance in comparison with the state-of-the-art methods.


Subject(s)
Colorectal Neoplasms , Deep Learning , Humans , Image Processing, Computer-Assisted
10.
Asian J Pharm Sci ; 16(6): 665-667, 2021 Nov.
Article in English | MEDLINE | ID: mdl-35027947

ABSTRACT

Graphical abstract (image not reproduced).

12.
J Cancer ; 11(4): 849-857, 2020.
Article in English | MEDLINE | ID: mdl-31949488

ABSTRACT

Gastric cancer (GC) is the third leading cause of cancer-related death. Although therapeutic approaches have improved, the 5-year survival rate of GC patients after surgical resection remains low due to the high rates of metastasis and recurrence. Patients with schizophrenia have significantly lower incidences of cancer after long-term drug treatment, suggesting that antipsychotic drugs may partially ameliorate the risk of cancer development. The goal of this study was to explore antipsychotic drugs as a potential effective therapy against gastric carcinoma. We found that sertindole, an atypical antipsychotic, exhibited anti-tumor efficacy on human GC cells in vitro and in vivo. Moreover, sertindole in combination with cisplatin dramatically enhanced apoptosis induction in GC cells. In addition, the pro-apoptotic effect of sertindole on GC might, in part, involve inhibition of STAT3 activation and downstream signals, including Mcl1, survivin, c-Myc and cyclin D1. Collectively, these results suggest that sertindole could be a potential anticancer agent and an attractive therapeutic adjuvant for the treatment of human GC.

13.
J Cell Mol Med ; 24(3): 2215-2228, 2020 02.
Article in English | MEDLINE | ID: mdl-31943775

ABSTRACT

Increasing evidence has verified that small nucleolar RNAs (snoRNAs) play significant roles in tumorigenesis and exhibit prognostic value in clinical practice. In this study, we analysed the expression profile and clinical relevance of snoRNAs from the TCGA database, including 530 ccRCC (clear cell renal cell carcinoma) and 72 control cases. Using univariate and multivariate Cox analysis, we established a six-snoRNA signature and divided patients into high-risk and low-risk groups. Kaplan-Meier analysis showed that patients in the high-risk group had significantly shorter overall survival and recurrence-free survival than those in the low-risk group in the test series, validation series and entire series. We also confirmed that this signature had high accuracy and specificity in 64 clinical tissue cases and 50 serum samples. Receiver operating characteristic curve analysis then showed that the six-snoRNA signature was a better indicator than conventional clinical factors (AUC = 0.732). Furthermore, combining the signature with TNM stage or Fuhrman grade gave the optimal indicators (AUC = 0.792; AUC = 0.800) and possessed clinical application value for ccRCC. Finally, we found that SNORA70B and its host gene USP34 might directly regulate the Wnt signalling pathway to promote tumorigenesis in ccRCC. In general, our study established a six-snoRNA signature as an independent and superior diagnostic and prognostic indicator for ccRCC.
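
The high-risk/low-risk grouping described in this abstract can be illustrated with a minimal sketch. The abstract does not state how the score or cut-off was computed, so the sketch assumes a common convention: a linear risk score (the sum of Cox coefficient times expression over the signature) split at the cohort median. The snoRNA names, coefficients, expression values and patient IDs are all hypothetical placeholders, not values from the study.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum of coefficient * expression over the signature."""
    return {
        patient: sum(coefficients[g] * values[g] for g in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical two-feature signature and four patients (illustration only).
coefficients = {"snoX": 0.5, "snoY": -0.2}
expression = {
    "P1": {"snoX": 2.0, "snoY": 1.0},
    "P2": {"snoX": 4.0, "snoY": 1.0},
    "P3": {"snoX": 1.0, "snoY": 3.0},
    "P4": {"snoX": 3.0, "snoY": 0.0},
}

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
print(groups)
```

In practice the coefficients would come from a fitted multivariate Cox model, and the two groups would then be compared with Kaplan-Meier curves.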


Subject(s)
Biomarkers, Tumor/genetics , Carcinoma, Renal Cell/genetics , Kidney Neoplasms/genetics , RNA, Small Nucleolar/genetics , Carcinogenesis/genetics , Carcinogenesis/pathology , Carcinoma, Renal Cell/pathology , Case-Control Studies , Humans , Kaplan-Meier Estimate , Kidney Neoplasms/pathology , Multivariate Analysis , Prognosis , Risk Factors , Signal Transduction/genetics , Ubiquitin-Specific Proteases/genetics
14.
Brief Bioinform ; 21(2): 621-636, 2020 03 23.
Article in English | MEDLINE | ID: mdl-30649171

ABSTRACT

Label-free quantification (LFQ), with a specific and sequentially integrated workflow of acquisition technique, quantification tool and processing method, has emerged as a popular technique in metaproteomic research for providing a comprehensive landscape of the adaptive response of microbes to external stimuli and their interactions with other organisms or host cells. The performance of a specific LFQ workflow is highly dependent on the studied data. Hence, it is essential to discover the most appropriate one for a specific data set. However, it is challenging to perform such discovery due to the large number of possible workflows and the multifaceted nature of the evaluation criteria. Herein, a web server, ANPELA (https://idrblab.org/anpela/), was developed and validated as the first tool enabling performance assessment of the whole LFQ workflow (collective assessment by five well-established criteria with distinct underlying theories), and it enabled the identification of the optimal LFQ workflow(s) by a comprehensive performance ranking. ANPELA not only automatically detects the diverse formats of data generated by all quantification tools but also provides the most complete set of processing methods among the available web servers and stand-alone tools. Systematic validation using metaproteomic benchmarks revealed ANPELA's capabilities in (1) discovering well-performing workflow(s), (2) enabling assessment from multiple perspectives and (3) validating LFQ accuracy using spiked proteins. ANPELA has a unique ability to evaluate the performance of the whole LFQ workflow and enables the discovery of the optimal LFQs by the comprehensive performance ranking of all 560 workflows. Therefore, it has great potential for applications in metaproteomic and other studies requiring LFQ techniques, as many features are shared among proteomic studies.
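
One simple way to turn several per-criterion scores into a single workflow ranking is mean-rank aggregation, sketched below. This is an illustrative assumption, not ANPELA's documented method: the abstract does not say how its five criteria are combined. The workflow names, criterion names and scores are hypothetical, three criteria are used for brevity, and higher scores are assumed to be better.

```python
def rank_positions(scores):
    """Map each workflow to its rank (1 = best) under one criterion; higher score is better."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def overall_ranking(criteria_scores):
    """Order workflows by mean rank across all criteria (lowest mean rank first)."""
    per_criterion = [rank_positions(s) for s in criteria_scores.values()]
    workflows = list(per_criterion[0])
    mean_rank = {
        w: sum(ranks[w] for ranks in per_criterion) / len(per_criterion)
        for w in workflows
    }
    # Break ties on mean rank alphabetically so the output is deterministic.
    return sorted(mean_rank, key=lambda w: (mean_rank[w], w))


# Hypothetical scores for three workflows under three placeholder criteria.
criteria_scores = {
    "criterion_1": {"wf_A": 0.90, "wf_B": 0.70, "wf_C": 0.50},
    "criterion_2": {"wf_A": 0.60, "wf_B": 0.80, "wf_C": 0.40},
    "criterion_3": {"wf_A": 0.95, "wf_B": 0.50, "wf_C": 0.70},
}

ranking = overall_ranking(criteria_scores)
print(ranking)
```

Tied scores within a criterion are not handled specially here; a fuller version would assign averaged ranks to ties.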


Subject(s)
Proteins/chemistry , Proteomics/methods , Workflow , Internet , Reproducibility of Results
15.
Brief Bioinform ; 21(3): 1058-1068, 2020 05 21.
Article in English | MEDLINE | ID: mdl-31157371

ABSTRACT

The etiology of schizophrenia (SCZ) is regarded as one of the most fundamental puzzles in current medical research, and its diagnosis is limited by the lack of objective molecular criteria. Although plenty of studies have been conducted, the SCZ gene signatures identified by these independent studies are highly inconsistent. As one of the most important factors contributing to this inconsistency, the feature selection methods currently in use do not fully consider the reproducibility among the signatures discovered from different datasets. Therefore, it is crucial to develop new bioinformatics tools based on novel strategies to ensure the stable discovery of gene signatures for SCZ. In this study, a novel feature selection strategy (1) integrating repeated random sampling with consensus scoring and (2) evaluating the consistency of gene rank among different datasets was constructed. By systematically assessing the identified SCZ signature comprising 135 differentially expressed genes, this newly constructed strategy demonstrated significantly enhanced stability and better differentiating ability compared with the feature selection methods popular in current SCZ research. Based on a first-ever assessment of methods' reproducibility cross-validated by independent datasets from three representative studies, the new strategy stood out among the popular methods by showing superior stability and differentiating ability. Finally, 2 novel and 17 previously reported transcription factors were identified and showed great potential in revealing the etiology of SCZ. In sum, the SCZ signature identified in this study would provide valuable clues for discovering diagnostic molecules and potential targets for SCZ.
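
The first component of the strategy, repeated random sampling with consensus scoring, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: genes are ranked by the absolute difference in group means (a stand-in for whatever statistic the study used), subsampling is stratified by group, and the consensus score is simply the fraction of rounds in which a gene reaches the top k. Gene names, sample IDs and expression values are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def top_genes(expr, labels, sample_ids, k):
    """Rank genes by absolute difference in group means on the given samples."""
    cases = [s for s in sample_ids if labels[s] == 1]
    controls = [s for s in sample_ids if labels[s] == 0]
    scores = {
        gene: abs(mean(values[s] for s in cases) - mean(values[s] for s in controls))
        for gene, values in expr.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


def consensus_signature(expr, labels, k=2, rounds=200, fraction=0.8, seed=0):
    """Return, per gene, the fraction of random subsamples in which it ranks in the top k."""
    rng = random.Random(seed)
    cases = [s for s, y in labels.items() if y == 1]
    controls = [s for s, y in labels.items() if y == 0]
    counts = Counter()
    for _ in range(rounds):
        # Stratified subsample so both groups are always represented.
        subset = (rng.sample(cases, max(1, int(fraction * len(cases))))
                  + rng.sample(controls, max(1, int(fraction * len(controls)))))
        counts.update(top_genes(expr, labels, subset, k))
    return {gene: counts[gene] / rounds for gene in expr}


def toy_gene(case_base, ctrl_base):
    """Build hypothetical expression values for 5 cases and 5 controls."""
    values = {f"case{i}": case_base + 0.05 * i for i in range(5)}
    values.update({f"ctrl{i}": ctrl_base + 0.05 * i for i in range(5)})
    return values


labels = {f"case{i}": 1 for i in range(5)}
labels.update({f"ctrl{i}": 0 for i in range(5)})
expr = {
    "GENE_A": toy_gene(5.0, 1.0),  # strongly different between groups
    "GENE_B": toy_gene(3.0, 1.0),  # moderately different
    "GENE_C": toy_gene(2.0, 2.0),  # no group difference
    "GENE_D": toy_gene(2.0, 2.0),  # no group difference
}

consensus = consensus_signature(expr, labels)
print(consensus)
```

Genes with a consensus score near 1 are selected stably regardless of which samples are drawn; the second component of the strategy, comparing gene ranks across independent datasets, is not covered by this sketch.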


Subject(s)
Schizophrenia/genetics , Transcriptome , Computational Biology/methods , Datasets as Topic , Gene Expression Regulation , Humans , Reproducibility of Results
16.
Brief Bioinform ; 21(6): 2206-2218, 2020 12 01.
Article in English | MEDLINE | ID: mdl-31799600

ABSTRACT

Protein dynamics is central to all biological processes, including signal transduction, cellular regulation and biological catalysis. Among them, in-depth exploration of ligand-driven protein dynamics contributes to an optimal understanding of protein function, which is particularly relevant to drug discovery. Hence, a wide range of computational tools have been designed to investigate the important dynamic information in proteins. However, performing and analyzing protein dynamics is still challenging due to the complicated operation steps, which pose great difficulty, especially for nonexperts. Moreover, there is a lack of web protocols providing an online facility to investigate and visualize ligand-driven protein dynamics. To this end, in this study, we integrated several bioinformatic tools to develop a protocol, named Ligand and Receptor Molecular Dynamics (LARMD, http://chemyang.ccnu.edu.cn/ccb/server/LARMD/ and http://agroda.gzu.edu.cn:9999/ccb/server/LARMD/), for profiling ligand-driven protein dynamics. To be specific, estrogen receptor (ER) was used as a case to reveal the ERβ-selective mechanism, which plays a vital role in the treatment of inflammatory diseases and many types of cancers in clinical practice. Two different residues (Ile373/Met421 and Met336/Leu384) in the pocket of ERβ/ERα were the significant determinants for selectivity, especially Met336 of ERβ. The helix H8, helix H11 and H7-H8 loop influenced the migration of the selective agonist (WAY-244). These computational results were consistent with the experimental results. Therefore, LARMD provides a user-friendly online protocol to study the dynamic properties of proteins and to design new ligands or site-directed mutagenesis.


Subject(s)
Computational Biology , Estrogen Receptor alpha , Estrogen Receptor beta , Molecular Dynamics Simulation , Computational Biology/methods , Drug Discovery , Estrogen Receptor alpha/chemistry , Estrogen Receptor alpha/metabolism , Estrogen Receptor beta/chemistry , Estrogen Receptor beta/metabolism , Ligands
17.
Nucleic Acids Res ; 48(D1): D1031-D1041, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691823

ABSTRACT

Knowledge of therapeutic targets and early drug candidates is useful for improved drug discovery. In particular, information about target regulators and the patented therapeutic agents facilitates research regarding druggability, systems pharmacology, new trends, molecular landscapes, and the development of drug discovery tools. To complement other databases, we constructed the Therapeutic Target Database (TTD) with expanded information about (i) target-regulating microRNAs and transcription factors, (ii) target-interacting proteins, and (iii) patented agents and their targets (structures and experimental activity values if available), which can be conveniently retrieved and is further enriched with regulatory mechanisms or biochemical classes. We also updated the TTD with the recently released International Classification of Diseases ICD-11 codes and additional sets of successful, clinical trial, and literature-reported targets that emerged since the last update. TTD is accessible at http://bidd.nus.edu.sg/group/ttd/ttd.asp. In case of possible web connectivity issues, two mirror sites of TTD are also constructed (http://db.idrblab.org/ttd/ and http://db.idrblab.net/ttd/).


Subject(s)
Computational Biology/methods , Databases, Factual , Drug Discovery , Molecular Targeted Therapy , Software , Biomarkers , Drug Discovery/methods , Humans , Ligands , User-Computer Interface , Web Browser
18.
J Transl Med ; 17(1): 259, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31395064

ABSTRACT

BACKGROUND: Ovarian cancer is the leading cause of death in gynecological cancer. Cancer stem cells (CSCs) contribute to its occurrence, progression and resistance. Small nucleolar RNAs (snoRNAs), a class of small non-coding RNAs, are involved in cancer cell stemness and tumorigenesis. METHODS: In this study, we screened out snoRNAs related to the prognosis of ovarian cancer patients by analyzing the data of 379 cases of ovarian cancer patients in the TCGA database, and analyzed the difference in snoRNA expression between OVCAR-3 (OV) sphere-forming (OS) cells and OV cells. After overexpression or knockdown of SNORD89, the expression of Nanog, CD44, and CD133 was measured by qRT-PCR or flow cytometry analysis in OV, CAOV-3 (CA) and OS cells, respectively. CCK-8 assays, plate clone formation assays and soft agar colony formation assays were carried out to evaluate the changes in cell proliferation and self-renewal ability. Scratch migration assays and trans-well invasion analysis were used to assess the changes in migration and invasion ability. RESULTS: High expression of SNORD89 indicated a poor prognosis for ovarian cancer patients and was associated with patients' age and therapy outcome. SNORD89 was highly expressed in ovarian cancer stem cells. The overexpression of SNORD89 resulted in increased stemness markers, S phase cell cycle, cell proliferation, invasion and migration ability in OV and CA cells. Conversely, these phenomena were reversed after SNORD89 silencing in OS cells. Further, we found that SNORD89 could upregulate c-Myc and Notch1 expression at the mRNA and protein levels. SNORD89 deteriorates the prognosis of ovarian cancer patients by regulating the Notch1-c-Myc pathway to promote cell stemness and acts as an oncogene in ovarian tumorigenesis. Consequently, SNORD89 can be a novel prognostic biomarker and therapeutic target for ovarian cancer.


Subject(s)
Neoplastic Stem Cells/metabolism , Neoplastic Stem Cells/pathology , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , RNA, Small Nucleolar/metabolism , Receptor, Notch1/metabolism , Signal Transduction , Carcinogenesis/genetics , Carcinogenesis/pathology , Cell Line, Tumor , Cell Movement/genetics , Cell Proliferation/genetics , Cell Self Renewal/genetics , Female , Gene Expression Regulation, Neoplastic , Humans , Neoplasm Invasiveness , Phenotype , Prognosis , Proto-Oncogene Proteins c-myc/metabolism , RNA, Small Nucleolar/genetics
19.
CNS Neurosci Ther ; 25(9): 1054-1063, 2019 09.
Article in English | MEDLINE | ID: mdl-31350824

ABSTRACT

AIMS: As one of the most fundamental questions in modern science, "what causes schizophrenia (SZ)" remains a profound mystery due to the absence of objective gene markers. The reproducibility of the gene signatures identified by independent studies is found to be extremely low due to the limitations of available feature selection methods and the lack of measures for validating the robustness of signatures. These irreproducible results have significantly limited our understanding of the etiology of SZ. METHODS: In this study, a new feature selection strategy was developed, and a comprehensive analysis was then conducted to ensure a reliable signature discovery. Particularly, the new strategy (a) combined multiple randomized sampling with consensus scoring and (b) assessed gene ranking consistency among different datasets, and a comprehensive analysis among nine independent studies was conducted. RESULTS: Based on a first-ever evaluation of methods' reproducibility that was cross-validated by nine independent studies, the newly developed strategy was found to be superior to the traditional ones. As a result, 33 genes were consistently identified from multiple datasets by the new strategy as differentially expressed, which might facilitate our understanding of the mechanism underlying the etiology of SZ. CONCLUSION: A new strategy capable of enhancing the reproducibility of feature selection in current SZ research was successfully constructed and validated. A group of candidate genes identified in this study show great potential for revealing the etiology of SZ.


Subject(s)
Artificial Intelligence/standards , Databases, Genetic/standards , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Schizophrenia/genetics , Humans , Random Allocation , Reproducibility of Results , Schizophrenia/diagnosis
20.
Mol Cell Proteomics ; 18(8): 1683-1699, 2019 08.
Article in English | MEDLINE | ID: mdl-31097671

ABSTRACT

Label-free proteome quantification (LFQ) is a multistep workflow, collectively defined by quantification tools and subsequent data manipulation methods, that has been extensively applied in current biomedical, agricultural, and environmental studies. Despite recent advances, in-depth and high-quality quantification remains extremely challenging and requires the optimization of LFQs by comparatively evaluating their performance. However, the evaluation results using different criteria (precision, accuracy, and robustness) vary greatly, and the huge number of potential LFQs becomes one of the bottlenecks in comprehensively optimizing proteome quantification. In this study, a novel strategy, enabling the discovery of the LFQs of simultaneously enhanced performance from thousands of workflows (integrating 18 quantification tools with 3,128 manipulation chains), was therefore proposed. First, the feasibility of achieving simultaneous improvement in the precision, accuracy, and robustness of LFQ was systematically assessed by collectively optimizing its multistep manipulation chains. Second, based on a variety of benchmark datasets acquired by various quantification measurements of different modes of acquisition, this novel strategy successfully identified a number of manipulation chains that simultaneously improved the performance across multiple criteria. Finally, to further enhance proteome quantification and discover the LFQs of optimal performance, an online tool (https://idrblab.org/anpela/) enabling collective performance assessment (from multiple perspectives) of the entire LFQ workflow was developed. This study confirmed the feasibility of achieving simultaneous improvement in precision, accuracy, and robustness. The novel strategy proposed and validated in this study, together with the online tool, might provide useful guidance for research fields requiring the mass-spectrometry-based LFQ technique.
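
The idea of keeping only manipulation chains that improve precision, accuracy and robustness at the same time can be sketched as a simple filter against a baseline. This is an illustrative assumption rather than the study's procedure: the abstract does not define the metrics or thresholds, so the sketch treats each criterion as a single score where higher is better. The chain names and numbers are hypothetical.

```python
from typing import Dict, List

CRITERIA = ("precision", "accuracy", "robustness")


def simultaneously_improved(
    baseline: Dict[str, float],
    candidates: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return chains that score strictly higher than the baseline on every criterion."""
    return sorted(
        name
        for name, scores in candidates.items()
        if all(scores[c] > baseline[c] for c in CRITERIA)
    )


# Hypothetical baseline and candidate chains (higher is better on every criterion).
baseline = {"precision": 0.70, "accuracy": 0.65, "robustness": 0.60}
candidates = {
    "chain_A": {"precision": 0.80, "accuracy": 0.60, "robustness": 0.75},  # worse accuracy
    "chain_B": {"precision": 0.75, "accuracy": 0.70, "robustness": 0.65},  # better on all three
    "chain_C": {"precision": 0.70, "accuracy": 0.90, "robustness": 0.90},  # precision only ties
}

improved = simultaneously_improved(baseline, candidates)
print(improved)
```

A chain that excels on one criterion but regresses on another (chain_A here) is excluded, which is the point of requiring simultaneous improvement.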


Subject(s)
Proteomics/methods , Proteome , Software , Workflow