Results 1 - 20 of 27
1.
Nucleic Acids Res ; 2024 May 29.
Article in English | MEDLINE | ID: mdl-38808672

ABSTRACT

Enrichment analysis, crucial for interpreting genomic, transcriptomic, and proteomic data, is expanding into metabolomics. Furthermore, there is a rising demand for integrated enrichment analysis that combines data from different studies and omics platforms, as seen in meta-analysis and multi-omics research. To address these growing needs, we have updated WebGestalt to include enrichment analysis capabilities for both metabolites and multiple input lists of analytes. We have also significantly increased analysis speed, revamped the user interface, and introduced new pathway visualizations to accommodate these updates. Notably, the adoption of a Rust backend reduced gene set enrichment analysis time by 95% from 270.64 to 12.41 s and network topology-based analysis by 89% from 159.59 to 17.31 s in our evaluation. This performance improvement is also accessible in both the R package and a newly introduced Python package. Additionally, we have updated the data in the WebGestalt database to reflect the current status of each source and have expanded our collection of pathways, networks, and gene signatures. The 2024 WebGestalt update represents a significant leap forward, offering new support for metabolomics, streamlined multi-omics analysis capabilities, and remarkable performance enhancements. Discover these updates and more at https://www.webgestalt.org.
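
The runtime figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the only inputs are the four timings reported above (in seconds).

```python
def percent_reduction(before_s: float, after_s: float) -> float:
    """Relative reduction in runtime, as a percentage of the original time."""
    return (before_s - after_s) / before_s * 100.0


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the new runtime is than the old one."""
    return before_s / after_s


# Timings in seconds, copied from the abstract:
# GSEA = gene set enrichment analysis, NTA = network topology-based analysis.
timings = {
    "GSEA": (270.64, 12.41),
    "NTA": (159.59, 17.31),
}

for name, (before_s, after_s) in timings.items():
    print(
        f"{name}: {percent_reduction(before_s, after_s):.0f}% less time, "
        f"{speedup(before_s, after_s):.1f}x faster"
    )
# prints:
# GSEA: 95% less time, 21.8x faster
# NTA: 89% less time, 9.2x faster
```

The speedup factors are derived here from the reported timings; the abstract itself states only the percentage reductions.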

2.
bioRxiv ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38562798

ABSTRACT

Mass spectrometry-based phosphoproteomics offers a comprehensive view of protein phosphorylation, but limited knowledge about the regulation and function of most phosphosites restricts our ability to extract meaningful biological insights from phosphoproteomics data. To address this, we combine machine learning and phosphoproteomic data from 1,195 tumor specimens spanning 11 cancer types to construct CoPheeMap, a network mapping the co-regulation of 26,280 phosphosites. Integrating network features from CoPheeMap into a machine learning model, CoPheeKSA, we achieve superior performance in predicting kinase-substrate associations. CoPheeKSA reveals 24,015 associations between 9,399 phosphosites and 104 serine/threonine kinases, including many unannotated phosphosites and under-studied kinases. We validate the accuracy of these predictions using experimentally determined kinase-substrate specificities. By applying CoPheeMap and CoPheeKSA to phosphosites with high computationally predicted functional significance and cancer-associated phosphosites, we demonstrate the effectiveness of these tools in systematically illuminating phosphosites of interest, revealing dysregulated signaling processes in human cancer, and identifying under-studied kinases as putative therapeutic targets.

3.
Mol Cell Proteomics ; 23(1): 100682, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37993103

ABSTRACT

Global phosphoproteomics experiments quantify tens of thousands of phosphorylation sites. However, data interpretation is hampered by our limited knowledge on functions, biological contexts, or precipitating enzymes of the phosphosites. This study establishes a repository of phosphosites with associated evidence in biomedical abstracts, using deep learning-based natural language processing techniques. Our model for illuminating the dark phosphoproteome through PubMed mining (IDPpub) was generated by fine-tuning BioBERT, a deep learning tool for biomedical text mining. Trained using sentences containing protein substrates and phosphorylation site positions from 3000 abstracts, the IDPpub model was then used to extract phosphorylation sites from all MEDLINE abstracts. The extracted proteins were normalized to gene symbols using the National Center for Biotechnology Information gene query, and sites were mapped to human UniProt sequences using ProtMapper and mouse UniProt sequences by direct match. Precision and recall were calculated using 150 curated abstracts, and utility was assessed by analyzing the CPTAC (Clinical Proteomics Tumor Analysis Consortium) pan-cancer phosphoproteomics datasets and the PhosphoSitePlus database. Using 10-fold cross validation, pairs of correct substrates and phosphosite positions were extracted with an average precision of 0.93 and recall of 0.94. After entity normalization and site mapping to human reference sequences, an independent validation achieved a precision of 0.91 and recall of 0.77. The IDPpub repository contains 18,458 unique human phosphorylation sites with evidence sentences from 58,227 abstracts and 5918 mouse sites in 14,610 abstracts. This included evidence sentences for 1803 sites identified in CPTAC studies that are not covered by manually curated functional information in PhosphoSitePlus. Evaluation results demonstrate the potential of IDPpub as an effective biomedical text mining tool for collecting phosphosites. Moreover, the repository (http://idppub.ptmax.org), which can be automatically updated, can serve as a powerful complement to existing resources.


Subjects
Data Mining , Natural Language Processing , Humans , Data Mining/methods , Databases, Factual , PubMed
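
The IDPpub abstract reports precision and recall separately. A common way to compare the two evaluation settings is the F1 score, the harmonic mean of precision and recall. The sketch below is an illustration, not part of the record: the F1 values are derived from the reported figures rather than stated in the abstract, and the function name is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
reported = {
    "10-fold cross validation": (0.93, 0.94),
    "independent validation after normalization and mapping": (0.91, 0.77),
}

for setting, (precision, recall) in reported.items():
    print(f"{setting}: F1 = {f1_score(precision, recall):.3f}")
# prints:
# 10-fold cross validation: F1 = 0.935
# independent validation after normalization and mapping: F1 = 0.834
```

The lower F1 in the second setting comes almost entirely from the drop in recall (0.94 to 0.77), which is consistent with some extracted sites being lost during entity normalization and sequence mapping.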
4.
Cell Syst ; 14(9): 777-787.e5, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37619559

ABSTRACT

By combining mass-spectrometry-based proteomics and phosphoproteomics with genomics, epi-genomics, and transcriptomics, proteogenomics provides comprehensive molecular characterization of cancer. Using this approach, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) has characterized over 1,000 primary tumors spanning 10 cancer types, many with matched normal tissues. Here, we present LinkedOmicsKB, a proteogenomics data-driven knowledge base that makes consistently processed and systematically precomputed CPTAC pan-cancer proteogenomics data available to the public through ∼40,000 gene-, protein-, mutation-, and phenotype-centric web pages. Visualization techniques facilitate efficient exploration and reasoning of complex, interconnected data. Using three case studies, we illustrate the practical utility of LinkedOmicsKB in providing new insights into genes, phosphorylation sites, somatic mutations, and cancer phenotypes. With precomputed results of 19,701 coding genes, 125,969 phosphosites, and 256 genotypes and phenotypes, LinkedOmicsKB provides a comprehensive resource to accelerate proteogenomics data-driven discoveries to improve our understanding and treatment of human cancer. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Neoplasms , Proteogenomics , Humans , Proteomics , Proteogenomics/methods , Genomics , Neoplasms/genetics , Knowledge Bases
5.
Cancer Cell ; 41(8): 1397-1406, 2023 Aug 14.
Article in English | MEDLINE | ID: mdl-37582339

ABSTRACT

The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) investigates tumors from a proteogenomic perspective, creating rich multi-omics datasets connecting genomic aberrations to cancer phenotypes. To facilitate pan-cancer investigations, we have generated harmonized genomic, transcriptomic, proteomic, and clinical data for >1000 tumors in 10 cohorts to create a cohesive and powerful dataset for scientific discovery. We outline efforts by the CPTAC pan-cancer working group in data harmonization, data dissemination, and computational resources for aiding biological discoveries. We also discuss challenges for multi-omics data integration and analysis, specifically the unique challenges of working with both nucleotide sequencing and mass spectrometry proteomics data.


Subjects
Neoplasms , Proteogenomics , Humans , Proteomics , Genomics , Neoplasms/genetics , Gene Expression Profiling
6.
Cancer Cell ; 41(9): 1586-1605.e15, 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37567170

ABSTRACT

We characterized a prospective endometrial carcinoma (EC) cohort containing 138 tumors and 20 enriched normal tissues using 10 different omics platforms. Targeted quantitation of two peptides can predict antigen processing and presentation machinery activity, and may inform patient selection for immunotherapy. Association analysis between MYC activity and metformin treatment in both patients and cell lines suggests a potential role for metformin treatment in non-diabetic patients with elevated MYC activity. PIK3R1 in-frame indels are associated with elevated AKT phosphorylation and increased sensitivity to AKT inhibitors. CTNNB1 hotspot mutations are concentrated near phosphorylation sites mediating pS45-induced degradation of β-catenin, which may render Wnt-FZD antagonists ineffective. Deep learning accurately predicts EC subtypes and mutations from histopathology images, which may be useful for rapid diagnosis. Overall, this study identified molecular and imaging markers that can be further investigated to guide patient stratification for more precise treatment of EC.


Subjects
Endometrial Neoplasms , Metformin , Proteogenomics , Female , Humans , Proto-Oncogene Proteins c-akt/genetics , Prospective Studies , Endometrial Neoplasms/drug therapy , Endometrial Neoplasms/genetics , Endometrial Neoplasms/metabolism , beta Catenin/genetics , beta Catenin/metabolism , Metformin/pharmacology
7.
Tomography ; 9(2): 810-828, 2023 Apr 10.
Article in English | MEDLINE | ID: mdl-37104137

ABSTRACT

Co-clinical trials are the concurrent or sequential evaluation of therapeutics in both patients clinically and patient-derived xenografts (PDX) pre-clinically, in a manner designed to match the pharmacokinetics and pharmacodynamics of the agent(s) used. The primary goal is to determine the degree to which PDX cohort responses recapitulate patient cohort responses at the phenotypic and molecular levels, such that pre-clinical and clinical trials can inform one another. A major issue is how to manage, integrate, and analyze the abundance of data generated across both spatial and temporal scales, as well as across species. To address this issue, we are developing MIRACCL (molecular and imaging response analysis of co-clinical trials), a web-based analytical tool. For prototyping, we simulated data for a co-clinical trial in "triple-negative" breast cancer (TNBC) by pairing pre- (T0) and on-treatment (T1) magnetic resonance imaging (MRI) from the I-SPY2 trial, as well as PDX-based T0 and T1 MRI. Baseline (T0) and on-treatment (T1) RNA expression data were also simulated for TNBC and PDX. Image features derived from both datasets were cross-referenced to omic data to evaluate MIRACCL functionality for correlating and displaying MRI-based changes in tumor size, vascularity, and cellularity with changes in mRNA expression as a function of treatment.


Subjects
Triple Negative Breast Neoplasms , Humans , Triple Negative Breast Neoplasms/pathology , Magnetic Resonance Imaging , Image Processing, Computer-Assisted
8.
iScience ; 24(10): 103107, 2021 Oct 22.
Article in English | MEDLINE | ID: mdl-34622160

ABSTRACT

Comprehensive characterization of tumor antigens is essential for the design of cancer immunotherapies, and mass spectrometry (MS)-based immunopeptidomics enables high-throughput identification of major histocompatibility complex (MHC)-bound peptide antigens in vivo. Here we construct an immunopeptidome atlas of human cancer through an extensive collection of 43 published immunopeptidomic datasets and standardized analysis of 81.6 million MS/MS spectra using an open search engine. Our analysis greatly expands the current knowledge of MHC-bound antigens, including an unprecedented characterization of post-translationally modified antigens and their cancer-association. We also perform systematic analysis of cancer-testis antigens, cancer-associated antigens, and neoantigens. We make all these data together with annotated MS/MS spectra supporting identification of each antigen in an easily browsable web portal named cancer antigen atlas (caAtlas). caAtlas provides a central resource for the selection and prioritization of MHC-bound peptides for in vitro HLA binding assay and immunogenicity testing, which will pave the way to eventual development of cancer immunotherapies.

9.
Cell ; 184(19): 5031-5052.e26, 2021 Sep 16.
Article in English | MEDLINE | ID: mdl-34534465

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive cancer with poor patient survival. Toward understanding the underlying molecular alterations that drive PDAC oncogenesis, we conducted comprehensive proteogenomic analysis of 140 pancreatic cancers, 67 normal adjacent tissues, and 9 normal pancreatic ductal tissues. Proteomic, phosphoproteomic, and glycoproteomic analyses were used to characterize proteins and their modifications. In addition, whole-genome sequencing, whole-exome sequencing, methylation, RNA sequencing (RNA-seq), and microRNA sequencing (miRNA-seq) were performed on the same tissues to facilitate an integrated proteogenomic analysis and determine the impact of genomic alterations on protein expression, signaling pathways, and post-translational modifications. To ensure robust downstream analyses, tumor neoplastic cellularity was assessed via multiple orthogonal strategies using molecular features and verified via pathological estimation of tumor cellularity based on histological review. This integrated proteogenomic characterization of PDAC will serve as a valuable resource for the community, paving the way for early detection and identification of novel therapeutic targets.


Subjects
Adenocarcinoma/genetics , Carcinoma, Pancreatic Ductal/genetics , Pancreatic Neoplasms/genetics , Proteogenomics , Adenocarcinoma/diagnosis , Adult , Aged , Aged, 80 and over , Algorithms , Carcinoma, Pancreatic Ductal/diagnosis , Cohort Studies , Endothelial Cells/metabolism , Epigenesis, Genetic , Female , Gene Dosage , Genome, Human , Glycolysis , Glycoproteins/biosynthesis , Humans , Male , Middle Aged , Molecular Targeted Therapy , Pancreatic Neoplasms/diagnosis , Phenotype , Phosphoproteins/metabolism , Phosphorylation , Prognosis , Protein Kinases/metabolism , Proteome/metabolism , Substrate Specificity , Transcriptome/genetics
10.
Cancer Cell ; 39(3): 361-379.e16, 2021 Mar 8.
Article in English | MEDLINE | ID: mdl-33417831

ABSTRACT

We present a proteogenomic study of 108 human papilloma virus (HPV)-negative head and neck squamous cell carcinomas (HNSCCs). Proteomic analysis systematically catalogs HNSCC-associated proteins and phosphosites, prioritizes copy number drivers, and highlights an oncogenic role for RNA processing genes. Proteomic investigation of mutual exclusivity between FAT1 truncating mutations and 11q13.3 amplifications reveals dysregulated actin dynamics as a common functional consequence. Phosphoproteomics characterizes two modes of EGFR activation, suggesting a new strategy to stratify HNSCCs based on EGFR ligand abundance for effective treatment with inhibitory EGFR monoclonal antibodies. Widespread deletion of immune modulatory genes accounts for low immune infiltration in immune-cold tumors, whereas concordant upregulation of multiple immune checkpoint proteins may underlie resistance to anti-programmed cell death protein 1 monotherapy in immune-hot tumors. Multi-omic analysis identifies three molecular subtypes with high potential for treatment with CDK inhibitors, anti-EGFR antibody therapy, and immunotherapy, respectively. Altogether, proteogenomics provides a systematic framework to inform HNSCC biology and treatment.


Subjects
Antineoplastic Agents, Immunological/therapeutic use , Papillomavirus Infections/genetics , Squamous Cell Carcinoma of Head and Neck/drug therapy , Squamous Cell Carcinoma of Head and Neck/genetics , Adult , Aged , Aged, 80 and over , ErbB Receptors/genetics , Female , Humans , Immunotherapy/methods , Male , Middle Aged , Papillomavirus Infections/drug therapy , Papillomavirus Infections/virology , Proteogenomics/methods , Proteomics/methods , Young Adult
11.
Cell ; 183(5): 1436-1456.e31, 2020 Nov 25.
Article in English | MEDLINE | ID: mdl-33212010

ABSTRACT

The integration of mass spectrometry-based proteomics with next-generation DNA and RNA sequencing profiles tumors more comprehensively. Here this "proteogenomics" approach was applied to 122 treatment-naive primary breast cancers accrued to preserve post-translational modifications, including protein phosphorylation and acetylation. Proteogenomics challenged standard breast cancer diagnoses, provided detailed analysis of the ERBB2 amplicon, defined tumor subsets that could benefit from immune checkpoint therapy, and allowed more accurate assessment of Rb status for prediction of CDK4/6 inhibitor responsiveness. Phosphoproteomics profiles uncovered novel associations between tumor suppressor loss and targetable kinases. Acetylproteome analysis highlighted acetylation on key nuclear proteins involved in the DNA damage response and revealed cross-talk between cytoplasmic and mitochondrial acetylation and metabolism. Our results underscore the potential of proteogenomics for clinical investigation of breast cancer through more accurate annotation of targetable pathways and biological features of this remarkably heterogeneous malignancy.


Subjects
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Carcinogenesis/genetics , Carcinogenesis/pathology , Molecular Targeted Therapy , Proteogenomics , APOBEC Deaminases/metabolism , Adult , Aged , Aged, 80 and over , Breast Neoplasms/immunology , Breast Neoplasms/therapy , Cohort Studies , DNA Damage , DNA Repair , Female , Humans , Immunotherapy , Metabolomics , Middle Aged , Mutagenesis/genetics , Phosphorylation , Protein Kinase Inhibitors/pharmacology , Protein Kinases/metabolism , Receptor, ErbB-2/metabolism , Retinoblastoma Protein/metabolism , Tumor Microenvironment/immunology
12.
Proteomics ; 20(21-22): e1900335, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32939979

ABSTRACT

Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.


Subjects
Deep Learning , Proteomics , Algorithms , Protein Processing, Post-Translational , Tandem Mass Spectrometry
14.
Cell ; 180(4): 729-748.e26, 2020 Feb 20.
Article in English | MEDLINE | ID: mdl-32059776

ABSTRACT

We undertook a comprehensive proteogenomic characterization of 95 prospectively collected endometrial carcinomas, comprising 83 endometrioid and 12 serous tumors. This analysis revealed possible new consequences of perturbations to the p53 and Wnt/β-catenin pathways, identified a potential role for circRNAs in the epithelial-mesenchymal transition, and provided new information about proteomic markers of clinical and genomic tumor subgroups, including relationships to known druggable pathways. An extensive genome-wide acetylation survey yielded insights into regulatory mechanisms linking Wnt signaling and histone acetylation. We also characterized aspects of the tumor immune landscape, including immunogenic alterations, neoantigens, common cancer/testis antigens, and the immune microenvironment, all of which can inform immunotherapy decisions. Collectively, our multi-omic analyses provide a valuable resource for researchers and clinicians, identify new molecular associations of potential mechanistic significance in the development of endometrial cancers, and suggest novel approaches for identifying potential therapeutic targets.


Subjects
Carcinoma/genetics , Endometrial Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Proteome/genetics , Transcriptome , Acetylation , Animals , Antigens, Neoplasm/genetics , Carcinoma/immunology , Carcinoma/pathology , Endometrial Neoplasms/immunology , Endometrial Neoplasms/pathology , Epithelial-Mesenchymal Transition/genetics , Feedback, Physiological , Female , Genomic Instability , Humans , Mice , MicroRNAs/genetics , MicroRNAs/metabolism , Microsatellite Repeats , Phosphorylation , Protein Processing, Post-Translational , Proteome/metabolism , Signal Transduction
15.
Nucleic Acids Res ; 47(W1): W199-W205, 2019 Jul 2.
Article in English | MEDLINE | ID: mdl-31114916

ABSTRACT

WebGestalt is a popular tool for the interpretation of gene lists derived from large scale -omics studies. In the 2019 update, WebGestalt supports 12 organisms, 342 gene identifiers and 155 175 functional categories, as well as user-uploaded functional databases. To address the growing and unique need for phosphoproteomics data interpretation, we have implemented phosphosite set analysis to identify important kinases from phosphoproteomics data. We have completely redesigned result visualizations and user interfaces to improve user-friendliness and to provide multiple types of interactive and publication-ready figures. To facilitate comprehension of the enrichment results, we have implemented two methods to reduce redundancy between enriched gene sets. We introduced a web API for other applications to get data programmatically from the WebGestalt server or pass data to WebGestalt for analysis. We also wrapped the core computation into an R package called WebGestaltR for users to perform analysis locally or in third party workflows. WebGestalt can be freely accessed at http://www.webgestalt.org.


Subjects
Databases, Genetic , Software , Datasets as Topic , User-Computer Interface , Web Browser
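
The WebGestalt abstract refers to enriched gene sets without spelling out the statistics. One widely used approach to over-representation analysis is a one-sided hypergeometric test, sketched below with the standard library only. This is a generic illustration under stated assumptions, not WebGestalt's code: the function names, gene symbols, and universe are toy values chosen for the example.

```python
from math import comb


def ora_p_value(universe_size: int, set_size: int, list_size: int, overlap: int) -> float:
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe in which set_size genes belong to the gene set."""
    total = comb(universe_size, list_size)
    upper = min(set_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def over_representation(input_genes, gene_set, universe):
    """Return (overlap size, p-value), restricting both lists to the universe."""
    universe = set(universe)
    input_genes = set(input_genes) & universe
    gene_set = set(gene_set) & universe
    overlap = len(input_genes & gene_set)
    p_value = ora_p_value(len(universe), len(gene_set), len(input_genes), overlap)
    return overlap, p_value


# Toy data (hypothetical): a 20-gene universe, a 5-gene set, a 4-gene input list.
universe = {f"G{i}" for i in range(20)}
gene_set = {"G0", "G1", "G2", "G3", "G4"}
input_genes = {"G0", "G1", "G10", "G11"}

overlap, p_value = over_representation(input_genes, gene_set, universe)
print(overlap, round(p_value, 4))
```

In practice the p-values from many gene sets would also need multiple-testing correction, which this sketch leaves out.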
16.
Mol Cell Proteomics ; 18(8 suppl 1): S141-S152, 2019 Aug 9.
Article in English | MEDLINE | ID: mdl-31142576

ABSTRACT

Gene set analysis plays a critical role in the functional interpretation of omics data. Although this is typically done for one omics experiment at a time, there is an increasing need to combine gene set analysis results from multiple experiments performed on the same or different omics platforms, such as in multi-omics studies. Integrating results from multiple experiments is challenging, and annotation redundancy between gene sets further obscures clear conclusions. We propose to use a weighted set cover algorithm to reduce redundancy of gene sets identified in a single experiment. Next, we use affinity propagation to consolidate similar gene sets identified from multiple experiments into clusters and to automatically determine the most representative gene set for each cluster. Using three examples from over representation analysis and gene set enrichment analysis, we showed that weighted set cover outperformed a previously published set cover method and reduced the number of gene sets by 52-77%. Focusing on overlapping genes between the list of input genes and the enriched gene sets in over-representation analysis and leading-edge genes in gene set enrichment analysis further reduced the number of gene sets. A use case combining enrichment analysis results from RNA-Seq and proteomics data comparing basal and luminal A breast cancer samples highlighted the known difference in proliferation and DNA damage response. Finally, we used these algorithms for a pan-cancer survival analysis. Our analysis clearly revealed prognosis-related pathways common to multiple cancer types or specific to individual cancer types, as well as pathways associated with prognosis in different directions in different cancer types. We implemented these two algorithms in an R package, Sumer, which generates tables and static and interactive plots for exploration and publication. Sumer is publicly available at https://github.com/bzhanglab/sumer.


Subjects
Algorithms , Genomics/methods , Breast Neoplasms/genetics , Colorectal Neoplasms/genetics , Female , Gene Expression Regulation, Neoplastic , Humans , Neoplasm Proteins/genetics , RNA-Seq
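
The Sumer abstract names a weighted set cover algorithm for reducing redundancy among enriched gene sets but gives no implementation details. The sketch below shows the textbook greedy heuristic for weighted set cover, which is an assumption here and not necessarily the procedure Sumer uses. The gene-set names, genes, and weights are hypothetical; weights are treated as costs, so lower is better.

```python
def weighted_set_cover(gene_sets, weights, max_sets=None):
    """Greedy weighted set cover.

    Repeatedly picks the set with the lowest cost per newly covered gene
    until every gene is covered or max_sets sets have been chosen.
    """
    uncovered = set().union(*gene_sets.values())
    chosen = []
    while uncovered and (max_sets is None or len(chosen) < max_sets):
        best_name, best_ratio = None, float("inf")
        for name, genes in gene_sets.items():
            if name in chosen:
                continue
            gain = len(genes & uncovered)
            if gain == 0:
                continue  # this set adds nothing new
            ratio = weights[name] / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            break  # no remaining set covers anything new
        chosen.append(best_name)
        uncovered -= gene_sets[best_name]
    return chosen


# Toy example: "mitosis" is contained in "cell_cycle", and "broad" covers
# everything but carries a high cost.
gene_sets = {
    "cell_cycle": {"CDK1", "CCNB1", "PLK1", "AURKA"},
    "mitosis": {"PLK1", "AURKA"},
    "dna_repair": {"BRCA1", "RAD51"},
    "broad": {"CDK1", "CCNB1", "PLK1", "AURKA", "BRCA1", "RAD51"},
}
weights = {"cell_cycle": 1.0, "mitosis": 1.0, "dna_repair": 1.0, "broad": 5.0}

selected = weighted_set_cover(gene_sets, weights)
print(selected)
# prints: ['cell_cycle', 'dna_repair']
```

Here the redundant "mitosis" set and the costly "broad" set are both dropped, which mirrors the kind of reduction the abstract describes. How the weights are derived from enrichment results is left open in this sketch.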
17.
Curr Protoc Bioinformatics ; 61(1): e45, 2018 Mar.
Article in English | MEDLINE | ID: mdl-30040199

ABSTRACT

ECOD is a database of evolutionary domains from structures deposited in the PDB. Domains in ECOD are classified by a mixed manual/automatic method wherein the bulk of newly deposited structures are classified automatically by protein-protein BLAST. Those structures that cannot be classified automatically are referred to manual curators who use a combination of alignment results, functional analysis, and close reading of the literature to generate novel assignments. ECOD differs from other structural domain resources in that it is continually updated, classifying thousands of proteins per week. ECOD recognizes homology as its key organizing concept, rather than structural or sequence similarity alone. Such a classification scheme provides functional information about proteins of interest by placing them in the correct evolutionary context among all proteins of known structure. This unit demonstrates how to access ECOD via the Web and how to search the database by sequence or structure. It also details the distributable data files available for large-scale bioinformatics users. © 2018 by John Wiley & Sons, Inc.


Subjects
Computational Biology/methods , Databases, Protein , Protein Domains , Proteins/chemistry , Sequence Homology, Amino Acid , Structural Homology, Protein , Amino Acid Sequence , Sequence Alignment
18.
Bioinformatics ; 34(17): 2997-3003, 2018 Sep 1.
Article in English | MEDLINE | ID: mdl-29659718

ABSTRACT

Motivation: The ECOD database classifies protein domains based on their evolutionary relationships, considering both remote and close homology. The family group in ECOD provides classification of domains that are closely related to each other based on sequence similarity. Due to different perspectives on domain definition, direct application of existing sequence domain databases, such as Pfam, to ECOD struggles with several shortcomings. Results: We created multiple sequence alignments and profiles from ECOD domains with the help of structural information in alignment building and boundary delineation. We validated the alignment quality by scoring structure superposition to demonstrate that they are comparable to curated seed alignments in Pfam. Comparison to Pfam and CDD reveals that 27 and 16% of ECOD families are new, but they are also dominated by small families, likely because of the sampling bias from the PDB database. There are 35 and 48% of families whose boundaries are modified comparing to counterparts in Pfam and CDD, respectively. Availability and implementation: The new families are now integrated in the ECOD website. The aggregate HMMER profile library and alignment are available for download on ECOD website (http://prodata.swmed.edu/ecod). Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Databases, Protein , Proteins/chemistry , Protein Domains , Sequence Alignment , Software
19.
J Colloid Interface Sci ; 506: 365-372, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-28750238

ABSTRACT

Exploiting novel metal-organic frameworks (MOFs) as electrode materials with superior rate capabilities and understanding their electrochemical behaviour in detail are crucial for boosting the application of MOFs in the field of energy storage. Herein, we prepared Co₂(DOBDC) (DOBDC = 2,5-dioxido-1,4-benzenedicarboxylate) via a hydrothermal method and explored its electrochemical performance as an anode material for lithium-ion batteries. The as-prepared Co₂(DOBDC) MOF exhibits a reversible capacity of 526.1 mAh g⁻¹ after 200 charge/discharge cycles at a current density of 500 mA g⁻¹ and also demonstrates an impressive rate capability, with a high capacity of 408.2 mAh g⁻¹ at a high current density of 2 A g⁻¹. Furthermore, synchrotron-based soft X-ray absorption spectroscopy (sXAS) and electron paramagnetic resonance (EPR) spectroscopy have been applied to investigate the spin state of cobalt in the electrodes at different states of charge. Our results suggest that localized electrons in high-spin (S = 3/2) Co²⁺ in pristine Co₂(DOBDC) are gradually delocalized after discharging. It was also found that the high rate capability of Co₂(DOBDC) is mainly ascribed to an ultrafast ion intercalation pseudocapacitance process, which results from its unique microporous architecture and adequate specific surface that offers sufficient electrode/electrolyte contact and benefits fast Li⁺ ion diffusion.

20.
Nucleic Acids Res ; 45(D1): D296-D302, 2017 Jan 4.
Article in English | MEDLINE | ID: mdl-27899594

ABSTRACT

Evolutionary Classification Of protein Domains (ECOD) (http://prodata.swmed.edu/ecod) comprehensively classifies protein with known spatial structures maintained by the Protein Data Bank (PDB) into evolutionary groups of protein domains. ECOD relies on a combination of automatic and manual weekly updates to achieve its high accuracy and coverage with a short update cycle. ECOD classifies the approximately 120 000 depositions of the PDB into more than 500 000 domains in ∼3400 homologous groups. We show the performance of the weekly update pipeline since the release of ECOD, describe improvements to the ECOD website and available search options, and discuss novel structures and homologous groups that have been classified in the recent updates. Finally, we discuss the future directions of ECOD and further improvements planned for the hierarchy and update process.


Subjects
Databases, Protein , Evolution, Molecular , Models, Molecular , Protein Domains , Proteins , Computational Biology/methods , Protein Conformation , Proteins/chemistry , Proteins/classification , Proteins/genetics