1.
PeerJ ; 11: e16164, 2023.
Article in English | MEDLINE | ID: mdl-37818330

ABSTRACT

Background: Aberrant protein kinase regulation leading to abnormal substrate phosphorylation is associated with several human diseases. Despite the promise of therapies targeting kinases, many human kinases remain understudied. Most existing computational tools predicting phosphorylation cover fewer than 50% of known human kinases. They utilize local feature selection based on protein sequences, motifs, domains, structures, and/or functions, and do not consider the heterogeneous relationships of the proteins. In this work, we present KSFinder, a tool that predicts kinase-substrate links by capturing the inherent association of proteins in a network comprising 85% of the known human kinases. We also postulate the potential role of two understudied kinases based on their substrate predictions from KSFinder. Methods: KSFinder learns the semantic relationships in a phosphoproteome knowledge graph using a knowledge graph embedding algorithm and represents the nodes as low-dimensional vectors. A multilayer perceptron (MLP) classifier is trained to discern kinase-substrate links using the embedded vectors. KSFinder uses a strategic negative generation approach that eliminates biases in entity representation and combines data from experimentally validated non-interacting protein pairs, proteins from different subcellular locations, and random sampling. We assess KSFinder's generalization capability on four different datasets and compare its performance with other state-of-the-art prediction models. We employ KSFinder to predict substrates of 68 "dark" kinases considered understudied by the Illuminating the Druggable Genome program and use our text-mining tool, RLIMS-P, along with manual curation, to search for literature evidence for the predictions. In a case study, we performed functional enrichment analysis for two dark kinases, HIPK3 and CAMKK1, using their predicted substrates.
Results: KSFinder shows improved performance over other kinase-substrate prediction models and generalized prediction ability on different datasets. We identified literature evidence for 17 novel predictions involving an understudied kinase. All of these 17 predictions had a probability score ≥0.7 (nine at >0.9, six at 0.8-0.9, and two at 0.7-0.8). The evaluation of 93,593 negative predictions (probability ≤0.3) identified four false negatives. The top enriched biological processes of HIPK3 substrates relate to the regulation of extracellular matrix and epigenetic gene expression, while CAMKK1 substrates include lipid storage regulation and glucose homeostasis. Conclusions: KSFinder outperforms the current kinase-substrate prediction tools with higher kinase coverage. The strategically developed negatives provide a superior generalization ability for KSFinder. We predicted substrates of 432 kinases, 68 of which are understudied, and hypothesized the potential functions of two dark kinases using their predicted substrates.
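The classification step described in the Methods can be pictured with a minimal sketch. It is not KSFinder's implementation. It assumes that a kinase-substrate pair is scored by concatenating the two embedding vectors and passing them through a single hidden layer. The function names, the toy 4-dimensional embeddings, and the random untrained weights are all hypothetical. The 0.7 and 0.3 cutoffs are the score bands quoted in the abstract; treating them as reporting bands in code is an assumption.

```python
import math
import random

def mlp_link_probability(kinase_vec, substrate_vec, w_hidden, b_hidden, w_out, b_out):
    """Score one kinase-substrate pair with a one-hidden-layer perceptron."""
    # Assumption: the pair is represented by concatenating both embeddings.
    x = list(kinase_vec) + list(substrate_vec)
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)  # ReLU unit
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid maps the logit to (0, 1)

def score_band(prob, high_cutoff=0.7, low_cutoff=0.3):
    """Bucket a score using the probability bands quoted in the abstract."""
    if prob >= high_cutoff:
        return "high score"
    if prob <= low_cutoff:
        return "predicted negative"
    return "intermediate"

# Hypothetical toy data: 4-dimensional embeddings and random, untrained weights.
rng = random.Random(0)
dim, n_hidden = 4, 3
kinase = [rng.uniform(-1, 1) for _ in range(dim)]
substrate = [rng.uniform(-1, 1) for _ in range(dim)]
w_hidden = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

prob = mlp_link_probability(kinase, substrate, w_hidden, b_hidden, w_out, b_out)
print(round(prob, 3), score_band(prob))
```

In a real pipeline the weights would be learned from positive pairs and the generated negatives, so the printed score here carries no biological meaning.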


Subjects
Pattern Recognition, Automated; Protein Kinases; Humans; Protein Kinases/genetics; Phosphorylation; Algorithms; Proteome/chemistry
2.
PLoS One ; 18(4): e0274042, 2023.
Article in English | MEDLINE | ID: mdl-37022994

ABSTRACT

Chinese hamster ovary (CHO) cells are widely used for mass production of therapeutic proteins in the pharmaceutical industry. With the growing need to optimize the performance of producer CHO cell lines, research on CHO cell line development and bioprocess has continued to increase in recent decades. Bibliographic mapping and classification of relevant research studies will be essential for identifying research gaps and trends in the literature. To qualitatively and quantitatively understand the CHO literature, we have conducted topic modeling using a CHO bioprocess bibliome manually compiled in 2016, and compared the topics uncovered by the Latent Dirichlet Allocation (LDA) models with the human labels of the CHO bibliome. The results show a significant overlap between the manually selected categories and computationally generated topics, and reveal the machine-generated topic-specific characteristics. To identify relevant CHO bioprocessing papers from new scientific literature, we have developed supervised models using Logistic Regression to identify specific article topics and evaluated the results using three CHO bibliome datasets: the Bioprocessing set, the Glycosylation set, and the Phenotype set. The use of top terms as features supports the explainability of document classification results to yield insights on new CHO bioprocessing papers.


Subjects
Data Mining; Cricetinae; Animals; Humans; CHO Cells; Cricetulus; Phenotype; Glycosylation
3.
JAMA Netw Open ; 6(3): e233012, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36920393

ABSTRACT

Importance: The association between degree of neighborhood deprivation and primary hypertension diagnosis in youth remains understudied. Objective: To assess the association between neighborhood measures of deprivation and primary hypertension diagnosis in youth. Design, Setting, and Participants: This cross-sectional study included 65 452 Delaware Medicaid-insured youths aged 8 to 18 years between January 1, 2014, and December 31, 2019. Residence was geocoded by national area deprivation index (ADI). Exposures: Higher area deprivation. Main Outcomes and Measures: The main outcome was primary hypertension diagnosis based on International Classification of Diseases, Ninth Revision and Tenth Revision codes. Data were analyzed between September 1, 2021, and December 31, 2022. Results: A total of 65 452 youths were included in the analysis, including 64 307 (98.3%) without a hypertension diagnosis (30 491 [47%] female and 33 813 [53%] male; mean [SD] age, 12.5 [3.1] years; 12 500 [19%] Hispanic, 25 473 [40%] non-Hispanic Black, 24 565 [38%] non-Hispanic White, and 1769 [3%] other race or ethnicity; 13 029 [20%] with obesity; and 31 548 [49%] with an ADI ≥50) and 1145 (1.7%) with a diagnosis of primary hypertension (mean [SD] age, 13.3 [2.8] years; 464 [41%] female and 681 [59%] male; 271 [24%] Hispanic, 460 [40%] non-Hispanic Black, 396 [35%] non-Hispanic White, and 18 [2%] of other race or ethnicity; 705 [62%] with obesity; and 614 [54%] with an ADI ≥50). The mean (SD) duration of full Medicaid benefit coverage was 61 (16) months for those with a diagnosis of primary hypertension and 46.0 (24.3) months for those without. By multivariable logistic regression, residence within communities with ADI greater than or equal to 50 was associated with 60% greater odds of a hypertension diagnosis (odds ratio [OR], 1.61; 95% CI, 1.04-2.51).
Older age (OR per year, 1.16; 95% CI, 1.14-1.18), an obesity diagnosis (OR, 5.16; 95% CI, 4.54-5.85), and longer duration of full Medicaid benefit coverage (OR, 1.03; 95% CI, 1.03-1.04) were associated with greater odds of primary hypertension diagnosis, whereas female sex was associated with lower odds (OR, 0.68; 95% CI, 0.61-0.77). Model fit including a Medicaid-by-ADI interaction term was significant for the interaction and revealed slightly greater odds of hypertension diagnosis for youths with ADI less than 50 (OR, 1.03; 95% CI, 1.03-1.04) vs ADI ≥50 (OR, 1.02; 95% CI, 1.02-1.03). Race and ethnicity were not associated with primary hypertension diagnosis. Conclusions and Relevance: In this cross-sectional study, higher childhood neighborhood ADI, obesity, age, sex, and duration of Medicaid benefit coverage were associated with a primary hypertension diagnosis in youth. Screening algorithms and national guidelines may consider the importance of ADI when assessing for the presence and prevalence of primary hypertension in youth.
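As a reading aid for the odds ratios above, the following sketch converts an odds ratio into a percent change in odds and compounds a per-year odds ratio over several years. The odds ratios are those reported in the abstract. The helper names and the 5-year age gap are hypothetical, and compounding assumes the usual log-linear form of a logistic regression term.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a percent change in odds."""
    return (odds_ratio - 1.0) * 100.0

def compounded_odds_ratio(per_unit_or, units):
    """Odds ratio implied for a difference of `units`, assuming log-linearity."""
    return per_unit_or ** units

# Odds ratios as reported in the abstract.
reported = {
    "ADI >= 50": 1.61,
    "obesity diagnosis": 5.16,
    "female sex": 0.68,
}
for name, value in reported.items():
    print(f"{name}: {percent_change_in_odds(value):+.0f}% odds")

# Age is reported per year (OR 1.16); the 5-year gap is a hypothetical example.
print(f"5 years older: OR {compounded_odds_ratio(1.16, 5):.2f}")
```

An odds ratio of 1.61 corresponds to about 61% greater odds, which the abstract reports in rounded form as 60%.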


Subjects
Hypertension; Medicaid; United States/epidemiology; Humans; Male; Adolescent; Female; Child; Cross-Sectional Studies; Delaware/epidemiology; Obesity; Hypertension/diagnosis; Hypertension/epidemiology; Essential Hypertension
4.
Sci Rep ; 13(1): 1200, 2023 01 21.
Article in English | MEDLINE | ID: mdl-36681715

ABSTRACT

Chinese hamster ovary (CHO) cell lines are widely used to manufacture biopharmaceuticals. However, CHO cells are not an optimal expression host due to the intrinsic plasticity of the CHO genome. Genome plasticity can lead to chromosomal rearrangements, transgene exclusion, and phenotypic drift. A poorly understood genomic element of CHO cell line instability is the role of extrachromosomal circular DNA (eccDNA) in gene expression and regulation. EccDNA can facilitate ultra-high gene expression and is found within many eukaryotes including humans, yeast, and plants. EccDNA confers genetic heterogeneity, providing selective advantages to individual cells in response to dynamic environments. In CHO cell cultures, maintaining genetic homogeneity is critical to ensuring consistent productivity and product quality. Understanding eccDNA structure, function, and microevolutionary dynamics under various culture conditions could reveal potential engineering targets for cell line optimization. In this study, eccDNA sequences were investigated at the beginning and end of two-week fed-batch cultures in an ambr®250 bioreactor under control and lactate-stressed conditions. This work characterized the structure and function of eccDNA in a CHO-K1 clone. Gene annotation identified 1551 unique eccDNA genes including cancer driver genes and genes involved in protein production. Furthermore, RNA-seq data were integrated to identify transcriptionally active eccDNA genes.


Subjects
Batch Cell Culture Techniques; Lactic Acid; Cricetinae; Animals; Humans; Cricetulus; CHO Cells; Genome; DNA
5.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subjects
Databases, Protein; Humans; Amino Acid Sequence; Artificial Intelligence; Internet; Proteins/chemistry; Software
6.
Mol Omics ; 18(9): 853-864, 2022 10 31.
Article in English | MEDLINE | ID: mdl-35975455

ABSTRACT

The human proteome contains a vast network of interacting kinases and substrates. Even though some kinases have proven to be immensely useful as therapeutic targets, a majority are still understudied. In this work, we present a novel knowledge graph representation learning approach to predict novel interaction partners for understudied kinases. Our approach uses a phosphoproteomic knowledge graph constructed by integrating data from iPTMnet, protein ontology, gene ontology and BioKG. The representations of kinases and substrates in this knowledge graph are learned by performing directed random walks on triples coupled with a modified SkipGram or CBOW model. These representations are then used as an input to a supervised classification model to predict novel interactions for understudied kinases. We also present a post-predictive analysis of the predicted interactions and an ablation study of the phosphoproteomic knowledge graph to gain an insight into the biology of the understudied kinases.
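The walk-generation step described above can be sketched as follows. This is a generic illustration of directed random walks over (head, relation, tail) triples, not the authors' code. The entity and relation names in the toy triples, the hop limit, and the random seed are hypothetical. Each walk is an alternating sequence of entities and relations that could be passed as a "sentence" to a SkipGram or CBOW model; that training step is not shown.

```python
import random
from collections import defaultdict

def build_adjacency(triples):
    """Index (head, relation, tail) triples by head for directed traversal."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return adjacency

def random_walk(adjacency, start, max_hops, rng):
    """Follow outgoing edges from `start`, recording entities and relations."""
    walk = [start]
    node = start
    for _ in range(max_hops):
        edges = adjacency.get(node)
        if not edges:  # dead end: the node has no outgoing edge
            break
        relation, node = rng.choice(edges)
        walk.extend([relation, node])
    return walk

# Hypothetical toy triples; entity and relation names are illustrative only.
triples = [
    ("kinaseA", "phosphorylates", "substrateB"),
    ("substrateB", "annotated_with", "GO_term_1"),
    ("kinaseA", "member_of", "family_X"),
]
adjacency = build_adjacency(triples)
rng = random.Random(42)
walks = [random_walk(adjacency, "kinaseA", max_hops=2, rng=rng) for _ in range(5)]
for walk in walks:
    print(" ".join(walk))
```

Keeping relations inside the walk lets the embedding model see edge types as context tokens, which is one common way to preserve the semantics of the triples.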


Subjects
Pattern Recognition, Automated; Proteome; Humans; Gene Ontology; Substrate Specificity
7.
Methods Mol Biol ; 2499: 187-204, 2022.
Article in English | MEDLINE | ID: mdl-35696082

ABSTRACT

iPTMnet is a resource that combines rich information about protein post-translational modifications (PTM) from curated databases as well as text mining tools. Researchers can use the iPTMnet website to query, analyze and download the PTM data. In this chapter we describe the iPTMnet RESTful API, which provides a way to streamline the integration of iPTMnet data into an automated data analysis workflow. In the first section, we give an overview of the architecture of the API. In the second section, we describe the various functions defined by the API and provide detailed examples of using these functions.


Subjects
Data Mining; Protein Processing, Post-Translational; Databases, Protein; Proteins/metabolism; Workflow
8.
BMC Plant Biol ; 22(1): 107, 2022 Mar 08.
Article in English | MEDLINE | ID: mdl-35260072

ABSTRACT

BACKGROUND: Sustainable production of high-quality feedstock has been of great interest in bioenergy research. Despite the economic importance of switchgrass, high temperatures and water deficit are limiting factors for its successful cultivation in semi-arid areas. There are limited reports on the molecular basis of combined abiotic stress tolerance in switchgrass, particularly the combination of drought and heat stress. We used transcriptomic approaches to elucidate the changes in the response of switchgrass to drought and high temperature simultaneously. RESULTS: We applied drought treatment alone to switchgrass plant Alamo AP13 by withholding water after 45 days of growth. For the combination of drought and heat effect, heat treatment (35 °C/25 °C day/night) was imposed 72 h after the initiation of drought. Samples were collected at 0 h, 72 h, 96 h, 120 h, 144 h, and 168 h after treatment imposition; total RNA was extracted, and RNA-Seq was conducted. Out of a total of 32,190 genes, we identified 3912 as drought (DT) responsive genes, and 2339 and 4635 as heat (HT) and drought and heat (DTHT) responsive genes, respectively. There were 209, 106, and 220 transcription factors (TFs) differentially expressed under DT, HT, and DTHT, respectively. Gene ontology annotation identified the metabolic process as the significant term enriched in DTHT genes. Other biological processes identified in DTHT responsive genes included response to water, photosynthesis, oxidation-reduction processes, and response to stress. KEGG pathway enrichment analysis on DT and DTHT responsive genes revealed that TFs and genes controlling phenylpropanoid pathways were important for individual as well as combined stress response. For example, hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase (HCT) from the phenylpropanoid pathway was induced by single DT and combinations of DTHT stress.
CONCLUSION: Through RNA-Seq analysis, we have identified unique and overlapping genes in response to DT and combined DTHT stress in switchgrass. The combination of DT and HT stress may affect the photosynthetic machinery and phenylpropanoid pathway of switchgrass, which negatively impacts lignin synthesis and biomass production. The biological function of genes identified particularly in response to DTHT stress could further be confirmed by techniques such as single point mutation or RNAi.


Subjects
Adaptation, Physiological/genetics; Dehydration/genetics; Heat-Shock Response/genetics; Panicum/genetics; Transcriptome; Gene Expression Profiling; Gene Expression Regulation, Plant; Genes, Plant
9.
PLoS Biol ; 19(12): e3001464, 2021 12.
Article in English | MEDLINE | ID: mdl-34871295

ABSTRACT

The UniProt knowledgebase is a public database for protein sequence and function, covering the tree of life and over 220 million protein entries. Now, the whole community can use a new crowdsourcing annotation system to help scale up UniProt curation and receive proper attribution for their biocuration work.


Subjects
Crowdsourcing/methods; Data Curation/methods; Molecular Sequence Annotation/methods; Amino Acid Sequence/genetics; Computational Biology/methods; Databases, Protein/trends; Humans; Literature; Proteins/metabolism; Stakeholder Participation
10.
Bioinformatics ; 37(23): 4597-4598, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34613368

ABSTRACT

SUMMARY: The global response to the COVID-19 pandemic has led to a rapid increase of scientific literature on this deadly disease. Extracting knowledge from biomedical literature and integrating it with relevant information from curated biological databases is essential to gain insight into COVID-19 etiology, diagnosis and treatment. We used Semantic Web technology RDF to integrate COVID-19 knowledge mined from literature by iTextMine, PubTator and SemRep with relevant biological databases and formalized the knowledge in a standardized and computable COVID-19 Knowledge Graph (KG). We published the COVID-19 KG via a SPARQL endpoint to support federated queries on the Semantic Web and developed a knowledge portal with browsing and searching interfaces. We also developed a RESTful API to support programmatic access and provided RDF dumps for download. AVAILABILITY AND IMPLEMENTATION: The COVID-19 Knowledge Graph is publicly available under CC-BY 4.0 license at https://research.bioinformatics.udel.edu/covid19kg/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19; Semantics; Humans; Pandemics; Pattern Recognition, Automated; Databases, Factual
11.
mBio ; 12(5): e0206021, 2021 10 26.
Article in English | MEDLINE | ID: mdl-34517763

ABSTRACT

We describe here the structure and organization of TnCentral (https://tncentral.proteininformationresource.org/ [or the mirror link at https://tncentral.ncc.unesp.br/]), a web resource for prokaryotic transposable elements (TE). TnCentral currently contains ∼400 carefully annotated TE, including transposons from the Tn3, Tn7, Tn402, and Tn554 families; compound transposons; integrons; and associated insertion sequences (IS). These TE carry passenger genes, including genes conferring resistance to over 25 classes of antibiotics and nine types of heavy metal, as well as genes responsible for pathogenesis in plants, toxin/antitoxin gene pairs, transcription factors, and genes involved in metabolism. Each TE has its own entry page, providing details about its transposition genes, passenger genes, and other sequence features required for transposition, as well as a graphical map of all features. TnCentral content can be browsed and queried through text- and sequence-based searches with a graphic output. We describe three use cases, which illustrate how the search interface, results tables, and entry pages can be used to explore and compare TE. TnCentral also includes downloadable software to facilitate user-driven identification, with manual annotation, of certain types of TE in genomic sequences. Through the TnCentral homepage, users can also access TnPedia, which provides comprehensive reviews of the major TE families, including an extensive general section and specialized sections with descriptions of insertion sequence and transposon families. TnCentral and TnPedia are intuitive resources that can be used by clinicians and scientists to assess TE diversity in clinical, veterinary, and environmental samples. 
IMPORTANCE The ability of bacteria to undergo rapid evolution and adapt to changing environmental circumstances drives the public health crisis of multiple antibiotic resistance, as well as outbreaks of disease in economically important agricultural crops and animal husbandry. Prokaryotic transposable elements (TE) play a critical role in this. Many carry "passenger genes" (not required for the transposition process) conferring resistance to antibiotics or heavy metals or causing disease in plants and animals. Passenger genes are spread by normal TE transposition activities and by insertion into plasmids, which then spread via conjugation within and across bacterial populations. Thus, an understanding of TE composition and transposition mechanisms is key to developing strategies to combat bacterial pathogenesis. Toward this end, we have developed TnCentral, a bioinformatics resource dedicated to describing and exploring the structural and functional features of prokaryotic TE whose use is intuitive and accessible to users with or without bioinformatics expertise.


Subjects
Bacteria/genetics; Computational Biology/methods; DNA Transposable Elements; Databases, Genetic; Computational Biology/instrumentation; Internet; Software; Web Browser
12.
Cancer Res ; 81(11): 3051-3066, 2021 06 01.
Article in English | MEDLINE | ID: mdl-33727228

ABSTRACT

Lung cancer is the leading cause of cancer mortality worldwide. The treatment of patients with lung cancer harboring mutant EGFR with orally administered EGFR tyrosine kinase inhibitors (TKI) has been a paradigm shift. Osimertinib and rociletinib are third-generation irreversible EGFR TKIs targeting the EGFR T790M mutation. Osimertinib is the current standard of care for patients with EGFR mutations due to increased efficacy, fewer side effects, and enhanced brain penetrance. Unfortunately, all patients develop resistance. Genomic approaches have primarily been used to interrogate resistance mechanisms. Here we characterized the proteome and phosphoproteome of a series of isogenic EGFR-mutant lung adenocarcinoma cell lines that are either sensitive or resistant to these drugs, comprising the most comprehensive proteomic dataset resource to date to investigate third-generation EGFR TKI resistance in lung adenocarcinoma. Unbiased global quantitative mass spectrometry uncovered alterations in signaling pathways, revealed a proteomic signature of epithelial-mesenchymal transition, and identified kinases and phosphatases with altered expression and phosphorylation in TKI-resistant cells. Decreased tyrosine phosphorylation of key sites in the phosphatase SHP2 suggests its inhibition, resulting in subsequent inhibition of RAS/MAPK and activation of PI3K/AKT pathways. Anticorrelation analyses of this phosphoproteomic dataset with published drug-induced P100 phosphoproteomic datasets from the Library of Integrated Network-Based Cellular Signatures program predicted drugs with the potential to overcome EGFR TKI resistance. The PI3K/mTOR inhibitor dactolisib in combination with osimertinib overcame resistance both in vitro and in vivo. Taken together, this study reveals global proteomic alterations upon third-generation EGFR TKI resistance and highlights potential novel approaches to overcome resistance.
SIGNIFICANCE: Global quantitative proteomics reveals changes in the proteome and phosphoproteome in lung cancer cells resistant to third-generation EGFR TKIs, identifying the PI3K/mTOR inhibitor dactolisib as a potential approach to overcome resistance.
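The idea behind the anticorrelation analysis mentioned in the abstract can be shown with a toy sketch: a drug whose phosphoproteomic signature is strongly negatively correlated with the resistance signature is a candidate for reversing it. The abstract does not state which correlation measure was used, so Pearson correlation is an assumption here, and the signature values and drug names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rank_by_anticorrelation(resistance_signature, drug_signatures):
    """Sort drugs from most negatively to most positively correlated."""
    scores = {
        drug: pearson(resistance_signature, signature)
        for drug, signature in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1])

# Hypothetical log fold changes for four phosphosites (same order throughout).
resistance = [1.2, -0.8, 0.5, 2.0]
drugs = {
    "drug_A": [-1.0, 0.9, -0.4, -1.8],
    "drug_B": [1.1, -0.7, 0.6, 1.9],
    "drug_C": [0.1, 0.2, -0.1, 0.0],
}
ranking = rank_by_anticorrelation(resistance, drugs)
for drug, score in ranking:
    print(drug, round(score, 3))
```

Candidates at the top of such a ranking would still need experimental validation, as was done for dactolisib in the study.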


Subjects
Adenocarcinoma of Lung/drug therapy; Drug Resistance, Neoplasm; Imidazoles/pharmacology; Phosphoproteins/metabolism; Protein Kinase Inhibitors/pharmacology; Proteome/metabolism; Quinolines/pharmacology; Adenocarcinoma of Lung/metabolism; Adenocarcinoma of Lung/pathology; Antineoplastic Agents/pharmacology; Apoptosis; Cell Proliferation; ErbB Receptors/antagonists & inhibitors; Humans; Lung Neoplasms/drug therapy; Lung Neoplasms/metabolism; Lung Neoplasms/pathology; Phosphatidylinositol 3-Kinases/chemistry; Phosphoproteins/analysis; Proteome/analysis; TOR Serine-Threonine Kinases/antagonists & inhibitors; Tumor Cells, Cultured
13.
BMC Biotechnol ; 21(1): 4, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33419422

ABSTRACT

BACKGROUND: As bioprocess intensification has increased over the last 30 years, yields from mammalian cell processes have increased from tens of milligrams to over tens of grams per liter. Most of these gains in productivity can be attributed to increasing cell densities within bioreactors. As such, strategies have been developed to minimize accumulation of metabolic wastes, such as lactate and ammonia. Unfortunately, neither cell growth nor biopharmaceutical production can occur without some waste metabolite accumulation. Inevitably, metabolic waste accumulation leads to decline and termination of the culture. While it is understood that the accumulation of these unwanted compounds imparts a suboptimal culture environment, little is known about the genotoxic properties of these compounds that may lead to global genome instability. In this study, we examined the effects of high and moderate extracellular ammonia on the physiology and genomic integrity of Chinese hamster ovary (CHO) cells. RESULTS: Through whole genome sequencing, we discovered 2394 variant sites within functional genes, comprising both single nucleotide polymorphisms and insertion/deletion mutations, as a result of ammonia stress with high or moderate impact on functional genes. Furthermore, several of these de novo mutations were found in genes whose functions are to maintain genome stability, such as Tp53, Tnfsf11, Brca1, as well as Nfkb1. In addition, we characterized microsatellite content of the cultures using the CriGri-PICR Chinese hamster genome assembly and discovered an abundance of microsatellite loci that are not replicated faithfully in the ammonia-stressed cultures. Unfaithful replication of these loci is a signature of microsatellite instability. With rigorous filtering, we found 124 candidate microsatellite loci that may be suitable for further investigation to determine whether these loci may be reliable biomarkers to predict genome instability in CHO cultures.
CONCLUSION: This study advances our knowledge with regard to the effects of ammonia accumulation on CHO cell culture performance by identifying ammonia-sensitive genes linked to genome stability and lays the foundation for the development of a new diagnostic tool for assessing genome stability.
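To make the notion of microsatellite loci and their unfaithful replication concrete, here is a minimal sketch that finds perfect short tandem repeats with a regular expression and compares the repeat count at one locus between two sequences. It is not the pipeline used in the study. The unit-length and repeat-count thresholds, the function name, and the two toy sequences are hypothetical, and real callers also handle imperfect repeats and sequencing error.

```python
import re

def find_microsatellites(sequence, min_unit=1, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats."""
    # Lazy unit match so the shortest repeating unit is reported
    # (for example "A" rather than "AA" in a poly-A run).
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Hypothetical sequences for the same locus in control and stressed cultures.
control = "GGCACACACATT"
stressed = "GGCACACACACATT"

control_hits = find_microsatellites(control)
stressed_hits = find_microsatellites(stressed)
print(control_hits, stressed_hits)

# A change in repeat count at the same locus is the kind of signal that
# would flag the locus as a candidate for microsatellite instability.
unstable = control_hits[0][2] != stressed_hits[0][2]
print("repeat count changed:", unstable)
```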


Subjects
Ammonia/metabolism; Batch Cell Culture Techniques/methods; Genetic Variation; Microsatellite Instability; Animals; BRCA1 Protein/metabolism; Biomarkers; Bioreactors; CHO Cells; Cell Count; Cricetulus; Culture Media; Female; Genes, p53; Genetic Variation/genetics; Lactic Acid/metabolism; Mutation; NF-kappa B p50 Subunit/metabolism; Ovary/metabolism; RANK Ligand/metabolism
14.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subjects
Databases, Protein; Proteins/chemistry; Amino Acid Sequence; COVID-19/metabolism; Internet; Molecular Sequence Annotation; Protein Domains; Protein Interaction Maps; SARS-CoV-2/metabolism; Sequence Alignment
15.
Sci Data ; 7(1): 337, 2020 10 12.
Article in English | MEDLINE | ID: mdl-33046717

ABSTRACT

The Protein Ontology (PRO) provides an ontological representation of protein-related entities, ranging from protein families to proteoforms to complexes. Protein Ontology Linked Open Data (LOD) exposes, shares, and connects knowledge about protein-related entities on the Semantic Web using Resource Description Framework (RDF), thus enabling integration with other Linked Open Data for biological knowledge discovery. For example, proteins (or variants thereof) can be retrieved on the basis of specific disease associations. As a community resource, we strive to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles, disseminate regular updates of our data, support multiple methods for accessing, querying and downloading data in various formats, and provide documentation both for scientists and programmers. PRO Linked Open Data can be browsed via a faceted browser interface and queried using SPARQL via YASGUI. RDF data dumps are also available for download. Additionally, we developed RESTful APIs to support programmatic data access. We also provide a W3C HCLS specification-compliant metadata description for our data. The PRO Linked Open Data is available at https://lod.proconsortium.org/ .


Subjects
Knowledge Discovery; Proteins/chemistry; Semantic Web; Datasets as Topic; Software
16.
Adv Biosyst ; 4(9): e2000119, 2020 09.
Article in English | MEDLINE | ID: mdl-32603024

ABSTRACT

Late recurrences of breast cancer are hypothesized to originate from disseminated tumor cells that re-activate after a long period of dormancy, ≥5 years for estrogen-receptor positive (ER+) tumors. An outstanding question remains as to what the key microenvironment interactions are that regulate this complex process, and well-defined human model systems are needed for probing this. Here, a robust, bioinspired 3D ER+ dormancy culture model is established and utilized to probe the effects of matrix properties for common sites of late recurrence on breast cancer cell dormancy. Formation of dormant micrometastases over several weeks is examined for ER+ cells (T47D, BT474), where the timing of entry into dormancy versus persistent growth depends on matrix composition and cell type. In contrast, triple negative cells (MDA-MB-231), associated with early recurrence, are not observed to undergo long-term dormancy. Bioinformatic analyses quantitatively support an increased "dormancy score" gene signature for ER+ cells (T47D) and reveal differential expression of genes associated with different biological processes based on matrix composition. Further, these analyses support a link between dormancy and autophagy, a potential survival mechanism. This robust model system will allow systematic investigations of other cell-microenvironment interactions in dormancy and evaluation of therapeutics for preventing late recurrence.


Subjects
Breast Neoplasms; Cell Culture Techniques/methods; Models, Biological; Receptors, Estrogen/metabolism; Tumor Microenvironment/physiology; Autophagy; Breast Neoplasms/chemistry; Breast Neoplasms/metabolism; Breast Neoplasms/physiopathology; Cell Line, Tumor; Extracellular Matrix/metabolism; Female; Humans; Synthetic Biology
17.
Nucleic Acids Res ; 48(W1): W85-W93, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32469073

ABSTRACT

Rapid progress in proteomics and large-scale profiling of biological systems at the protein level necessitates the continued development of efficient computational tools for the analysis and interpretation of proteomics data. Here, we present the piNET server that facilitates integrated annotation, analysis and visualization of quantitative proteomics data, with emphasis on PTM networks and integration with the LINCS library of chemical and genetic perturbation signatures in order to provide further mechanistic and functional insights. The primary input for the server consists of a set of peptides or proteins, optionally with PTM sites, and their corresponding abundance values. Several interconnected workflows can be used to generate: (i) interactive graphs and tables providing comprehensive annotation and mapping between peptides and proteins with PTM sites; (ii) high-resolution and interactive visualization for enzyme-substrate networks, including kinases and their phospho-peptide targets; (iii) mapping and visualization of LINCS signature connectivity for chemical inhibitors or genetic knockdown of enzymes upstream of their target PTM sites. piNET has been built using a modular Spring Boot Java platform as a fast, versatile and easy-to-use tool. Apache Lucene indexing is used for fast mapping of peptides into UniProt entries for the human, mouse and other commonly used model organism proteomes. PTM-centric network analyses combine PhosphoSitePlus, iPTMnet and SIGNOR databases of validated enzyme-substrate relationships, for kinase networks augmented by DeepPhos predictions and sequence-based mapping of PhosphoSitePlus consensus motifs. Concordant LINCS signatures are mapped using iLINCS. For each workflow, a RESTful API counterpart can be used to generate the results programmatically in JSON format. The server is available at http://pinet-server.org, and it is free and open to all users without login requirement.


Subjects
Protein Processing, Post-Translational , Proteomics/methods , Software , Animals , Computer Graphics , Enzymes/metabolism , Humans , Internet , Mice , Peptides/chemistry , Peptides/metabolism , Proteins/chemistry , Proteins/metabolism , Workflow
18.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32395768

ABSTRACT

iPTMnet is a bioinformatics resource that integrates protein post-translational modification (PTM) data from text mining and curated databases and ontologies to aid in knowledge discovery and scientific study. The current iPTMnet website can be used for querying and browsing rich PTM information but does not support automated iPTMnet data integration with other tools. Hence, we have developed a RESTful API utilizing the latest developments in cloud technologies to facilitate the integration of iPTMnet into existing tools and pipelines. We have packaged iPTMnet API software in Docker containers and published it on DockerHub for easy redistribution. We have also developed Python and R packages that allow users to integrate iPTMnet for scientific discovery, as demonstrated in a use case that connects PTM sites to kinase signaling pathways.


Subjects
Computational Biology , Software , Data Mining , Protein Processing, Post-Translational , Proteins/genetics
19.
Med Teach ; 42(2): 187-195, 2020 02.
Article in English | MEDLINE | ID: mdl-31608726

ABSTRACT

Purpose: Human capabilities in medicine, including communication skills, are increasingly important within the complex, challenging and dynamic landscape of healthcare. Supporting medical students to manage unavoidable role-related stressors adaptively may help mitigate the anguish that is too commonly reported among the profession. We developed a model, "MaRIS", underpinned by contemplative pedagogy, to support medical students to enhance their human capabilities, across all three domains of Bloom's taxonomy, and their personal resilience. It is the first to integrate Mindfulness, affective Reflection, Impactive experiences and a Supportive environment into medical curriculum design. Here, we describe the theoretical basis underpinning MaRIS and present a preliminary study to evaluate its impact on students' subjectively-rated capabilities. Materials and Methods: A questionnaire capturing self-ratings of competence, empathy and resilience, as well as impressions of their experiences, was administered to foundation year medical students before (T0), during (T1) and after delivery (T2). Results: Fifty-five students completed the survey at all time points. Mean scores for all domains increased significantly from T0 to T1 and from T0 to T2. Free-text comments suggest learning impact across the cognitive, psychomotor and affective domains. Conclusions: MaRIS appears to facilitate medical students' establishment of the foundations for building the human capabilities and personal resilience required for professional practice.


Subjects
Education, Medical, Undergraduate/methods , Interprofessional Relations , Physician-Patient Relations , Resilience, Psychological , Students, Medical/psychology , Adult , Clinical Competence , Communication , Curriculum , Empathy , Female , Humans , Male , Mindfulness , Surveys and Questionnaires , Young Adult
20.
PLoS One ; 14(7): e0216913, 2019.
Article in English | MEDLINE | ID: mdl-31361753

ABSTRACT

Significant progress has recently been made in applying deep learning to natural language processing tasks. However, deep learning models typically require a large amount of annotated training data, while only small labeled datasets are available for many natural language processing tasks in biomedical literature. Building large datasets for deep learning is expensive, since it involves considerable human effort and usually requires domain expertise in specialized fields. In this work, we consider augmenting manually annotated data with large amounts of data obtained using distant supervision. Because data obtained by distant supervision is often noisy, we first apply heuristics to remove some of the incorrect annotations. Then, using methods inspired by transfer learning, we show that the resulting models outperform models trained on the original manually annotated sets.


Subjects
Data Curation , Data Mining , Deep Learning , Models, Theoretical , Natural Language Processing , Humans
...