1.
Patterns (N Y) ; 5(5): 100955, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38800367

ABSTRACT

Materials scientists usually collect experimental data to summarize experiences and predict improved materials. However, a crucial issue is how to proficiently utilize unstructured data to update existing structured data, particularly in applied disciplines. This study introduces a new natural language processing (NLP) task called structured information inference (SII) to address this problem. We propose an end-to-end approach to summarize and organize the multi-layered device-level information from the literature into structured data. After comparing different methods, we fine-tuned LLaMA with an F1 score of 87.14% to update an existing perovskite solar cell dataset with articles published since its release, allowing its direct use in subsequent data analysis. Using structured information, we developed regression tasks to predict the electrical performance of solar cells. Our results demonstrate comparable performance to traditional machine-learning methods without feature selection and highlight the potential of large language models for scientific knowledge acquisition and material development.

2.
Sci Data ; 11(1): 146, 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38296978

ABSTRACT

The rise of urbanization coupled with pollution has highlighted the importance of outdoor self-cleaning coatings. These revolutionary coatings contribute to the longevity of various surfaces and reduce maintenance costs for a wide range of applications. Despite ongoing research to develop efficient and durable self-cleaning coatings, adopting systematic research methodologies could accelerate these advancements. In this work, we use Natural Language Processing (NLP) strategies to generate open- and traceable-sourced datasets about self-cleaning coating materials from 39,011 multi-disciplinary papers. The data are from function-based and property-based corpora for self-cleaning purposes. These datasets are presented in four different formats for diverse uses or combined uses: material frequency statistics, material dictionary, measurement value datasets for self-cleaning-related properties and optical properties, and sentiment statistics of material stability and durability. This provides a literature-based data resource for the development of self-cleaning coatings and also offers potential pathways for material discovery and prediction by machine learning.

3.
J Chem Inf Model ; 64(7): 2746-2759, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37982753

ABSTRACT

The scientific literature contains valuable information that can be used for future applications, but manual analysis presents challenges due to its size and disciplinary boundaries. The prevailing solution involves natural language processing (NLP) techniques such as information retrieval. Nonetheless, existing automated systems primarily provide either statistically based shallow information or deep information without traceability, thereby falling short of delivering high-quality and reliable insights. To address this, we propose an innovative approach of leveraging sentiment information embedded within the literature to track the opinions toward materials. In this study, we integrated material knowledge into text representation and constructed opinion data sets to hierarchically train deep learning models, named as Scientific Sentiment Network (SSNet). SSNet can effectively extract knowledge from the energy material literature and accurately categorize expert opinions into challenges and opportunities (94% and 92% accuracy, respectively). By incorporating sentiment features determined by SSNet, we can predict the ranking of emerging thermoelectric materials with a 70% correlation to experimental outcomes. Furthermore, our model achieves a commendable 68% accuracy in predicting suitable nanomaterials for atomic layer deposition (ALD) over time. These promising results offer a practical framework to extract and synthesize knowledge from the scientific literature, thereby accelerating research in the field of nanomaterials.


Subjects
Neural Networks, Computer; Sentiment Analysis; Information Storage and Retrieval
4.
Curr Med Chem ; 2023 Sep 08.
Article in English | MEDLINE | ID: mdl-37881092

ABSTRACT

BACKGROUND: Peptides play crucial roles in diverse cellular functions and participate in many biological processes by interacting with a variety of proteins; over the past decades they have also been exploited as a promising class of therapeutic agents to target druggable proteins. Understanding the intrinsic association between the structure and affinity of protein-peptide interactions (PpIs) would be of considerable value to computational peptidology, for example in guiding protein-peptide docking calculations, developing protein-peptide affinity scoring functions, and designing peptide ligands for specific protein receptors. OBJECTIVE: We attempted to create a data source for relating PpI structure to affinity. METHODS: By exhaustively surveying the whole protein data bank (PDB) database as well as the ontologically enriched literature information, we manually curated a structure-based data set of protein-peptide affinities, PpI[S/A]DS, which assembled over 350 PpI complex samples with both experimentally measured structure and affinity data. The data set was further reduced to a nonredundant benchmark of 102 culled samples, PpI[S/A]BM, which retained only those samples that are structurally reliable, functionally diverse, and evolutionarily nonhomologous. RESULTS: The collected structures were resolved at high resolution by either X-ray crystallography or solution NMR, while the deposited affinities were characterized by the dissociation constant (Kd), a direct biophysical measure of the intermolecular interaction strength between protein and peptide, ranging from subnanomolar to millimolar levels. The PpI samples in the set/benchmark were arbitrarily classified into six classes according to the secondary structure of their peptide ligands: α-helix, partial α-helix, β-sheet formed through binding, β-strand formed through self-folding, mixed, and other irregular structures. In addition, we categorized these PpIs in terms of their biological function and binding behavior. CONCLUSION: The PpI[S/A]DS set and PpI[S/A]BM benchmark can be considered a valuable data source for the computational peptidology community, aiming to relate affinity to structure for PpIs.

5.
J Mol Recognit ; 36(6): e3014, 2023 06.
Article in English | MEDLINE | ID: mdl-37014036

ABSTRACT

Human angiotensin-converting enzyme (ACE), which contains two structurally homologous but functionally distinct N- and C-domains, is a well-established druggable target for the treatment of hypertension (HTN). Selective inhibition of the C-domain primarily contributes to the antihypertensive efficiency and can be exploited in medicinal agents and functional additives for regulating blood pressure with high safety. In this study, we used a machine annealing (MA) strategy to guide the navigation of antihypertensive peptides (AHPs) in structurally interacting diversity space with the two ACE domains based on their crystal/modeled complex structures and an in-house protein-peptide affinity scoring function, aiming to optimize the peptide selectivity for C-domain over N-domain. The strategy generated a panel of theoretically designed AHP hits with a satisfactory C-over-N (C > N) selectivity profile, several of which showed a C > N selectivity roughly comparable with or even better than that of BPPb, a natural C > N-selective ACE-inhibitory peptide. Structural analysis and comparison of domain-peptide noncovalent interaction patterns revealed that (i) longer peptides (>4 amino acids) generally exhibit stronger selectivity than shorter peptides (<4 amino acids), (ii) the peptide sequence can be divided into two sections, section I (including the peptide C-terminal region) and section II (including the peptide middle and N-terminal regions); the former contributes to both peptide affinity (primarily) and selectivity (secondarily), while the latter is almost solely responsible for peptide selectivity, and (iii) charged/polar amino acids contribute to peptide selectivity, whereas hydrophobic/nonpolar amino acids contribute to peptide affinity.


Subjects
Antihypertensive Agents; Peptides; Humans; Amino Acid Sequence; Antihypertensive Agents/pharmacology; Antihypertensive Agents/chemistry; Antihypertensive Agents/metabolism; Protein Domains
6.
Proteomics ; 23(6): e2200175, 2023 03.
Article in English | MEDLINE | ID: mdl-36461811

ABSTRACT

Peptide-mediated interactions (PMIs) play a crucial role in cell signaling networks; they are responsible for about half of cellular protein-protein associations in the human interactome and have recently been recognized as a promising new kind of druggable target for drug development and disease therapy. In this article, we give a systematic review of the proteome-wide discovery of PMIs and of targeting druggable PMIs (dPMIs) with chemical drugs, self-inhibitory peptides (SIPs), and protein agents, particularly focusing on their implications and applications for therapeutic purposes in omics. We also introduce computational peptidology strategies used to model, analyze, and design PMI-targeted molecular entities and further extend the concepts of protein context, direct/indirect readout, and enthalpy/entropy effect involved in PMIs. Current issues and future perspectives on this topic are discussed. There is still a long way to go before efficient therapeutic strategies to target PMIs on the omics scale are established.


Subjects
Peptides; Proteins; Humans; Peptides/chemistry; Proteins/metabolism; Entropy
7.
Amino Acids ; 55(2): 235-242, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36474016

ABSTRACT

Peptide quantitative structure-activity relationships (pQSARs) have been widely applied to the statistical modeling and empirical prediction of peptide activities, properties, and features. In the procedure, the peptide structure is characterized at sequence level using amino acid descriptors (AADs) and then correlated with observations by machine learning methods (MLMs), resulting in a variety of quantitative regression models used to explain the structural factors that govern peptide activities, to infer the properties of unknown peptides from known samples, and to design new peptides with desired features. In this study, we developed a comprehensive platform, termed the PepQSAR database, which is a systematic collection and decomposition of various data sources and abundant information regarding pQSARs, including AADs, MLMs, data sets, peptide sequences, measured activities, model statistics, and literature references. The database also provides a comparison function for the various previously built pQSAR models reported by different groups via distinct approaches. The structured and searchable PepQSAR database is expected to provide a useful resource and powerful tool for the computational peptidology community, and is freely available at http://i.uestc.edu.cn/PQsarDB .


Subjects
Information Sources; Quantitative Structure-Activity Relationship; Peptides/chemistry; Amino Acid Sequence
8.
J Mol Recognit ; 36(3): e3006, 2023 03.
Article in English | MEDLINE | ID: mdl-36579779

ABSTRACT

Protein-peptide interactions (PpIs) play an important role in cell signaling networks and have been exploited as new and attractive therapeutic targets. The affinity and specificity are two unity-of-opposite aspects of PpIs (and other biomolecular interactions); the former indicates the absolute binding strength between the peptide ligand and its cognate protein receptor in a PpI, while the latter represents the relative recognition selectivity of the peptide ligand for its cognate protein receptor in a PpI over those noncognate decoys that could be potentially encountered by the peptide in cell. Although the PpI binding affinity has been widely investigated over the past decades, the peptide recognition specificity (and selectivity) still remains largely unexplored to date. In this study, we classified PpI specificity into three types: (i) class-I specificity: peptide selectivity for its cognate wild-type protein receptor over the noncognate mutant decoys of this receptor, (ii) class-II specificity: peptide selectivity for its cognate protein receptor over other noncognate decoys that are homologous with this receptor, and (iii) class-III specificity: peptide selectivity for its cognate protein receptor over other noncognate decoys that are the cognate receptors of other peptides. We performed affinity and selectivity analysis for the three types of PpI specificity and revealed that the PpIs generally exhibit a moderate or modest specificity; peptide selectivity increases in the order: class-I < class-II < class-III. All the three types of PpI specificity were observed to have no statistically significant correlation with peptide length and hydrophobicity, but the class-I and class-II specificities can be influenced considerably by peptide secondary structures; the high specificity is preferentially associated with ordered structure types as compared to undefined structure types. 
In addition, the mutation distribution (for class-I specificity), sequence conservation (for class-II specificity), and structural similarity (for class-III specificity) also appear to affect peptide selectivity.


Subjects
Peptides; Proton Pump Inhibitors; Ligands; Peptides/chemistry
9.
Membranes (Basel) ; 12(6), 2022 Jun 11.
Article in English | MEDLINE | ID: mdl-35736317

ABSTRACT

The separation of chloride and sulphate is important for the treatment of high-salt wastewater, and monovalent selective electrodialysis (MSED) has advantages in terms of energy consumption and pre-treatment costs compared to nanofiltration salt separation. Most of the research on monovalent anion-selective membranes (MASM) is still on a laboratory scale due to the preparation process, cost, and other reasons. In this study, a low-cost, easy-to-operate modification scheme was used to prepare MASM, which was applied to assemble a pilot-scale electrodialysis device to treat reverse osmosis concentrated water with a salt content of 4% to 5%. The results indicate that the optimum operating conditions for the device are a 250 L/h influent flow rate for the concentrate and dilute compartments, a 350 L/h influent flow rate for the electrode compartment, and a constant voltage of 20 V. Under these conditions, the pilot electrodialysis plant achieved Cl⁻ and SO₄²⁻ transmission rates of 80% and 2.54%, respectively, a separation efficiency (S) of 93.85%, and an energy consumption per unit of NaCl ($E_{\mathrm{NaCl}}$) of 0.344 kWh/kg. Analysis of the variation of the three selective separation performance parameters during electrodialysis indicates that the separation efficiency (S) is a more suitable parameter for measuring the selective separation performance of the device than the monovalent selectivity coefficient ($P_{\mathrm{SO_4^{2-}}}^{\mathrm{Cl^-}}$).
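
The abstract does not state how the separation efficiency S is computed. As a hedged sketch, the Python snippet below assumes S is the difference of the two transmission rates divided by their sum; this definition is an assumption, chosen only because it reproduces the reported 93.85% from the reported transmission rates of 80% and 2.54%. The function name is hypothetical.

```python
def separation_efficiency(t_cl_percent, t_so4_percent):
    """Return S in percent from the Cl- and SO4(2-) transmission rates (in percent).

    Assumed definition (not quoted from the paper):
        S = (T_Cl - T_SO4) / (T_Cl + T_SO4) * 100
    """
    total = t_cl_percent + t_so4_percent
    if total <= 0:
        raise ValueError("transmission rates must sum to a positive value")
    return 100.0 * (t_cl_percent - t_so4_percent) / total


# Reported pilot-plant values: 80% for Cl- and 2.54% for SO4(2-)
print(round(separation_efficiency(80.0, 2.54), 2))  # 93.85
```

Under this assumed definition, S stays between -100% and 100% whatever the concentrations are.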

10.
J Chem Inf Model ; 61(4): 1718-1731, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33710894

ABSTRACT

The peptide quantitative structure-activity relationship (QSAR), also known as the quantitative sequence-activity model (QSAM), has attracted much attention in the bio- and chemoinformatics communities and is a well-developed computational peptidology strategy to statistically correlate the sequence/structure and activity/property relationships of functional peptides. Amino acid descriptors (AADs) are one of the most widely used methods to characterize peptide structures by decomposing the peptide into its residue building blocks and sequentially parametrizing each building block with a vector of amino acid principal properties. Considering that various AADs have been proposed over the past decades and new AADs are still emerging today, we herein ask the following: is it necessary to develop so many AADs, and do we need to continuously develop more new AADs? In this study, we exhaustively collect 80 published AADs and comprehensively evaluate their modeling performance (including fitting ability, internal stability, and predictive power) on 8 QSAR-oriented peptide sample sets (QPSs) by employing 2 sophisticated machine learning methods (MLMs), building and systematically comparing a total of 1280 (80 AADs × 8 QPSs × 2 MLMs) peptide QSAR models. The following is revealed: (i) None of the AADs can work best on all or most peptide sets; an AAD usually performs well for some peptides but badly for others. (ii) Modeling performance is primarily determined by the peptide samples and then the MLMs used, while AADs have only a moderate influence on the performance. (iii) There is no essential difference between the modeling performances of different AAD types (physicochemical, topological, 3D-structural, etc.). (iv) Two random descriptors, which are separately generated randomly in standard normal distribution N(0, 1) and uniform distribution U(-1, +1), do not perform significantly worse than these carefully developed AADs. 
(v) A secondary descriptor, which carries major information involved in the 80 (primary) AADs, does not perform significantly better than these AADs. Overall, we conclude that since there are various AADs available to date and they already cover numerous amino acid properties, further development of new AADs is not an essential choice to improve peptide QSAR modeling; the traditional AAD methodology is believed to have almost reached the theoretical limit nowadays. In addition, the AADs are more likely to be a vector symbol but not informative data; they are utilized to mark and distinguish the 20 amino acids but do not really bring much original property information to these amino acids.
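
To make the residue-by-residue encoding and the random-descriptor baseline of point (iv) concrete, here is a minimal Python sketch. The descriptor dimension, the seed, and the function names `random_descriptor_table` and `encode_peptide` are illustrative assumptions, not details from the study; only the two distributions, N(0, 1) and U(-1, +1), come from the abstract.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues


def random_descriptor_table(dim, dist, seed=0):
    """Assign each amino acid a random vector of length `dim`.

    dist="normal" draws from N(0, 1); dist="uniform" draws from U(-1, +1).
    """
    rng = random.Random(seed)
    if dist == "normal":
        draw = lambda: rng.gauss(0.0, 1.0)
    elif dist == "uniform":
        draw = lambda: rng.uniform(-1.0, 1.0)
    else:
        raise ValueError("dist must be 'normal' or 'uniform'")
    return {aa: [draw() for _ in range(dim)] for aa in AMINO_ACIDS}


def encode_peptide(sequence, table):
    """Concatenate the per-residue vectors in sequence order."""
    features = []
    for residue in sequence:
        features.extend(table[residue])
    return features


table = random_descriptor_table(dim=3, dist="uniform", seed=1)
vector = encode_peptide("ACD", table)
print(len(vector))  # 9 features: 3 residues x 3 values each
```

A table of published descriptor values can replace the random one without changing `encode_peptide`, which is the substitution that the comparison in point (iv) relies on.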


Subjects
Amino Acids; Quantitative Structure-Activity Relationship; Models, Molecular; Peptides
11.
Front Genet ; 12: 800857, 2021.
Article in English | MEDLINE | ID: mdl-35096016

ABSTRACT

The protein-protein association in cellular signaling networks (CSNs) often acts as weak, transient, and reversible domain-peptide interaction (DPI), in which a flexible peptide segment on the surface of one protein is recognized and bound by a rigid peptide-recognition domain from another. Reliable modeling and accurate prediction of DPI binding affinities would help to ascertain the diverse biological events involved in CSNs and benefit our understanding of various biological implications underlying DPIs. Traditionally, peptide quantitative structure-activity relationship (pQSAR) has been widely used to model and predict the biological activity of oligopeptides; it employs amino acid descriptors (AADs) to characterize peptide structures at sequence level and then statistically correlates the resulting descriptor vector with observed activity data via regression. However, QSAR has not yet been widely applied to the direct binding behavior of large-scale peptide ligands to their protein receptors. In this work, we attempted to clarify whether the pQSAR methodology can work effectively for modeling and predicting DPI affinities in a high-throughput manner. Over twenty thousand short linear motif (SLiM)-containing peptide segments involved in SH3, PDZ and 14-3-3 domain-mediated CSNs were compiled to define a comprehensive sequence-based data set of DPI affinities, which were represented by the Boehringer light units (BLUs) derived from previous arbitrary light intensity assays following SPOT peptide synthesis. Four sophisticated machine learning methods (MLMs) were then utilized to perform pQSAR modeling on the set described with different AADs to systematically create a variety of linear and nonlinear predictors, which were then verified by rigorous statistical tests. It is revealed that genome-wide DPI events can only be modeled qualitatively or semiquantitatively with the traditional pQSAR strategy due to the intrinsic disorder of peptide conformation and the potential interplay between different peptide residues. In addition, the arbitrary BLUs used to characterize DPI affinity values were measured via an indirect approach, which may not be very reliable and may involve strong noise, thus leading to a considerable bias in the modeling. An $R_{\mathrm{prd}}^{2}$ of 0.7 can be considered the upper limit of the external generalization ability of the pQSAR methodology working on large-scale DPI affinity data.
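
The abstract does not spell out how the predictive coefficient of determination cited in its last sentence is calculated. The Python sketch below uses one common convention, 1 minus the ratio of the residual sum of squares to the total sum of squares on an external test set, with the test-set mean as reference. This is an assumption: some QSAR studies use the training-set mean instead, and the function name is hypothetical.

```python
def predictive_r2(observed, predicted):
    """Coefficient of determination on held-out data.

    Assumed convention: R2 = 1 - SS_res / SS_tot, with SS_tot taken
    around the mean of the observed test-set values.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


print(predictive_r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0 for a perfect predictor
print(predictive_r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0 when predicting the mean
```

With this convention, a value of 0.7 means the model explains about 70% of the variance in the external affinities.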

12.
J Immunol ; 196(2): 715-25, 2016 Jan 15.
Article in English | MEDLINE | ID: mdl-26673144

ABSTRACT

Alternative polyadenylation (APA) has been found to be involved in tumorigenesis, development, and cell differentiation, as well as in the activation of several subsets of immune cells in vitro. Whether APA takes place in immune responses in vivo is largely unknown. We profiled the variation in tandem 3' untranslated regions (UTRs) in pathogen-challenged zebrafish and identified hundreds of APA genes with ∼ 10% being immune response genes. The detected immune response APA genes were enriched in TLR signaling, apoptosis, and JAK-STAT signaling pathways. A greater number of microRNA target sites and AU-rich elements were found in the extended 3' UTRs than in the common 3' UTRs of these APA genes. Further analysis suggested that microRNA and AU-rich element-mediated posttranscriptional regulation plays an important role in modulating the expression of APA genes. These results indicate that APA is extensively involved in immune responses in vivo, and it may be a potential new paradigm for immune regulation.


Subjects
Polyadenylation/immunology; Spleen/immunology; Staphylococcal Infections/genetics; Zebrafish/genetics; Zebrafish/immunology; 3' Untranslated Regions; Animals; Gene Expression Profiling; Polymerase Chain Reaction