Results 1 - 14 of 14
1.
SLAS Technol ; 29(3): 100134, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38670311

ABSTRACT

Protocol standardization and sharing are crucial for reproducibility in life sciences. In spite of numerous efforts for standardized protocol description, adherence to these standards in literature remains largely inconsistent. Curation of protocols is especially challenging due to the labor-intensive process, requiring expert domain knowledge of each experimental procedure. Recent advancements in Large Language Models (LLMs) offer a promising solution to interpret and curate knowledge from complex scientific literature. In this work, we develop ProtoCode, a tool leveraging fine-tuned LLMs to curate protocols into intermediate representation formats that can be interpreted by both human and machine interfaces. Our proof-of-concept, focused on polymerase chain reaction (PCR) protocols, retrieves information from PCR protocols with an accuracy ranging from 69% to 100%, depending on the information content. In all tested protocols, we demonstrate that ProtoCode successfully converts literature-based protocols into correct operational files for multiple thermal cycler systems. In conclusion, ProtoCode can alleviate labor-intensive curation and standardization of life science protocols to enhance research reproducibility by providing a reliable, automated means to process and standardize protocols. ProtoCode is freely available as a web server at https://curation.taxila.io/ProtoCode/.


Subjects
Polymerase Chain Reaction , Polymerase Chain Reaction/methods , Humans , Software , Reproducibility of Results , Publications
2.
JMIR Form Res ; 8: e51732, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38227357

ABSTRACT

BACKGROUND: Maintaining good communication and engagement between people with dementia and their caregivers is a major challenge in dementia care. Cognitive stimulation is a psychosocial intervention that supports communication and engagement, and several digital applications for cognitive stimulation have been developed. Personalization is an important factor for obtaining sustainable benefits, but the time and effort required to personalize and optimize applications often makes them difficult for routine use by nonspecialist caregivers and families. Although artificial intelligence (AI) has great potential to support automation of the personalization process, its use is largely unexplored because of the lack of suitable data from which to develop and train machine learning models. OBJECTIVE: This pilot study aims to evaluate a digital application called Aikomi in Japanese care homes for its potential to (1) create and deliver personalized cognitive stimulation programs to promote communication and engagement between people with dementia and usual care staff and (2) capture meaningful personalized data suitable for the development of AI systems. METHODS: A modular technology platform was developed and used to create personalized programs for 15 people with dementia living in 4 residential care facilities in Japan with the cooperation of a family member or care staff. A single intervention with the program was conducted with the person with dementia together with a care staff member, and for some participants, smell stimulation was provided using selected smell sticks in conjunction with the digital program. All sessions were recorded using a video camera, and the combined personalized data obtained by the platform were analyzed. RESULTS: Most people with dementia (10/15, 67%) showed high levels of engagement (>40 on Engagement of a Person with Dementia Scale), and there were no incidences of negative reactions toward the programs. 
Care staff reported that some participants showed extended concentration and spontaneous communication while using Aikomi, which was not their usual behavior. Smell stimulation promoted engagement for some participants even when they were unable to identify the smell. No changes in well-being were observed following the intervention according to the Mental Function Impairment Scale. The level of response to each type of content in the stimulation program varied greatly according to the person with dementia, and personalized data captured by the Aikomi platform enabled understanding of correlations between stimulation content and responses for each participant. CONCLUSIONS: This study suggests that the Aikomi digital application is acceptable for use by persons with dementia and care staff and may have the potential to promote communication and engagement. The platform captures personalized data, which can provide suitable input for machine learning. Further investigation of Aikomi will be conducted to develop AI systems and create personalized digital cognitive stimulation applications that can be easily used by nonspecialist caregivers.

3.
NPJ Syst Biol Appl ; 9(1): 63, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38110446

ABSTRACT

Assessing the mutagenicity of chemicals is an essential task in the drug development process. Usually, databases and other structured sources for Ames mutagenicity exist, which have been carefully and laboriously curated from scientific publications. As knowledge accumulates over time, updating these databases becomes a recurring and often impractical overhead. In this paper, we first propose the problem of predicting the mutagenicity of chemicals from textual information in scientific publications. More simply, given a chemical and natural-language evidence from publications where the mutagenicity of the chemical is described, the goal of the model/algorithm is to predict whether it is potentially mutagenic or not. For this, we first construct a gold-standard data set and then propose MutaPredBERT, a prediction model fine-tuned on BioLinkBERT based on a question-answering formulation of the problem. We leverage transfer learning and large transformer-based models to achieve a Macro F1 score of >0.88 even with relatively small data for fine-tuning. Our work establishes the utility of large language models for the construction of structured knowledge bases directly from scientific publications.
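
For readers unfamiliar with the Macro F1 metric reported in this abstract, the following is a minimal Python sketch of how it is commonly computed: the F1 score is calculated per class and then averaged without weighting. The function name and the toy labels are illustrative assumptions and are not taken from the paper or its data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for illustration only.
truth = ["mutagenic", "mutagenic", "non-mutagenic", "non-mutagenic"]
preds = ["mutagenic", "non-mutagenic", "non-mutagenic", "non-mutagenic"]
print(macro_f1(truth, preds))
```

Because every class contributes equally to the average, Macro F1 penalizes a model that performs well only on the majority class.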


Subjects
Mutagens , Mutagens/toxicity , Databases, Factual
4.
BMC Bioinformatics ; 24(1): 290, 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37468830

ABSTRACT

BACKGROUND: The growing recognition of the microbiome's impact on human health and well-being has prompted extensive research into discovering the links between microbiome dysbiosis and disease (healthy) states. However, this valuable information is scattered in unstructured form within biomedical literature. The structured extraction and qualification of microbe-disease interactions are important. In parallel, recent advancements in deep-learning-based natural language processing algorithms have revolutionized language-related tasks such as ours. This study aims to leverage state-of-the-art deep-learning language models to extract microbe-disease relationships from biomedical literature. RESULTS: In this study, we first evaluate multiple pre-trained large language models within a zero-shot or few-shot learning context. In this setting, the models performed poorly out of the box, emphasizing the need for domain-specific fine-tuning of these language models. Subsequently, we fine-tune multiple language models (specifically, GPT-3, BioGPT, BioMedLM, BERT, BioMegatron, PubMedBERT, BioClinicalBERT, and BioLinkBERT) using labeled training data and evaluate their performance. Our experimental results demonstrate the state-of-the-art performance of these fine-tuned models (specifically GPT-3, BioMedLM, and BioLinkBERT), achieving an average F1 score, precision, and recall of over [Formula: see text] compared to the previous best of 0.74. CONCLUSION: Overall, this study establishes that pre-trained language models excel as transfer learners when fine-tuned with domain and problem-specific data, enabling them to achieve state-of-the-art results even with limited training data for extracting microbiome-disease interactions from scientific publications.


Subjects
Algorithms , Language , Humans , Natural Language Processing , Health Status , Learning
5.
Front Physiol ; 13: 933069, 2022.
Article in English | MEDLINE | ID: mdl-36117696

ABSTRACT

Text mining has been shown to be an auxiliary but key driver for modeling, data harmonization, and interpretation in biomedicine. Scientific literature holds a wealth of information, embodies cumulative knowledge, and remains the core basis on which mechanistic pathways, molecular databases, and models are built and refined. Text mining provides the necessary tools to automatically harness the potential of text. In this study, we show the potential of large-scale text mining for deriving novel insights, with a focus on the growing field of the microbiome. We first collected the complete set of abstracts relevant to the microbiome from PubMed and used our text mining and intelligence platform Taxila for analysis. We demonstrate the usefulness of text mining using two case studies. First, we analyze the geographical distribution of research and study locations for the field of the microbiome by extracting geographic mentions from text. Using this analysis, we were able to draw useful insights on the state of microbiome research with respect to geographical distribution and economic drivers. Next, to understand the relationships between diseases, the microbiome, and food, which are central to the field, we construct semantic relationship networks between these concepts. We show how such networks can yield useful insight with no prior knowledge encoded.

6.
Mutagenesis ; 37(3-4): 191-202, 2022 10 26.
Article in English | MEDLINE | ID: mdl-35554560

ABSTRACT

Assessing a compound's mutagenicity using machine learning is an important activity in the drug discovery and development process. Traditional methods of mutagenicity detection, such as the Ames test, are expensive, time-consuming, and labor-intensive. In this context, in silico methods that predict a compound's mutagenicity with high accuracy are important. Recently, machine-learning (ML) models have increasingly been proposed to improve the accuracy of mutagenicity prediction. While these models are used in practice, there is further scope to improve their accuracy. We hypothesize that choosing the right features to train the model can lead to better accuracy. We systematically consider and evaluate a combination of novel structural and molecular features which have the maximal impact on the accuracy of models. We rigorously evaluate these features against multiple classification models (from classical ML models to deep neural network models). The performance of the models was assessed using 5- and 10-fold cross-validation, and we show that our approach, using the molecule structure, molecular properties, and structural alerts as feature sets, successfully outperforms the state-of-the-art methods for mutagenicity prediction on the Hansen et al. benchmark dataset with an area under the receiver operating characteristic curve of 0.93. More importantly, our framework shows how combining features can improve model accuracy.
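
The area under the receiver operating characteristic curve (AUC) quoted in this abstract can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The short Python sketch below computes it directly from that definition; the function name and the toy scores are assumptions for illustration and do not reproduce the paper's evaluation pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive, e.g. mutagenic)
    scores: iterable of model scores, higher meaning "more likely positive"
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, made up for illustration only.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

This pairwise form is quadratic in the number of examples; it is meant to make the definition concrete rather than to be efficient on large benchmark sets.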


Subjects
Machine Learning , Mutagens , Mutagens/toxicity , Mutagens/chemistry , Neural Networks, Computer , Mutagenesis
7.
Cell Syst ; 11(3): 272-285.e9, 2020 09 23.
Article in English | MEDLINE | ID: mdl-32898474

ABSTRACT

Accurately profiling systemic immune responses to cancer initiation and progression is necessary for understanding tumor surveillance and, ultimately, improving therapy. Here, we describe the SYLARAS software tool (systemic lymphoid architecture response assessment) and a dataset collected with SYLARAS that describes the frequencies of immune cells in primary and secondary lymphoid organs and in the tumor microenvironment of mice engrafted with a standard syngeneic glioblastoma (GBM) model. The data resource involves profiles of 5 lymphoid tissues in 48 mice and shows that GBM causes widespread changes in the local and systemic immune architecture. We use SYLARAS to identify a subset of CD45R/B220+ CD8+ T cells that is depleted from circulation but accumulates in the tumor mass and confirm this finding using multiplexed immunofluorescence microscopy. SYLARAS is freely available for download at https://github.com/gjbaker/sylaras. A record of this paper's transparent peer review process is included in the Supplemental Information.


Subjects
Brain Neoplasms/epidemiology , Brain Neoplasms/immunology , Glioblastoma/epidemiology , Glioblastoma/immunology , Animals , Humans , Mice
8.
IEEE/ACM Trans Comput Biol Bioinform ; 17(5): 1691-1702, 2020.
Article in English | MEDLINE | ID: mdl-30869630

ABSTRACT

We are interested in studying the evolution of large homogeneous populations of cells, where each cell is assumed to be composed of a group of biological players (species) whose dynamics is governed by a complex biological pathway, identical for all cells. Modeling the inherent variability of the species concentrations in different cells is crucial to understand the dynamics of the population. In this work, we focus on handling this variability by modeling each species by a random variable that evolves over time. This appealing approach runs into the curse of dimensionality since exactly representing a joint probability distribution involving a large set of random variables quickly becomes intractable as the number of variables grows. To make this approach amenable to biopathways, we explore different techniques to (i) approximate the exact joint distribution at a given time point, and (ii) to track its evolution as time elapses. We start with the problem of approximating the probability distribution of biological species in a population of cells at some given time point. Data come from different fine-grained models of biological pathways of increasing complexities, such as (perturbed) Ordinary Differential Equations (ODEs). Classical approximations rely on the strong and unrealistic assumption that variables/species are independent, or that they can be grouped into small independent clusters. We propose instead to use the Chow-Liu tree representation, based on overlapping clusters of two variables, which better captures correlations between variables. Our experiments show that the proposed approximation scheme is more accurate than existing ones to model probability distributions deriving from biopathways. Then we address the problem of tracking the dynamics of a population of cells, that is computing from an initial distribution the evolution of the (approximate) joint distribution of species over time, called the inference problem. 
We evaluate several approximate inference algorithms (e.g., [14], [17]) for coarse-grained abstractions [12], [16] of biological pathways. Using the Chow-Liu tree approximation, we develop a new inference algorithm which is very accurate according to the experiments we report, at minimal computational overhead. Our implementation is available at https://codeocean.com/capsule/6491669/tree.
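
The Chow-Liu tree representation mentioned in this abstract is classically built by computing the mutual information between every pair of variables and keeping a maximum-weight spanning tree over those pairs. The Python sketch below illustrates that textbook construction on discrete samples. It is not the authors' implementation; the function names and the toy samples are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def chow_liu_edges(samples):
    """Return the edges of a maximum-mutual-information spanning tree.

    samples: list of equal-length tuples, one discrete value per variable.
    """
    num_vars = len(samples[0])
    columns = list(zip(*samples))
    # Candidate edges sorted by decreasing mutual information.
    candidates = sorted(
        (
            (mutual_information(columns[i], columns[j]), i, j)
            for i, j in combinations(range(num_vars), 2)
        ),
        reverse=True,
    )
    parent = list(range(num_vars))  # union-find forest for Kruskal's algorithm

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    edges = []
    for _, i, j in candidates:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so no cycle
            parent[root_i] = root_j
            edges.append((i, j))
    return edges


# Toy samples: variable 1 copies variable 0, variable 2 is unrelated.
toy = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
print(chow_liu_edges(toy))
```

The resulting tree keeps only pairwise dependencies, which is what makes the joint distribution tractable to store: one two-variable table per edge instead of a table over all variables.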


Subjects
Cells, Cultured , Computational Biology/methods , Models, Biological , Algorithms , Apoptosis , Bayes Theorem , Cells, Cultured/classification , Cells, Cultured/cytology , Cytological Techniques , Multivariate Analysis
9.
Bioinformatics ; 33(13): 1980-1986, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28200026

ABSTRACT

MOTIVATION: Quantitative models are increasingly used in systems biology. Usually, these quantitative models involve many molecular species and their associated reactions. When simulating a tissue with thousands of cells, using these large models becomes computationally and time limiting. RESULTS: In this paper, we propose to construct abstractions using information theory notions. Entropy is used to discretize the state space and mutual information is used to select a subset of all original variables and their mutual dependencies. We apply our method to a hybrid model of TRAIL-induced apoptosis in HeLa cells. Our abstraction, represented as a Dynamic Bayesian Network (DBN), reduces the number of variables from 92 to 10 and accelerates numerical simulation by an order of magnitude, while preserving essential features of cell death time distributions. AVAILABILITY AND IMPLEMENTATION: This approach is implemented in the tool DBNizer, freely available at http://perso.crans.org/genest/DBNizer . CONTACT: gregory.batt@inria.fr or bgenest@irisa.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Apoptosis , Models, Biological , Software , Systems Biology/methods , Algorithms , Entropy , HeLa Cells , Humans , Information Theory
10.
Sci Signal ; 9(436): ra70, 2016 07 12.
Article in English | MEDLINE | ID: mdl-27405980

ABSTRACT

Toll-like receptors (TLRs) recognize pathogen-associated molecular patterns (PAMPs) and stimulate the innate immune response through the production of cytokines. The innate immune response depends on the timing of encountering PAMPs, suggesting a short-term "memory." In particular, activation of TLR3 appears to prime macrophages for the subsequent activation of TLR7, which leads to synergistically increased production of cytokines. By developing a calibrated mathematical model for the kinetics of TLR3 and TLR7 pathway crosstalk and providing experimental validation, we demonstrated the involvement of the Janus-activated kinase (JAK)-signal transducer and activator of transcription (STAT) pathway in controlling the synergistic production of cytokines. Signaling through this pathway played a dual role: It mediated the synergistic production of cytokines, thus boosting the immune response, and it also maintained homeostasis to avoid an excessive inflammatory response. Thus, we propose that the JAK-STAT pathway provides a cytokine rheostat mechanism, which enables macrophages to fine-tune their responses to multiple, temporally separated infection events involving the TLR3 and TLR7 pathways.


Subjects
Homeostasis/immunology , Immunity, Innate/physiology , Immunologic Memory/physiology , Membrane Glycoproteins/immunology , Models, Immunological , Signal Transduction/immunology , Toll-Like Receptor 3/immunology , Toll-Like Receptor 7/immunology , Animals , Cell Line , Female , Janus Kinases/genetics , Janus Kinases/immunology , Membrane Glycoproteins/genetics , Mice , Mice, Inbred BALB C , Mice, Transgenic , STAT Transcription Factors/genetics , STAT Transcription Factors/immunology , Signal Transduction/genetics , Toll-Like Receptor 3/genetics , Toll-Like Receptor 7/genetics
11.
IEEE Trans Biomed Eng ; 63(10): 2007-14, 2016 10.
Article in English | MEDLINE | ID: mdl-27305665

ABSTRACT

OBJECTIVE: Whole-cell (WC) modeling is a promising tool for biological research, bioengineering, and medicine. However, substantial work remains to create accurate comprehensive models of complex cells. METHODS: We organized the 2015 Whole-Cell Modeling Summer School to teach WC modeling and evaluate the need for new WC modeling standards and software by recoding a recently published WC model in the Systems Biology Markup Language. RESULTS: Our analysis revealed several challenges to representing WC models using the current standards. CONCLUSION: We, therefore, propose several new WC modeling standards, software, and databases. SIGNIFICANCE: We anticipate that these new standards and software will enable more comprehensive models.


Subjects
Computer Simulation , Models, Biological , Software , Systems Biology/standards , Computational Biology , Cytological Techniques , Female , Humans , Male , Systems Biology/education , Systems Biology/organization & administration
12.
Bioinformatics ; 28(11): 1508-16, 2012 Jun 01.
Article in English | MEDLINE | ID: mdl-22492313

ABSTRACT

MOTIVATION: Biopathways are often modeled as systems of ordinary differential equations (ODEs). Such systems will usually have many unknown parameters and hence will be difficult to calibrate. Since the data available for calibration will have limited precision, an approximate representation of the ODE dynamics should suffice. One must, however, be able to efficiently construct such approximations for large models and perform model calibration and subsequent analysis. RESULTS: We present a graphics processing unit (GPU) based scheme by which a system of ODEs is approximated as a dynamic Bayesian network (DBN). We then construct a model checking procedure for DBNs based on a simple probabilistic linear time temporal logic. The GPU implementation considerably extends the reach of our previous PC-cluster-based implementation (Liu et al., 2011b). Further, the key components of our algorithm can serve as the GPU kernel for other Monte Carlo simulation-based analyses of biopathway dynamics. Similarly, our model checking framework is a generic one and can be applied in other systems biology settings. We have tested our methods on three ODE models of biopathways: the epidermal growth factor-nerve growth factor pathway, the segmentation clock network, and the MLC-phosphorylation pathway models. The GPU implementation shows significant gains in performance and scalability, whereas the model checking framework turns out to be convenient and efficient for specifying and verifying interesting pathway properties. AVAILABILITY: The source code is freely available at http://www.comp.nus.edu.sg/~rpsysbio/pada-gpu/


Subjects
Biological Clocks , Models, Biological , Signal Transduction , Systems Biology/methods , Algorithms , Bayes Theorem , Computer Graphics , Epidermal Growth Factor/metabolism , Humans , Monte Carlo Method , Myosin Light Chains/metabolism , Nerve Growth Factor/metabolism , Programming Languages , Software , Thrombin/metabolism
13.
Article in English | MEDLINE | ID: mdl-22529330

ABSTRACT

Dynamic Bayesian Networks (DBNs) can serve as succinct probabilistic dynamic models of biochemical networks. To analyze these models, one must compute the probability distribution over system states at a given time point. Doing this exactly is infeasible for large models; hence one must use approximate algorithms. The Factored Frontier algorithm (FF) is one such algorithm. However, FF as well as the earlier Boyen-Koller (BK) algorithm can incur large errors. To address this, we present a new approximate algorithm called the Hybrid Factored Frontier (HFF) algorithm. At each time slice, in addition to maintaining probability distributions over local states, as FF does, HFF explicitly maintains the probabilities of a number of global states called spikes. When the number of spikes is 0, we get FF, and with all global states as spikes, we get the exact inference algorithm. We show that by increasing the number of spikes one can reduce errors while the additional computational effort required is only quadratic in the number of spikes. We validated the performance of HFF on large DBN models of biopathways. Each pathway has more than 30 species and the corresponding DBN has more than 3,000 nodes. Comparisons with FF and BK show that HFF is a useful and powerful approximate inferencing algorithm for DBNs.
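
To make the baseline concrete, the Python sketch below shows one update step of the plain Factored Frontier idea described in this abstract: the joint distribution is approximated by a product of per-variable marginals, and each marginal at the next time slice is obtained by summing the conditional probability table over parent configurations weighted by the current parent marginals. It does not implement the spikes of HFF. The data layout, the function name, and the toy network are assumptions for illustration only.

```python
from itertools import product


def ff_step(marginals, parents, cpts):
    """One Factored Frontier update on a DBN with discrete variables.

    marginals[i][v]      : approximate P(X_i = v) at time t
    parents[i]           : tuple of variable indices at time t that X_i at t+1 depends on
    cpts[i][pa_vals][v]  : P(X_i at t+1 = v | parents at t = pa_vals)
    Returns the list of marginals at time t+1.
    """
    new_marginals = []
    for i, pa in enumerate(parents):
        dist = [0.0] * len(marginals[i])
        # Enumerate every joint assignment of the parents of X_i.
        for pa_vals in product(*(range(len(marginals[p])) for p in pa)):
            # FF assumption: parents are treated as independent at time t.
            weight = 1.0
            for p, v in zip(pa, pa_vals):
                weight *= marginals[p][v]
            for v in range(len(dist)):
                dist[v] += weight * cpts[i][pa_vals][v]
        new_marginals.append(dist)
    return new_marginals


# Toy two-variable network (hypothetical numbers).
parents = [(0,), (0, 1)]
cpts = [
    {(0,): [0.9, 0.1], (1,): [0.2, 0.8]},
    {(0, 0): [1.0, 0.0], (0, 1): [1.0, 0.0], (1, 0): [1.0, 0.0], (1, 1): [0.0, 1.0]},
]
print(ff_step([[0.5, 0.5], [0.5, 0.5]], parents, cpts))
```

The independence assumption in the weight computation is the source of the approximation error that the abstract says HFF reduces by additionally tracking selected global states.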


Subjects
Algorithms , Bayes Theorem , Signal Transduction/physiology , Models, Statistical
14.
BMC Bioinformatics ; 13 Suppl 17: S15, 2012.
Article in English | MEDLINE | ID: mdl-23282174

ABSTRACT

Statistical model checking techniques have been shown to be effective for approximate model checking on large stochastic systems, where explicit representation of the state space is impractical. Importantly, these techniques ensure the validity of results with statistical guarantees on errors. There is an increasing interest in these classes of algorithms in computational systems biology since analysis using traditional model checking techniques does not scale well. In this context, we present two improvements to existing statistical model checking algorithms. Firstly, we construct an algorithm which removes the need for the user to define the indifference region, a critical parameter in previous sequential hypothesis testing algorithms. Secondly, we extend the algorithm to account for the case when there may be a limit on the computational resources that can be spent on verifying a property; i.e., if the original algorithm is not able to make a decision even after consuming the available amount of resources, we resort to a p-value based approach to make a decision. We demonstrate the improvements achieved by our algorithms in comparison to current algorithms, first with a straightforward yet representative example, followed by a real biological model on cell fate of gustatory neurons with microRNAs.
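
As background for the "indifference region" mentioned in this abstract, the Python sketch below shows the classical Wald sequential probability ratio test as it is commonly used in statistical model checking, where the user must supply a half-width delta around the probability threshold theta. This is the kind of baseline the abstract refers to, not the improved algorithm the authors propose; the function name, parameter names, and default error bounds are assumptions for illustration.

```python
import math


def sprt(outcomes, theta, delta, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test on Boolean simulation outcomes.

    Tests H0: p >= theta + delta against H1: p <= theta - delta, where p is the
    probability that a simulated trace satisfies the property.
    alpha bounds the chance of accepting H1 when H0 holds; beta bounds the converse.
    Requires 0 < theta - delta and theta + delta < 1.
    Returns (decision, number_of_samples_used).
    """
    p0, p1 = theta + delta, theta - delta
    accept_h1 = math.log((1 - beta) / alpha)   # upper stopping bound
    accept_h0 = math.log(beta / (1 - alpha))   # lower stopping bound
    log_ratio = 0.0
    used = 0
    for satisfied in outcomes:
        used += 1
        # Add the log-likelihood ratio of this outcome under H1 versus H0.
        if satisfied:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= accept_h1:
            return "accept H1", used
        if log_ratio <= accept_h0:
            return "accept H0", used
    return "undecided", used


# Toy run: every simulated trace satisfies the property.
print(sprt([True] * 20, theta=0.5, delta=0.1))
```

A narrower delta makes the two hypotheses harder to tell apart and so requires more samples before a bound is crossed, which is why the choice of indifference region matters in practice. The "undecided" outcome corresponds to the situation the abstract describes, where the sample budget runs out before the test stops.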


Subjects
Models, Biological , Models, Statistical , Systems Biology/statistics & numerical data , Algorithms , Animals , Caenorhabditis elegans/physiology , Cell Differentiation , Neural Pathways , Neurons/physiology , Taste/physiology