Results 1 - 10 of 10
1.
Mol Cell Proteomics ; 23(1): 100682, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37993103

ABSTRACT

Global phosphoproteomics experiments quantify tens of thousands of phosphorylation sites. However, data interpretation is hampered by our limited knowledge of the functions, biological contexts, or precipitating enzymes of the phosphosites. This study establishes a repository of phosphosites with associated evidence in biomedical abstracts, using deep learning-based natural language processing techniques. Our model for illuminating the dark phosphoproteome through PubMed mining (IDPpub) was generated by fine-tuning BioBERT, a deep learning tool for biomedical text mining. Trained using sentences containing protein substrates and phosphorylation site positions from 3000 abstracts, the IDPpub model was then used to extract phosphorylation sites from all MEDLINE abstracts. The extracted proteins were normalized to gene symbols using the National Center for Biotechnology Information gene query, and sites were mapped to human UniProt sequences using ProtMapper and to mouse UniProt sequences by direct match. Precision and recall were calculated using 150 curated abstracts, and utility was assessed by analyzing the CPTAC (Clinical Proteomics Tumor Analysis Consortium) pan-cancer phosphoproteomics datasets and the PhosphoSitePlus database. Using 10-fold cross-validation, pairs of correct substrates and phosphosite positions were extracted with an average precision of 0.93 and recall of 0.94. After entity normalization and site mapping to human reference sequences, an independent validation achieved a precision of 0.91 and recall of 0.77. The IDPpub repository contains 18,458 unique human phosphorylation sites with evidence sentences from 58,227 abstracts and 5918 mouse sites in 14,610 abstracts. This includes evidence sentences for 1803 sites identified in CPTAC studies that are not covered by manually curated functional information in PhosphoSitePlus. Evaluation results demonstrate the potential of IDPpub as an effective biomedical text mining tool for collecting phosphosites.
Moreover, the repository (http://idppub.ptmax.org), which can be automatically updated, can serve as a powerful complement to existing resources.
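The training setup described above (sentences with known substrates and site positions, used to fine-tune BioBERT) implies token-level labels of the kind used for named-entity recognition. A minimal sketch, with a hypothetical sentence and BIO tagging scheme (not the actual IDPpub training format):

```python
# Convert a sentence with known substrate and phosphosite token spans into
# BIO labels, the standard input format for fine-tuning a BERT-style NER model.
def bio_tag(tokens, spans):
    """Label tokens as B-/I-<type> for each (start, end, type) token span."""
    labels = ["O"] * len(tokens)
    for start, end, etype in spans:
        labels[start] = "B-" + etype
        for i in range(start + 1, end):
            labels[i] = "I-" + etype
    return labels

tokens = ["EGFR", "is", "phosphorylated", "at", "Tyr1068", "by", "Src"]
# substrate "EGFR" at token 0, site "Tyr1068" at token 4 (invented example)
spans = [(0, 1, "SUBSTRATE"), (4, 5, "SITE")]
labels = bio_tag(tokens, spans)
```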


Subjects
Data Mining, Natural Language Processing, Humans, Data Mining/methods, Databases, Factual, PubMed
2.
BMC Bioinformatics ; 24(Suppl 3): 477, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38102593

ABSTRACT

BACKGROUND: As more clinical trials offer optional participation in the collection of bio-specimens for biobanking, the requirements of informed consent forms grow increasingly complex. The aim of this study was to develop an automatic natural language processing (NLP) tool to annotate informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources and manually annotated them, covering sentences containing permission information about sharing bio-specimens or donor data, or conducting genetic research or future research using bio-specimens or donor data. RESULTS: We evaluated a variety of machine learning algorithms, including random forest (RF) and support vector machine (SVM), for the automatic identification of these sentences. 120 informed consent documents containing 29,204 sentences were annotated, of which 1250 sentences (4.28%) provide answers to a permission question. An SVM model achieved an F1 score of 0.95 on classifying the sentences when using a gold standard, a prefiltered corpus containing all relevant sentences. CONCLUSIONS: This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.
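A minimal sketch of the kind of sentence classifier evaluated in the study: TF-IDF features with a linear SVM in scikit-learn. The toy sentences, labels, and feature choices below are invented for illustration; the study's actual features and implementation are not specified here.

```python
# Classify consent-form sentences as permission-related or not, using a
# TF-IDF bag-of-words representation and a linear support vector machine.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

sentences = [
    "Your specimens may be shared with other researchers.",
    "We may use your samples for future genetic research.",
    "You will receive a copy of this signed form.",
    "Parking is available at the clinic entrance.",
]
labels = ["permission", "permission", "other", "other"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(sentences, labels)
pred = clf.predict(["Your donated samples may be shared for research."])[0]
```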


Subjects
Biological Specimen Banks, Consent Forms, Machine Learning, Algorithms, Natural Language Processing
3.
Clin Gastroenterol Hepatol ; 21(5): 1198-1204, 2023 05.
Article in English | MEDLINE | ID: mdl-36115659

ABSTRACT

BACKGROUND & AIMS: Identifying dysplasia of Barrett's esophagus (BE) in the electronic medical record (EMR) requires manual abstraction of unstructured data. Natural language processing (NLP) creates structure from unstructured free text. We aimed to develop and validate an NLP algorithm to identify dysplasia in BE patients from histopathology reports with varying report formats in a large integrated EMR system. METHODS: We randomly selected 600 pathology reports for NLP development and 400 reports for validation from patients with suspected BE in the national Veterans Affairs databases. BE and dysplasia were verified by manual review of the pathology reports. We used NLP software (Clinical Language Annotation, Modeling, and Processing Toolkit; Melax Tech, Houston, TX) to develop an algorithm that identifies dysplasia from the reported findings. The algorithm's performance characteristics were calculated as recall, precision, accuracy, and F-measure. RESULTS: In the development set of 600 patients, 457 patients had confirmed BE (60 with dysplasia). The NLP identified dysplasia with 98.0% accuracy, 91.7% recall, and 93.2% precision, with an F-measure of 92.4%. All 7 patients with confirmed high-grade dysplasia were classified by the algorithm as having dysplasia. Among the 400 patients in the validation cohort, 230 had confirmed BE (39 with dysplasia). Compared with manual review, the NLP algorithm identified dysplasia with 98.7% accuracy, 92.3% recall, and 100.0% precision, with an F-measure of 96.0%. CONCLUSIONS: NLP yielded a high degree of sensitivity and accuracy in identifying dysplasia from diverse types of pathology reports for patients with BE. Applying this algorithm would facilitate research and clinical care in an EMR system with text reports in large data repositories.
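The reported F-measures can be checked directly from the reported precision and recall, since the F-measure is their harmonic mean:

```python
# F1 (F-measure) as the harmonic mean of precision and recall.
def f_measure(precision, recall):
    return 2 * precision * recall / (precision + recall)

# Validation cohort: 100.0% precision, 92.3% recall -> reported F-measure 96.0%
f_val = f_measure(1.000, 0.923)
# Development set: 93.2% precision, 91.7% recall -> reported F-measure 92.4%
f_dev = f_measure(0.932, 0.917)
```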


Subjects
Barrett Esophagus, Humans, Barrett Esophagus/complications, Barrett Esophagus/diagnosis, Natural Language Processing, Software, Algorithms, Hyperplasia
4.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 1456-1459, 2022 07.
Article in English | MEDLINE | ID: mdl-36085960

ABSTRACT

Channel suppression can reduce redundant information in multichannel receiver coils and accelerate reconstruction to meet real-time imaging requirements. Principal component analysis (PCA) has been used for channel suppression, but it is difficult to interpret because all channels contribute to every principal component. Moreover, the importance of interpretability in machine learning has recently attracted increasing attention in radiology. To improve the interpretability of PCA-based channel suppression, a sparse PCA method is proposed that drives most coils' loadings to zero. Channel suppression is formulated as a nonlinear eigenvalue problem solved by the inverse power method instead of direct matrix decomposition. Experimental results on in vivo data show that sparse PCA-based channel suppression not only improves interpretability through sparse channel loadings but also improves reconstruction quality compared to standard PCA-based reconstruction, with similar reconstruction time.
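A NumPy-only sketch of the core idea, sparse loadings that keep only the informative channels: power iteration on the channel covariance with soft-thresholding. This simple truncated power method and the synthetic data are illustrations of the concept, not the paper's inverse-power-method formulation.

```python
# Leading sparse principal component of an 8-"channel" dataset: channels 0-1
# share a strong common signal, the rest are noise. Thresholding inside the
# power iteration drives the noise channels' loadings to exactly zero.
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 8)) * 0.1          # 200 samples, 8 channels
data[:, :2] += rng.standard_normal((200, 1)) * 3.0  # shared signal in ch 0-1
cov = data.T @ data / len(data)                     # channel covariance

v = np.ones(8) / np.sqrt(8)
for _ in range(50):
    v = cov @ v
    # soft-threshold small loadings to zero, then renormalize
    v = np.sign(v) * np.maximum(np.abs(v) - 0.5 * np.abs(v).max(), 0)
    v /= np.linalg.norm(v)
# v now has nonzero loadings only on the dominant channels
```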


Subjects
Algorithms, Plastic Surgery Procedures, Magnetic Resonance Imaging/methods, Principal Component Analysis, Records
5.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 599-602, 2022 07.
Article in English | MEDLINE | ID: mdl-36085691

ABSTRACT

Ker NL is a general kernel-based framework for auto calibrated reconstruction method, which does not need any explicit formulas of the kernel function for characterizing nonlinear relationships between acquired and unacquired k-space data. It is non-iterative without requiring a large amount of computational costs. Since the limited autocalibration signals (ACS) are acquired to perform KerNL calibration and the calibration suffers from the overfitting problem, more training data can improve the kernel model accuracy. In this work, virtual conjugate coil data are incorporated into the KerNL calibration and estimation process for enhancing reconstruction performance. Experimental results show that the proposed method can further suppress noise and aliasing artifacts with fewer ACS data and higher acceleration factors. Computation efficiency is still retained to keep fast reconstruction with the random projection.
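The virtual conjugate coil (VCC) idea can be sketched in a few lines: each physical coil contributes a virtual coil whose k-space data are the complex conjugate of the original data reflected through the k-space center, roughly doubling the data available for calibration. The toy 1D data below are synthetic, and the grid reflection via `np.flip` is a simplified stand-in for the exact S*(-k) index mapping.

```python
# Augment multicoil k-space data with virtual conjugate coils.
import numpy as np

rng = np.random.default_rng(1)
n = 64
# 4 coils, 1D k-space lines of complex samples (synthetic)
kspace = rng.standard_normal((4, n)) + 1j * rng.standard_normal((4, n))

def virtual_conjugate(k):
    """VCC data: S_vcc(k) = conj(S(-k)); flip reflects the sampled grid."""
    return np.conj(np.flip(k, axis=-1))

vcc = virtual_conjugate(kspace)
augmented = np.concatenate([kspace, vcc], axis=0)  # 8 "coils" for calibration
```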


Subjects
Acceleration, Artifacts, Calibration
6.
Magn Reson Imaging ; 92: 108-119, 2022 10.
Article in English | MEDLINE | ID: mdl-35772581

ABSTRACT

Autocalibration signals (ACS) are acquired in k-space-based parallel MRI reconstruction to estimate interpolation coefficients and reconstruct missing unacquired data. Many ACS lines can suppress aliasing artifacts and noise by covering the low-frequency signal region. However, more ACS lines prolong data acquisition and therefore lengthen the scan time. Furthermore, a single interpolator is often used for recovering missing k-space data, and model error may arise if the interpolator size is not selected appropriately. In this work, based on the idea of disagreement-based semi-supervised learning, a dual-interpolator strategy is proposed to collaboratively reconstruct missing k-space data. Two interpolators with different sizes are applied alternately to estimate and re-estimate the missing data in k-space. The disagreement between the two interpolators converges, and the true missing values are co-estimated from the two views. Experimental results show that the proposed method outperforms the GRAPPA, SPIRiT, and nonlinear GRAPPA methods using a relatively small number of ACS data, and reduces aliasing artifacts and noise in reconstructed images.
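The calibration step shared by GRAPPA-style methods can be illustrated with a toy 1D example: missing samples are modeled as a linear combination of acquired neighbors, and the weights are solved by least squares on an ACS region. The data and weights below are synthetic and exactly satisfy the linear model; the paper's dual-interpolator scheme alternates two such interpolators of different sizes.

```python
# Calibrate interpolation weights on ACS data, then reconstruct missing samples.
import numpy as np

rng = np.random.default_rng(2)
w_true = np.array([0.6, 0.4])          # ground-truth interpolation weights
acquired = rng.standard_normal(40)     # "acquired" k-space samples
# each "missing" sample is a weighted sum of its two acquired neighbors
missing = w_true[0] * acquired[:-1] + w_true[1] * acquired[1:]

# Calibration: least-squares fit of the weights on an ACS subset
A = np.column_stack([acquired[:20], acquired[1:21]])
w_est = np.linalg.lstsq(A, missing[:20], rcond=None)[0]

# Reconstruction: estimate the remaining missing samples with the weights
recon = w_est[0] * acquired[20:-1] + w_est[1] * acquired[21:]
```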


Subjects
Algorithms, Image Enhancement, Artifacts, Image Enhancement/methods, Image Processing, Computer-Assisted/methods, Magnetic Resonance Imaging/methods, Radionuclide Imaging
7.
Aliment Pharmacol Ther ; 54(4): 481-492, 2021 08.
Article in English | MEDLINE | ID: mdl-34224163

ABSTRACT

BACKGROUND: Previous studies have demonstrated an association between nonselective beta-blockers (NSBBs) and a lower risk of hepatocellular carcinoma (HCC) in cirrhosis. However, there has been no population-based study investigating the risk of HCC among cirrhotic patients treated with carvedilol. AIMS: To determine the risk of HCC among cirrhotic patients taking NSBBs, including carvedilol. METHODS: This retrospective cohort study utilised the Cerner Health Facts database in the United States from 2000 to 2017. Kaplan-Meier estimates, Cox proportional hazards regression, and propensity score matching (PSM) were used to test the HCC risk in the carvedilol, nadolol, and propranolol groups against a no-beta-blocker group. RESULTS: The final cohort comprised 107 428 eligible patients. The 100-month cumulative HCC incidence of each NSBB group was significantly lower than that of the no-beta-blocker group: carvedilol 11.24% vs 15.69%, nadolol 27.55% vs 32.11%, and propranolol 26.17% vs 28.84% (all P values < 0.0001). NSBBs were associated with a significantly lower risk of HCC after PSM in the multivariable Cox analysis (hazard ratios: carvedilol 0.61 (95% CI 0.51-0.73), nadolol 0.74 (95% CI 0.63-0.87), propranolol 0.75 (95% CI 0.66-0.84)). In subgroup analyses, NSBBs reduced the risk of HCC in cirrhosis with complications and in non-alcoholic cirrhosis. CONCLUSIONS: NSBBs, including carvedilol, were associated with a significantly decreased risk of HCC in patients with cirrhosis compared with no beta-blocker, regardless of complication status. Future randomised controlled studies comparing the incidence of HCC among NSBBs should elucidate which NSBB is the best option to prevent HCC in cirrhosis.
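A minimal sketch of the propensity score matching step in this design: estimate each patient's probability of treatment from covariates with logistic regression, then greedily match each treated patient to the untreated patient with the nearest score, without replacement. The data, covariates, and 1:1 greedy matching rule below are invented for illustration; the study's actual matching specification is not given in the abstract.

```python
# Propensity score matching on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.standard_normal((300, 2))                      # covariates (hypothetical)
treated = (X[:, 0] + rng.standard_normal(300) > 0.5).astype(int)

# Propensity score: P(treatment | covariates)
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

t_idx = np.where(treated == 1)[0]
c_idx = list(np.where(treated == 0)[0])
pairs = []
for i in t_idx:                                        # greedy nearest match
    j = min(c_idx, key=lambda c: abs(ps[c] - ps[i]))
    pairs.append((i, j))
    c_idx.remove(j)                                    # without replacement
```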


Subjects
Carcinoma, Hepatocellular, Liver Neoplasms, Adrenergic beta-Antagonists/therapeutic use, Carcinoma, Hepatocellular/epidemiology, Carcinoma, Hepatocellular/etiology, Carcinoma, Hepatocellular/prevention & control, Humans, Liver Cirrhosis/epidemiology, Liver Neoplasms/epidemiology, Liver Neoplasms/etiology, Liver Neoplasms/prevention & control, Retrospective Studies, United States/epidemiology
8.
J Am Med Inform Assoc ; 28(7): 1393-1400, 2021 07 14.
Article in English | MEDLINE | ID: mdl-33647938

ABSTRACT

OBJECTIVE: Automated analysis of vaccine postmarketing surveillance narrative reports is important for understanding the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports. MATERIALS AND METHODS: We collected Guillain-Barré syndrome (GBS)-related influenza vaccine safety reports from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016. VAERS reports were selected and manually annotated with major entities related to nervous system disorders, including investigation, nervous_AE, other_AE, procedure, social_circumstance, and temporal_expression. A variety of conventional machine learning and deep learning algorithms were then evaluated for the extraction of these entities. We further pretrained a domain-specific BERT (Bidirectional Encoder Representations from Transformers) model using VAERS reports (VAERS BERT) and compared its performance with existing models. RESULTS AND CONCLUSIONS: Ninety-one VAERS reports were annotated, yielding 2512 entities. The corpus was made publicly available to promote community efforts on vaccine AE identification. Deep learning-based methods (eg, bi-directional long short-term memory and BERT models) outperformed conventional machine learning-based methods (ie, conditional random fields with extensive features). The BioBERT large model achieved the highest exact-match F1 scores on nervous_AE, procedure, social_circumstance, and temporal_expression, while the VAERS BERT large model achieved the highest exact-match F1 scores on investigation and other_AE. An ensemble of these 2 models achieved the highest exact-match microaveraged F1 score, 0.6802, and the second-highest lenient-match microaveraged F1 score, 0.8078, among peer models.
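The microaveraged F1 used to rank the models can be sketched in a few lines: true positives, false positives, and false negatives are pooled across all entity types before computing precision and recall, so frequent types weigh more than rare ones. The counts below are toy values, not the study's.

```python
# Micro-averaged F1 over per-entity-type (tp, fp, fn) counts.
def micro_f1(counts):
    """counts: {entity_type: (tp, fp, fn)} -> pooled F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

counts = {"nervous_AE": (80, 20, 10), "procedure": (40, 10, 20)}  # toy counts
score = micro_f1(counts)  # pooled: tp=120, fp=30, fn=30
```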


Subjects
Deep Learning, Guillain-Barre Syndrome, Influenza Vaccines, Adverse Drug Reaction Reporting Systems, Computer Systems, Humans, Influenza Vaccines/adverse effects, United States
9.
J Am Med Inform Assoc ; 28(6): 1275-1283, 2021 06 12.
Article in English | MEDLINE | ID: mdl-33674830

ABSTRACT

The COVID-19 pandemic swept across the world rapidly, infecting millions of people. An efficient tool that can accurately recognize important clinical concepts of COVID-19 from free text in electronic health records (EHRs) will be valuable to accelerate COVID-19 clinical research. To this end, this study aims at adapting the existing CLAMP natural language processing tool to quickly build COVID-19 SignSym, which can extract COVID-19 signs/symptoms and their 8 attributes (body location, severity, temporal expression, subject, condition, uncertainty, negation, and course) from clinical text. The extracted information is also mapped to standard concepts in the Observational Medical Outcomes Partnership common data model. A hybrid approach of combining deep learning-based models, curated lexicons, and pattern-based rules was applied to quickly build the COVID-19 SignSym from CLAMP, with optimized performance. Our extensive evaluation using 3 external sites with clinical notes of COVID-19 patients, as well as the online medical dialogues of COVID-19, shows COVID-19 SignSym can achieve high performance across data sources. The workflow used for this study can be generalized to other use cases, where existing clinical natural language processing tools need to be customized for specific information needs within a short time. COVID-19 SignSym is freely accessible to the research community as a downloadable package (https://clamp.uth.edu/covid/nlp.php) and has been used by 16 healthcare organizations to support clinical research of COVID-19.
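The pattern-based side of a hybrid extractor like the one described, a symptom lexicon plus rules for attributes such as negation, can be sketched as follows. The lexicon, negation cues, and three-token window are invented for illustration and are far simpler than the actual tool.

```python
# Extract symptom mentions from clinical text and flag negated ones.
import re

LEXICON = {"fever", "cough", "dyspnea", "fatigue"}      # hypothetical lexicon
NEGATION_CUES = {"no", "denies", "without", "negative"}  # hypothetical cues

def extract(text):
    """Return (symptom, negated) pairs; negation checked in a 3-token window."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            negated = any(t in NEGATION_CUES for t in tokens[max(0, i - 3):i])
            found.append((tok, negated))
    return found

result = extract("Patient reports dry cough but denies fever.")
```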


Subjects
COVID-19/diagnosis, Electronic Health Records, Information Storage and Retrieval/methods, Natural Language Processing, Deep Learning, Humans, Symptom Assessment/methods
10.
ArXiv ; 2020 Jul 13.
Article in English | MEDLINE | ID: mdl-32908948

ABSTRACT

The COVID-19 pandemic swept across the world rapidly, infecting millions of people. An efficient tool that can accurately recognize important clinical concepts of COVID-19 from free text in electronic health records (EHRs) will be valuable to accelerate COVID-19 clinical research. To this end, this study aims at adapting the existing CLAMP natural language processing tool to quickly build COVID-19 SignSym, which can extract COVID-19 signs/symptoms and their 8 attributes (body location, severity, temporal expression, subject, condition, uncertainty, negation, and course) from clinical text. The extracted information is also mapped to standard concepts in the Observational Medical Outcomes Partnership common data model. A hybrid approach of combining deep learning-based models, curated lexicons, and pattern-based rules was applied to quickly build the COVID-19 SignSym from CLAMP, with optimized performance. Our extensive evaluation using 3 external sites with clinical notes of COVID-19 patients, as well as the online medical dialogues of COVID-19, shows COVID-19 SignSym can achieve high performance across data sources. The workflow used for this study can be generalized to other use cases, where existing clinical natural language processing tools need to be customized for specific information needs within a short time. COVID-19 SignSym is freely accessible to the research community as a downloadable package (https://clamp.uth.edu/covid/nlp.php) and has been used by 16 healthcare organizations to support clinical research of COVID-19.
