Search | BVS Regional Portal (test)

Personalised health education against health damage of COVID-19 epidemic in the elderly Hungarian population (PROACTIVE-19): protocol of an adaptive randomised controlled clinical trial.

Eross, Bálint; Molnár, Zsolt; Szakács, Zsolt; Zádori, Noémi; Szakó, Lajos; Váncsa, Szilárd; Juhász, Márk Félix; Ocskay, Klementina; Vörhendi, Nóra; Márta, Katalin; Szentesi, Andrea; Párniczky, Andrea; Hegyi, Péter J; Kiss, Szabolcs; Földi, Mária; Dembrovszky, Fanni; Kanjo, Anna; Pázmány, Piroska; Varró, András; Csathó, Árpád; Helyes, Zsuzsanna; Péterfi, Zoltán; Czopf, László; Kiss, István; Zemplényi, Antal; Czapári, Dóra; Hegyi, Eszter; Dobszai, Dalma; Miklós, Emoke; Márta, Attila; Tóth, Dominika; Farkas, Richard; Farkas, Nelli; Birkás, Béla; Pintér, Erika; Petho, Gábor; Zsigmond, Borbála; Sárközi, Andrea; Nagy, Anikó; Hegyi, Péter.

Trials ; 21(1): 809, 2020 Sep 29.

Article in English | MEDLINE | ID: mdl-32993779

ABSTRACT

BACKGROUND: Early reports indicate that COVID-19 may require intensive care unit (ICU) admission in 5-26% and overall mortality can rise to 11% of the recognised cases, particularly affecting the elderly. There is a lack of evidence-based targeted pharmacological therapy for its prevention and treatment. We aim to compare the effects of a World Health Organization recommendation-based education and a personalised complex preventive lifestyle intervention package (based on the same WHO recommendation) on the outcomes of COVID-19. METHODS: PROACTIVE-19 is a pragmatic, randomised controlled clinical trial with an adaptive "sample size re-estimation" design. The Hungarian population over the age of 60 years without confirmed COVID-19 will be approached to participate voluntarily in a telephone health assessment and lifestyle counselling. Volunteers will be randomised into two groups: (A) general health education and (B) personalised health education. Participants will go through questioning and recommendation in 5 fields: (1) mental health, (2) smoking habits, (3) physical activity, (4) dietary habits, and (5) alcohol consumption. Both groups A and B will receive the same line of questioning to assess habits concerning these topics. Assessment will be done weekly during the first month, every second week in the second month, then monthly. The composite primary endpoint will include the rate of ICU admission, hospital admission (longer than 48 h), and mortality in COVID-19-positive cases. The estimated sample size is 3788 subjects per study arm. The planned duration of the follow-up is a minimum of 1 year. DISCUSSION: These interventions may boost the body's cardiovascular and pulmonary reserve capacities, leading to improved resistance against the damage caused by COVID-19. Consequently, lifestyle changes can reduce the incidence of life-threatening conditions and attenuate the detrimental effects of the pandemic seriously affecting the older population.
TRIAL REGISTRATION: The study has been approved by the Scientific and Research Ethics Committee of the Hungarian Medical Research Council (IV/2428-2/2020/EKU) and has been registered at clinicaltrials.gov (NCT04321928) on 25 March 2020.

Subjects

Betacoronavirus/pathogenicity; Coronavirus Infections/prevention & control; Health Education; Health Knowledge, Attitudes, Practice; Pandemics/prevention & control; Pneumonia, Viral/prevention & control; Risk Reduction Behavior; Adaptive Clinical Trials as Topic; Age Factors; Aged; Aged, 80 and over; Alcohol Drinking/adverse effects; COVID-19; Coronavirus Infections/diagnosis; Coronavirus Infections/mortality; Coronavirus Infections/virology; Exercise; Feeding Behavior; Female; Health Status; Host-Pathogen Interactions; Humans; Hungary; Male; Mental Health; Middle Aged; Pneumonia, Viral/diagnosis; Pneumonia, Viral/mortality; Pneumonia, Viral/virology; Pragmatic Clinical Trials as Topic; Protective Factors; Risk Assessment; Risk Factors; SARS-CoV-2; Smoking/adverse effects

Linguistic scope-based and biological event-based speculation and negation annotations in the BioScope and Genia Event corpora.

Vincze, Veronika; Szarvas, György; Móra, György; Ohta, Tomoko; Farkas, Richárd.

J Biomed Semantics ; 2 Suppl 5: S8, 2011 Oct 06.

Article in English | MEDLINE | ID: mdl-22166355

ABSTRACT

BACKGROUND: The treatment of negation and hedging in natural language processing has received much interest recently, especially in the biomedical domain. However, open access corpora annotated for negation and/or speculation are hardly available for training and testing applications, and even if they are, they sometimes follow different design principles. In this paper, the annotation principles of the two largest corpora containing annotation for negation and speculation - BioScope and Genia Event - are compared. BioScope marks linguistic cues and their scopes for negation and hedging while in Genia biological events are marked for uncertainty and/or negation. RESULTS: Differences among the annotations of the two corpora are thematically categorized and the frequency of each category is estimated. We found that the largest amount of differences is due to the issue that scopes - which cover text spans - deal with the key events and each argument (including events within events) of these events is under the scope as well. In contrast, Genia deals with the modality of events within events independently. CONCLUSIONS: The analysis of multiple layers of annotation (linguistic scopes and biological events) showed that the detection of negation/hedge keywords and their scopes can contribute to determining the modality of key events (denoted by the main predicate). On the other hand, for the detection of the negation and speculation status of events within events, additional syntax-based rules investigating the dependency path between the modality cue and the event cue have to be employed.

Assessment of NER solutions against the first and second CALBC Silver Standard Corpus.

Rebholz-Schuhmann, Dietrich; Jimeno Yepes, Antonio; Li, Chen; Kafkas, Senay; Lewin, Ian; Kang, Ning; Corbett, Peter; Milward, David; Buyko, Ekaterina; Beisswanger, Elena; Hornbostel, Kerstin; Kouznetsov, Alexandre; Witte, René; Laurila, Jonas B; Baker, Christopher Jo; Kuo, Cheng-Ju; Clematide, Simone; Rinaldi, Fabio; Farkas, Richárd; Móra, György; Hara, Kazuo; Furlong, Laura I; Rautschka, Michael; Neves, Mariana Lara; Pascual-Montano, Alberto; Wei, Qi; Collier, Nigel; Chowdhury, Md Faisal Mahbub; Lavelli, Alberto; Berlanga, Rafael; Morante, Roser; Van Asch, Vincent; Daelemans, Walter; Marina, José Luís; van Mulligen, Erik; Kors, Jan; Hahn, Udo.

J Biomed Semantics ; 2 Suppl 5: S11, 2011 Oct 06.

Article in English | MEDLINE | ID: mdl-22166494

ABSTRACT

BACKGROUND: Competitions in text mining have been used to measure the performance of automatic text processing solutions against a manually annotated gold standard corpus (GSC). The preparation of the GSC is time-consuming and costly and the final corpus consists of at most a few thousand documents annotated with a limited set of semantic groups. To overcome these shortcomings, the CALBC project partners (PPs) have produced a large-scale annotated biomedical corpus with four different semantic groups through the harmonisation of annotations from automatic text mining solutions, the first version of the Silver Standard Corpus (SSC-I). The four semantic groups are chemical entities and drugs (CHED), genes and proteins (PRGE), diseases and disorders (DISO) and species (SPE). This corpus has been used for the First CALBC Challenge asking the participants to annotate the corpus with their text processing solutions. RESULTS: All four PPs from the CALBC project and, in addition, 12 challenge participants (CPs) contributed annotated data sets for an evaluation against the SSC-I. CPs could ignore the training data and deliver the annotations from their genuine annotation system, or could train a machine-learning approach on the provided pre-annotated data. In general, the performances of the annotation solutions were lower for entities from the categories CHED and PRGE in comparison to the identification of entities categorized as DISO and SPE. The best performance over all semantic groups was achieved by two annotation solutions that had been trained on the SSC-I. The data sets from participants were used to generate the harmonised Silver Standard Corpus II (SSC-II), if the participant did not make use of the annotated data set from the SSC-I for training purposes. The performances of the participants' solutions were again measured against the SSC-II. The performances of the annotation solutions again showed better results for DISO and SPE in comparison to CHED and PRGE. CONCLUSIONS: The SSC-I delivers a large set of annotations (1,121,705) for a large number of documents (100,000 Medline abstracts). The annotations cover four different semantic groups and are sufficiently homogeneous to be reproduced with a trained classifier leading to an average F-measure of 85%. Benchmarking the annotation solutions against the SSC-II leads to better performance for the CPs' annotation solutions in comparison to the SSC-I.

Semi-automated construction of decision rules to predict morbidities from clinical texts.

Farkas, Richárd; Szarvas, György; Hegedus, István; Almási, Attila; Vincze, Veronika; Ormándi, Róbert; Busa-Fekete, Róbert.

J Am Med Inform Assoc ; 16(4): 601-5, 2009.

Article in English | MEDLINE | ID: mdl-19390097

ABSTRACT

OBJECTIVE: In this study the authors describe the system submitted by the team of University of Szeged to the second i2b2 Challenge in Natural Language Processing for Clinical Data. The challenge focused on the development of automatic systems that analyzed clinical discharge summary texts and addressed the following question: "Who's obese and what co-morbidities do they (definitely/most likely) have?". Target diseases included obesity and its 15 most frequent comorbidities exhibited by patients, while the target labels corresponded to expert judgments based on textual evidence and intuition (separately). DESIGN: The authors applied statistical methods to preselect the most common and confident terms and evaluated outlier documents by hand to discover infrequent spelling variants. The authors expected a system with dictionaries gathered semi-automatically to have a good performance with moderate development costs (the authors examined just a small proportion of the records manually). MEASUREMENTS: Following the standard evaluation method of the second Workshop on Challenges in Natural Language Processing for Clinical Data, the authors used both macro- and microaveraged F(beta=1) measures for evaluation. RESULTS: The authors' submission achieved a microaverage F(beta=1) score of 97.29% for classification based on textual evidence (macroaverage F(beta=1) = 76.22%) and 96.42% for intuitive judgments (macroaverage F(beta=1) = 67.27%). CONCLUSIONS: The results demonstrate the feasibility of the authors' approach and show that even very simple systems with a shallow linguistic analysis can achieve remarkable accuracy scores for classifying clinical records on a limited set of concepts.

Subjects

Information Storage and Retrieval/methods; Medical Records Systems, Computerized; Natural Language Processing; Obesity; Comorbidity; Humans; Patient Discharge; Statistics as Topic
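
The gap between the micro- and macroaveraged F(beta=1) scores reported in this abstract is easier to read with a small worked example. The sketch below uses the common textbook definitions (micro: pool the counts over all classes, then compute F1; macro: compute F1 per class, then take the unweighted mean). The challenge's exact averaging procedure is not given in the abstract and may differ, and the class names and counts are hypothetical.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1(tp: int, fp: int, fn: int) -> float:
    """F(beta=1) from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts: Dict[str, Counts]) -> Tuple[float, float]:
    """Return (micro F1, macro F1) over per-class counts."""
    # Micro: pool the counts of every class, so frequent classes dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    # Macro: every class contributes equally, however rare it is.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Hypothetical counts: one frequent, well-handled class and one rare, poorly handled class.
example_counts = {"frequent": (90, 5, 5), "rare": (1, 1, 3)}
micro, macro = micro_macro_f1(example_counts)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

Because rare classes weigh as much as frequent ones in the macro average, weak performance on a few infrequent labels can pull the macro score well below the micro score.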

The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes.

Vincze, Veronika; Szarvas, György; Farkas, Richárd; Móra, György; Csirik, János.

BMC Bioinformatics ; 9 Suppl 11: S9, 2008 Nov 19.

Article in English | MEDLINE | ID: mdl-19025695

ABSTRACT

BACKGROUND: Detecting uncertain and negative assertions is essential in most BioMedical Text Mining tasks where, in general, the aim is to derive factual knowledge from textual data. This article reports on a corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts (we call this corpus the BioScope corpus). RESULTS: The corpus consists of three parts, namely medical free texts, biological full papers and biological scientific abstracts. The dataset contains annotations at the token level for negative and speculative keywords and at the sentence level for their linguistic scope. The annotation process was carried out by two independent linguist annotators and a chief linguist, who was also responsible for setting up the annotation guidelines and who resolved cases where the annotators disagreed. The resulting corpus consists of more than 20,000 sentences that were considered for annotation, and over 10% of them actually contain one (or more) linguistic annotation suggesting negation or uncertainty. CONCLUSION: Statistics are reported on corpus size, ambiguity levels and the consistency of annotations. The corpus is accessible for academic purposes and is free of charge. Apart from the intended goal of serving as a common resource for the training, testing and comparing of biomedical Natural Language Processing systems, the corpus is also a good resource for the linguistic analysis of scientific and clinical texts.

Subjects

Abstracting and Indexing/methods; Databases, Bibliographic; Information Storage and Retrieval/methods; Artificial Intelligence; Database Management Systems; Natural Language Processing; Vocabulary, Controlled

Automatic construction of rule-based ICD-9-CM coding systems.

Farkas, Richárd; Szarvas, György.

BMC Bioinformatics ; 9 Suppl 3: S10, 2008 Apr 11.

Article in English | MEDLINE | ID: mdl-18426545

ABSTRACT

BACKGROUND: In this paper we focus on the problem of automatically constructing ICD-9-CM coding systems for radiology reports. ICD-9-CM codes are used for billing purposes by health institutes and are assigned to clinical records manually following clinical treatment. Since this labeling task requires expert knowledge in the field of medicine, the process itself is costly and is prone to errors as human annotators have to consider thousands of possible codes when assigning the right ICD-9-CM labels to a document. In this study we use the datasets made available for training and testing automated ICD-9-CM coding systems by the organisers of an International Challenge on Classifying Clinical Free Text Using Natural Language Processing in spring 2007. The challenge itself was dominated by entirely or partly rule-based systems that solve the coding task using a set of hand crafted expert rules. Since the feasibility of the construction of such systems for thousands of ICD codes is indeed questionable, we decided to examine the problem of automatically constructing similar rule sets that turned out to achieve a remarkable accuracy in the shared task challenge. RESULTS: Our results are very promising in the sense that we managed to achieve comparable results with purely hand-crafted ICD-9-CM classifiers. Our best model got a 90.26% F measure on the training dataset and an 88.93% F measure on the challenge test dataset, using the micro-averaged F beta=1 measure, the official evaluation metric of the International Challenge on Classifying Clinical Free Text Using Natural Language Processing. This result would have placed second in the challenge, with a hand-crafted system achieving slightly better results. CONCLUSIONS: Our results demonstrate that hand-crafted systems - which proved to be successful in ICD-9-CM coding - can be reproduced by replacing several laborious steps in their construction with machine learning models. 
These hybrid systems preserve the favourable aspects of rule-based classifiers, such as good performance, while their development is rapid and requires less human effort. Hence the construction of such hybrid systems can be feasible for a set of labels one order of magnitude larger, and with more labeled data.

Subjects

Algorithms; Artificial Intelligence; Decision Support Systems, Clinical; International Classification of Diseases; Natural Language Processing; Pattern Recognition, Automated/methods; Radiology/methods; Terminology as Topic; Vocabulary, Controlled

The strength of co-authorship in gene name disambiguation.

Farkas, Richárd.

BMC Bioinformatics ; 9: 69, 2008 Jan 29.

Article in English | MEDLINE | ID: mdl-18230174

ABSTRACT

BACKGROUND: A biomedical entity mention in articles and other free texts is often ambiguous. For example, 13% of the gene names (aliases) might refer to more than one gene. The task of Gene Symbol Disambiguation (GSD) - a special case of Word Sense Disambiguation (WSD) - is to assign a unique gene identifier for all identified gene name aliases in biology-related articles. Supervised and unsupervised machine learning WSD techniques have been applied in the biomedical field with promising results. We examine here the utilisation potential of the fact - one of the special features of biological articles - that the authors of the documents are known through graph-based semi-supervised methods for the GSD task. RESULTS: Our key hypothesis is that a biologist refers to each particular gene by a fixed gene alias and this holds for the co-authors as well. To make use of the co-authorship information we decided to build the inverse co-author graph on MedLine abstracts. The nodes of the inverse co-author graph are articles and there is an edge between two nodes if and only if the two articles have a mutual author. We introduce here two methods using distances (based on the graph) of abstracts for the GSD task. We found that a disambiguation decision can be made in 85% of cases with an extremely high (99.5%) precision rate just by using information obtained from the inverse co-author graph. We incorporated the co-authorship information into two GSD systems in order to attain full coverage and in experiments our procedure achieved precision of 94.3%, 98.85%, 96.05% and 99.63% on the human, mouse, fly and yeast GSD evaluation sets, respectively. 
CONCLUSION: Based on the promising results obtained so far we suggest that the co-authorship information and the circumstances of the articles' release (like the title of the journal, the year of publication) can be a crucial building block of any sophisticated similarity measure among biological articles and hence the methods introduced here should be useful for other biomedical natural language processing tasks (like organism or target disease detection) as well.

Subjects

Authorship; Classification/methods; Genes/genetics; Natural Language Processing; Databases, Genetic; Decision Trees; MEDLINE; Semantics
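
The inverse co-author graph described in this abstract (nodes are articles; two articles are linked if and only if they share an author) can be sketched in a few lines. The abstract does not specify the two distance-based methods the paper introduces, so the breadth-first shortest-path distance below is only an assumed stand-in, and the article identifiers and author names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations
from typing import Dict, Optional, Set


def build_inverse_coauthor_graph(article_authors: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Nodes are articles; an edge joins two articles iff they share at least one author."""
    articles_by_author = defaultdict(set)
    for article, authors in article_authors.items():
        for author in authors:
            articles_by_author[author].add(article)

    graph: Dict[str, Set[str]] = {article: set() for article in article_authors}
    for articles in articles_by_author.values():
        # Every pair of articles written by the same author becomes an edge.
        for a, b in combinations(sorted(articles), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def graph_distance(graph: Dict[str, Set[str]], source: str, target: str) -> Optional[int]:
    """Shortest path length in edges (breadth-first search); None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical toy corpus: article id -> set of author names.
toy_corpus = {
    "A1": {"Smith", "Lee"},
    "A2": {"Lee", "Kim"},
    "A3": {"Kim"},
    "A4": {"Nagy"},
}
toy_graph = build_inverse_coauthor_graph(toy_corpus)
print(graph_distance(toy_graph, "A1", "A3"))
print(graph_distance(toy_graph, "A1", "A4"))
```

Under the abstract's key hypothesis, an ambiguous gene alias in one article could then be resolved by looking at how nearby articles in this graph use the same alias; how that vote is taken is not stated in the abstract and is left out here.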

State-of-the-art anonymization of medical records using an iterative machine learning framework.

Szarvas, György; Farkas, Richárd; Busa-Fekete, Róbert.

J Am Med Inform Assoc ; 14(5): 574-80, 2007.

Article in English | MEDLINE | ID: mdl-17823086

ABSTRACT

OBJECTIVE: The anonymization of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN: We introduce here a novel, machine learning-based iterative Named Entity Recognition approach intended for use on semi-structured documents like discharge records. Our method identifies PHI in several steps. First, it labels all entities whose tags can be inferred from the structure of the text, and it then utilizes this information to find further PHI phrases in the running-text parts of the document. MEASUREMENTS: Following the standard evaluation method of the first Workshop on Challenges in Natural Language Processing for Clinical Data, we used token-level Precision, Recall and F(beta=1) measure metrics for evaluation. RESULTS: Our system achieved outstanding accuracy on the standard evaluation dataset of the de-identification challenge, with an F measure of 99.7534% for the best submitted model. CONCLUSION: Our system is competitive with the current state-of-the-art solutions, and we describe here several techniques that can be beneficial in other tasks that need to handle structured documents such as clinical records.

Subjects

Artificial Intelligence; Confidentiality; Medical Records Systems, Computerized; Evaluation Studies as Topic; Humans
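
Token-level Precision, Recall and F(beta=1), the metrics named in this abstract, can be illustrated as follows. This is a minimal sketch assuming one label per token and an "O" label for non-PHI tokens; the challenge's official scorer is not described in the abstract, and the label names and sequences are hypothetical.

```python
from typing import Sequence, Tuple


def token_prf(gold: Sequence[str], pred: Sequence[str], outside: str = "O") -> Tuple[float, float, float]:
    """Token-level precision, recall and F1 over all tokens not labelled `outside`."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must label the same tokens")
    # A token counts as correct only if it is PHI and its predicted tag matches exactly.
    true_pos = sum(1 for g, p in zip(gold, pred) if g == p and g != outside)
    predicted = sum(1 for p in pred if p != outside)
    actual = sum(1 for g in gold if g != outside)
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical six-token record: one missed PATIENT token and one spurious DATE token.
gold_tags = ["O", "PATIENT", "PATIENT", "O", "DATE", "O"]
pred_tags = ["O", "PATIENT", "O", "O", "DATE", "DATE"]
precision, recall, f_score = token_prf(gold_tags, pred_tags)
print(f"P = {precision:.3f}, R = {recall:.3f}, F1 = {f_score:.3f}")
```

Scoring per token rather than per entity gives partial credit for a multi-token PHI phrase that is only partly recognised.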
