Results 1 - 20 of 93
1.
Heliyon ; 10(7): e28560, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38590890

ABSTRACT

Single Sign-On (SSO) methods are the primary solution for authenticating users across multiple web systems. These mechanisms streamline the authentication procedure by avoiding duplicate development of authentication modules for each application. They also provide convenience to the end-user by keeping the user authenticated when switching between different contexts. To ensure this cross-application authentication, SSO relies on an Identity Provider (IdP), which is commonly set up and managed by each institution that needs to enforce SSO internally. However, the solution is not so straightforward when several institutions need to cooperate in a single ecosystem. This could be tackled by centralizing the authentication mechanisms in one of the involved entities, a solution that raises responsibilities that may be difficult for peers to accept. Moreover, this solution is not appropriate for dynamic groups, where peers may join or leave frequently. In this paper, we propose an architecture that uses a trusted third-party service to authenticate multiple entities while ensuring the isolation of the user's attributes between this service and the institutional SSO systems. This architecture was validated in the EHDEN Portal, which includes the web tools and services of this European health project, to establish a federated authentication schema.
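The attribute-isolation idea above can be sketched as a signed assertion that carries only a pseudonymous subject identifier, so the trusted third party never learns institutional attributes. This is an illustrative sketch only, not the EHDEN implementation: the shared HMAC secret, field names, and token format are all invented (a real deployment would use asymmetric keys and a standard such as SAML or OpenID Connect).

```python
import base64
import hashlib
import hmac
import json

# Shared secret between the trusted third-party authenticator and one
# relying institution (a real deployment would use an asymmetric key pair).
SECRET = b"demo-secret"

def issue_token(subject_pseudonym: str) -> str:
    """Trusted third party: sign a minimal assertion.

    Only a pseudonymous subject id is included, so the user's
    institutional attributes never leave the local SSO system.
    """
    payload = base64.urlsafe_b64encode(
        json.dumps({"sub": subject_pseudonym}).encode()
    )
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify_token(token: str):
    """Relying institution: check the signature and return the claims."""
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # reject tampered assertions
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token("user-42")
claims = verify_token(token)
```

The relying institution only ever sees the pseudonym, which is the crux of keeping user attributes isolated from the central service.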

2.
BMC Med Inform Decis Mak ; 24(1): 65, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38443881

ABSTRACT

BACKGROUND: Multimodal histology image registration is a process that transforms two or more images obtained from different microscopy modalities into a common coordinate system. The combination of information from various modalities can contribute to a comprehensive understanding of tissue specimens, aiding in more accurate diagnoses and improved research insights. Multimodal image registration in histology samples presents a significant challenge due to the inherent differences in characteristics and the need for tailored optimization algorithms for each modality. RESULTS: We developed MMIR, a cloud-based system for multimodal histological image registration, which consists of three main modules: a project manager, an algorithm manager, and an image visualization system. CONCLUSION: Our software solution aims to simplify image registration tasks with a user-friendly approach. It provides effective algorithm management and responsive web interfaces, supports multi-resolution images, and facilitates batch image registration. Moreover, its adaptable architecture allows for the integration of custom algorithms, ensuring that it aligns with the specific requirements of each modality combination. Beyond image registration, our software enables the conversion of segmented annotations from one modality to another.


Subjects
Algorithms , Software , Humans
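The annotation-conversion capability mentioned at the end of the abstract amounts to applying the transform estimated by registration to annotation geometry. A minimal sketch, assuming a 2x3 affine matrix as the registration output (the matrix values and API here are illustrative, not MMIR's actual interface):

```python
def apply_affine(points, m):
    """Map polygon vertices with a 2x3 affine matrix
    [[a, b, tx], [c, d, ty]] estimated by a registration algorithm."""
    (a, b, tx), (c, d, ty) = m
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]

# Hypothetical transform: 2x scale plus a (10, 5) translation, as could
# be produced by registering modality A onto modality B.
affine = [(2.0, 0.0, 10.0), (0.0, 2.0, 5.0)]
annotation = [(0, 0), (100, 0), (100, 50)]  # segmented polygon in modality A
converted = apply_affine(annotation, affine)  # same polygon in modality B
```

Non-rigid registration would replace the affine map with a displacement field, but the principle of pushing annotation coordinates through the estimated transform is the same.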
3.
J Imaging Inform Med ; 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38485898

ABSTRACT

Deep learning techniques have recently yielded remarkable results across various fields. However, the quality of these results depends heavily on the quality and quantity of data used during the training phase. One common issue in multi-class and multi-label classification is class imbalance, where one or several classes make up a substantial portion of the total instances. This imbalance causes the neural network to prioritize features of the majority classes during training, as their detection leads to higher scores. In the context of object detection, two types of imbalance can be identified: (1) an imbalance between the space occupied by the foreground and background and (2) an imbalance in the number of instances for each class. This paper aims to address the second type of imbalance without exacerbating the first. To achieve this, we propose a modification of the copy-paste data augmentation technique, combined with weight-balancing methods in the loss function. This strategy was specifically tailored to improve the performance in datasets with a high instance density, where instance overlap could be detrimental. To validate our methodology, we applied it to a highly unbalanced dataset focused on nuclei detection. The results show that this hybrid approach improves the classification of minority classes without significantly compromising the performance of majority classes.
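The two ingredients above, inverse-frequency loss weighting and overlap-aware copy-paste, can be sketched as follows. This is a toy illustration under invented class counts and box coordinates, not the paper's exact augmentation scheme:

```python
def class_weights(counts):
    """Inverse-frequency weights for the loss, normalised so the
    majority class keeps weight 1.0."""
    m = max(counts.values())
    return {c: m / n for c, n in counts.items()}

def boxes_overlap(a, b):
    """Axis-aligned overlap test for (x0, y0, x1, y1) boxes."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def copy_paste(instances, minority, offset=(60, 0)):
    """Paste shifted copies of minority-class boxes, skipping any copy
    that would overlap an existing instance (harmful in dense images)."""
    out = list(instances)
    dx, dy = offset
    for cls, box in instances:
        if cls != minority:
            continue
        new = (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)
        if not any(boxes_overlap(new, b) for _, b in out):
            out.append((cls, new))
    return out

# Invented instance counts for a nuclei-detection-style dataset.
counts = {"epithelial": 900, "lymphocyte": 300, "mitosis": 30}
weights = class_weights(counts)
augmented = copy_paste(
    [("mitosis", (0, 0, 20, 20)), ("epithelial", (30, 0, 50, 20))],
    minority="mitosis",
)
```

The overlap check is the part tailored to high instance density: copies that would collide with existing nuclei are simply dropped rather than exacerbating the foreground/background imbalance.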

4.
Biomedicines ; 11(8)2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37626628

ABSTRACT

Heart failure with preserved ejection fraction (HFpEF) represents a global health challenge, with limited therapies proven to enhance patient outcomes. This makes the elucidation of disease mechanisms and the identification of novel potential therapeutic targets a priority. Here, we performed RNA sequencing on ventricular myocardial biopsies from patients with HFpEF, aiming to discover distinctive transcriptomic signatures. A total of 306 differentially expressed mRNAs (DEGs) and 152 differentially expressed microRNAs (DEMs) were identified and enriched in several biological processes involved in HF. Moreover, by integrating mRNA and microRNA expression data, we identified five potentially novel miRNA-mRNA relationships in HFpEF: the upregulated hsa-miR-25-3p, hsa-miR-26a-5p, and hsa-miR-4429, targeting HAPLN1; and NPPB mRNA, targeted by hsa-miR-26a-5p and miR-140-3p. Exploring the predicted miRNA-mRNA interactions experimentally, we demonstrated that overexpression of the distinct miRNAs leads to the downregulation of their target genes. Interestingly, we also observed that microRNA signatures display a higher discriminative power than mRNA signatures in distinguishing HFpEF sub-groups. Our results offer new mechanistic clues, which can potentially translate into new HFpEF therapies.
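One common rule for this kind of miRNA-mRNA integration is anti-correlation filtering: keep pairs where an upregulated miRNA has a predicted target that is downregulated. A sketch under that assumption, with toy fold-change values (the numbers below are invented, not the study's measurements):

```python
def candidate_pairs(mirna_fc, mrna_fc, predicted_targets):
    """Keep (miRNA, mRNA) pairs whose expression changes are
    anti-correlated: miRNA up (log2FC > 0), predicted target down."""
    pairs = []
    for mir, targets in predicted_targets.items():
        for gene in targets:
            if mirna_fc.get(mir, 0) > 0 and mrna_fc.get(gene, 0) < 0:
                pairs.append((mir, gene))
    return pairs

# Toy log2 fold changes; only the up-miRNA / down-target pair survives.
mirna_fc = {"hsa-miR-25-3p": 1.8, "hsa-miR-99a": -0.7}
mrna_fc = {"HAPLN1": -2.1, "NPPB": 0.9}
targets = {"hsa-miR-25-3p": ["HAPLN1", "NPPB"], "hsa-miR-99a": ["HAPLN1"]}
pairs = candidate_pairs(mirna_fc, mrna_fc, targets)
```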

5.
Comput Biol Med ; 159: 106867, 2023 06.
Article in English | MEDLINE | ID: mdl-37060770

ABSTRACT

A vast number of microarray datasets have been produced as a way to identify differentially expressed genes and gene expression signatures. A better understanding of these biological processes can help in the diagnosis and prognosis of diseases, as well as in the therapeutic response to drugs. However, most of the available datasets are composed of a reduced number of samples, leading to low statistical, predictive and generalization power. One way to overcome this problem is by merging several microarray datasets into a single dataset, which is typically a challenging task. Statistical methods or supervised machine learning algorithms are usually used to determine gene expression signatures. Nevertheless, statistical methods require an arbitrary threshold to be defined, and supervised machine learning methods can be ineffective when applied to high-dimensional datasets like microarrays. We propose a methodology to identify gene expression signatures by merging microarray datasets. This methodology uses statistical methods to obtain several sets of differentially expressed genes and uses supervised machine learning algorithms to select the gene expression signature. This methodology was validated using two distinct research applications: one using heart failure and the other using autism spectrum disorder microarray datasets. For the first, we obtained a gene expression signature composed of 117 genes, with a classification accuracy of approximately 98%. For the second use case, we obtained a gene expression signature composed of 79 genes, with a classification accuracy of approximately 82%. This methodology was implemented in R language and is available, under the MIT licence, at https://github.com/bioinformatics-ua/MicroGES.


Subjects
Autism Spectrum Disorder , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Transcriptome , Algorithms
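The first stage of the methodology above, producing several candidate gene sets from a grid of statistical cutoffs before a classifier picks the winner, can be sketched like this. The gene names and statistics are invented for illustration; the actual implementation is the R package MicroGES linked in the abstract:

```python
def deg_sets(stats, p_cutoffs, fc_cutoffs):
    """One candidate gene set per (adjusted-p, |log2FC|) cutoff pair.
    A downstream supervised classifier then selects the set that
    yields the best classification accuracy as the signature."""
    sets = {}
    for p in p_cutoffs:
        for fc in fc_cutoffs:
            sets[(p, fc)] = {
                g for g, (adj_p, log2fc) in stats.items()
                if adj_p <= p and abs(log2fc) >= fc
            }
    return sets

stats = {  # gene -> (adjusted p-value, log2 fold change); toy values
    "NPPA": (0.001, 2.5),
    "MYH7": (0.010, 1.2),
    "ACTB": (0.600, 0.1),
}
sets = deg_sets(stats, p_cutoffs=[0.01, 0.05], fc_cutoffs=[1.0, 2.0])
```

Enumerating the grid sidesteps the arbitrary-threshold problem: no single cutoff has to be defended, because the supervised step arbitrates between the candidate sets empirically.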
6.
J Integr Bioinform ; 20(2)2023 Jun 01.
Article in English | MEDLINE | ID: mdl-36880517

ABSTRACT

Nicotinamide adenine dinucleotide (NAD) levels are essential for the normal physiology of the cell and are strictly regulated to prevent pathological conditions. NAD functions as a coenzyme in redox reactions, as a substrate of regulatory proteins, and as a mediator of protein-protein interactions. The main objectives of this study were to identify the NAD-binding and NAD-interacting proteins, and to uncover novel proteins and functions that could be regulated by this metabolite. We also considered whether cancer-associated proteins were potential therapeutic targets. Using multiple experimental databases, we defined datasets of proteins that directly interact with NAD - the NAD-binding proteins (NADBPs) dataset - and of proteins that interact with NADBPs - the NAD-protein-protein interactions (NAD-PPIs) dataset. Pathway enrichment analysis revealed that NADBPs participate in several metabolic pathways, while NAD-PPIs are mostly involved in signalling pathways. These include disease-related pathways, namely three major neurodegenerative disorders: Alzheimer's disease, Huntington's disease, and Parkinson's disease. The complete human proteome was then further analysed to select potential NADBPs. TRPC3 and isoforms of diacylglycerol (DAG) kinases, which are involved in calcium signalling, were identified as new NADBPs. Finally, we identified potential NAD-interacting therapeutic targets with regulatory and signalling functions in cancer and neurodegenerative diseases.


Subjects
Neoplasms , Neurodegenerative Diseases , Humans , NAD/metabolism , NAD/therapeutic use , Oxidation-Reduction , Signal Transduction , Neurodegenerative Diseases/drug therapy , Neurodegenerative Diseases/metabolism
7.
J Biomed Inform ; 137: 104272, 2023 01.
Article in English | MEDLINE | ID: mdl-36563828

ABSTRACT

BACKGROUND: Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for performing a successful observational study is defining the research question and the approach in advance of executing the study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset. METHODS: We propose a strategy for retrieving biomedical datasets of interest that were semantically annotated, using an interface built by applying a methodology for transforming natural language questions into formal language queries. The advantages of creating biomedical semantic data are enhanced by using natural language interfaces to issue complex queries without manipulating a logical query language. RESULTS: Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted data to a semantic information format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We have considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/. CONCLUSION: We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages.


Subjects
Natural Language Processing , Semantics , Software , Language , Databases, Factual
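The single-concept question type above can be sketched as a keyword-to-concept lookup that emits a SPARQL query over the annotated datasets. This is a deliberately naive sketch: the concept URIs and predicate below are invented placeholders, not the BioKBQA vocabulary, and the real system handles exclusion criteria and multi-concept questions as well:

```python
def question_to_sparql(question, concept_map):
    """Match known concept labels in a single-concept natural-language
    question and build a SPARQL query over the semantic endpoint."""
    q = question.lower()
    for label, uri in concept_map.items():
        if label in q:
            return (
                "SELECT ?dataset WHERE { "
                f"?dataset <http://example.org/hasAnnotation> <{uri}> . }}"
            )
    return None  # no known concept recognised in the question

# Illustrative label -> concept-URI map (URIs are placeholders).
concepts = {"alzheimer": "http://example.org/concept/AlzheimerDisease"}
sparql = question_to_sparql(
    "Which datasets include Alzheimer patients?", concepts
)
```

The point of the design is that the researcher only ever writes the natural-language question; the formal query language stays behind the interface.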
8.
Yearb Med Inform ; 31(1): 262-272, 2022 Aug.
Article in English | MEDLINE | ID: mdl-36463884

ABSTRACT

OBJECTIVES: Existing individual-level human data cover large populations on many dimensions such as lifestyle, demography, laboratory measures, clinical parameters, etc. Recent years have seen large investments in data catalogues to FAIRify data descriptions to capitalise on this great promise, i.e. make catalogue contents more Findable, Accessible, Interoperable and Reusable. However, their valuable diversity also created heterogeneity, which poses challenges to optimally exploit their richness. METHODS: In this opinion review, we analyse catalogues for human subject research ranging from cohort studies to surveillance, administrative and healthcare records. RESULTS: We observe that while these catalogues are heterogeneous, have various scopes, and use different terminologies, still the underlying concepts seem potentially harmonizable. We propose a unified framework to enable catalogue data sharing, with catalogues of multi-center cohorts nested as a special case in catalogues of real-world data sources. Moreover, we list recommendations to create an integrated community of metadata catalogues and an open catalogue ecosystem to sustain these efforts and maximise impact. CONCLUSIONS: We propose to embrace the autonomy of motivated catalogue teams and invest in their collaboration via minimal standardisation efforts such as clear data licensing, persistent identifiers for linking same records between catalogues, minimal metadata 'common data elements' using shared ontologies, symmetric architectures for data sharing (push/pull) with clear provenance tracks to process updates and acknowledge original contributors. And most importantly, we encourage the creation of environments for collaboration and resource sharing between catalogue developers, building on international networks such as OpenAIRE and research data alliance, as well as domain specific ESFRIs such as BBMRI and ELIXIR.


Subjects
Common Data Elements , Ecosystem , Humans , Cohort Studies , Information Dissemination
9.
Healthcare (Basel) ; 10(11)2022 Nov 15.
Article in English | MEDLINE | ID: mdl-36421611

ABSTRACT

Biomedical databases often have restricted access policies and governance rules. Thus, an adequate description of their content is essential for researchers who wish to use them for medical research. A strategy for publishing information without disclosing patient-level data is through database fingerprinting and aggregate characterisations. However, this information is still presented in a format that makes it challenging to search, analyse, and decide on the best databases for a domain of study. Several strategies allow one to visualise and compare the characteristics of multiple biomedical databases. Our study focused on a European platform for sharing and disseminating biomedical data. We use semantic data visualisation techniques to assist in comparing descriptive metadata from several databases. The great advantage lies in streamlining the database selection process, ensuring that sensitive details are not shared. To address this goal, we have considered two levels of data visualisation, one characterising a single database and the other involving multiple databases in network-level visualisations. This study revealed the impact of the proposed visualisations and some open challenges in representing semantically annotated biomedical datasets. Identifying future directions in this scope was one of the outcomes of this work.

10.
J Pathol Inform ; 13: 100103, 2022.
Article in English | MEDLINE | ID: mdl-36268075

ABSTRACT

At the end of the twentieth century, a new technology was developed that allowed an entire tissue section on a glass slide to be scanned. Originally called virtual microscopy, this technology is now known as Whole Slide Imaging (WSI). WSI presents new challenges for reading, visualization, storage, and analysis. For this reason, several technologies have been developed to facilitate the handling of these images. In this paper, we analyze the most widely used technologies in the field of digital pathology, ranging from specialized libraries for reading these images to complete platforms that allow reading, visualization, and analysis. Our aim is to provide the reader, whether a pathologist or a computational scientist, with the knowledge to choose the technologies to use for new studies, development, or research.

11.
Stud Health Technol Inform ; 298: 163-164, 2022 Aug 31.
Article in English | MEDLINE | ID: mdl-36073478

ABSTRACT

Anonymisation is currently one of the biggest challenges when sharing sensitive personal information. Its importance depends largely on the application domain, but when dealing with health information, this becomes a more serious issue. A simpler approach to avoid inadequate disclosure is to ensure that all data that can be associated directly with an individual is removed from the original dataset. However, some studies have shown that simple anonymisation procedures can sometimes be reversed using specific patient characteristics. In this work, we propose a secure architecture to share information from distributed databases without compromising the subjects' privacy. The anonymiser system was validated using the OMOP CDM data schema, which is widely adopted in observational research studies.


Subjects
Personally Identifiable Information , Privacy , Databases, Factual , Humans
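The "simpler approach" described above, dropping direct identifiers and replacing the person id with a salted pseudonym, can be sketched as follows. The field names are invented, and as the abstract itself warns, this alone is not sufficient anonymisation; it is only the first layer of the architecture:

```python
import hashlib

# Hypothetical set of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}

def pseudonymise(record, salt):
    """Drop direct identifiers and replace the person id with a salted
    hash, so the same patient links consistently across shared extracts
    without being re-identifiable from the shared data alone."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["person_id"] = hashlib.sha256(
        (salt + str(record["person_id"])).encode()
    ).hexdigest()[:16]
    return out

row = {"person_id": 123, "name": "Jane Doe", "phone": "555-0101",
       "year_of_birth": 1960}
safe = pseudonymise(row, salt="site-secret")
```

Keeping the salt private to the data holder means the pseudonym cannot be reproduced by a recipient, while quasi-identifiers such as `year_of_birth` still require the additional safeguards the paper's architecture addresses.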
12.
Stud Health Technol Inform ; 294: 585-586, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612156

ABSTRACT

Many clinical studies depend greatly on the efficient identification of relevant datasets. This selection can be performed in existing health data catalogues by searching the available metadata. The search process can be optimised through question-answering interfaces that help researchers explore the available data. However, when searching across distinct catalogues, the lack of metadata harmonisation imposes several bottlenecks. This paper presents a methodology that allows semantic search over several biomedical database catalogues by extracting the information using shared domain knowledge. The resulting pipeline allows the converted data to be published as FAIR endpoints, and it provides an end-user interface that accepts natural language questions.


Subjects
Metadata , Semantics , Databases, Factual , Language , Natural Language Processing
13.
J Biomed Inform ; 120: 103849, 2021 08.
Article in English | MEDLINE | ID: mdl-34214696

ABSTRACT

BACKGROUND: The content of the clinical notes that have been continuously collected along patients' health history has the potential to provide relevant information about treatments and diseases, and to increase the value of structured data available in Electronic Health Records (EHR) databases. EHR databases are currently being used in observational studies which lead to important findings in medical and biomedical sciences. However, the information present in clinical notes is not being used in those studies, since the computational analysis of this unstructured data is much more complex than that of structured data. METHODS: We propose a two-stage workflow for solving an existing gap in Extraction, Transformation and Loading (ETL) procedures regarding observational databases. The first stage of the workflow extracts prescriptions present in patients' clinical notes, while the second stage harmonises the extracted information into their standard definition and stores the resulting information in a common database schema used in observational studies. RESULTS: We validated this methodology using two distinct data sets, in which the goal was to extract and store drug related information in a new Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) database. We analysed the performance of the used annotator as well as its limitations. Finally, we described some practical examples of how users can explore these datasets once migrated to OMOP CDM databases. CONCLUSION: With this methodology, we were able to show a strategy for using the information extracted from the clinical notes in business intelligence tools, or for other applications such as data exploration through the use of SQL queries. Besides, the extracted information complements the data present in OMOP CDM databases which was not directly available in the EHR database.


Subjects
Electronic Health Records , Pharmaceutical Preparations , Databases, Factual , Delivery of Health Care , Humans , Workflow
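The two-stage workflow above can be sketched as a regex-based extraction pass followed by harmonisation into OMOP-style rows. The drug lexicon, concept ids, and row shape below are invented for illustration; the paper uses a proper clinical annotator, and real OMOP `drug_exposure` rows carry many more columns:

```python
import re

# Toy lexicon mapping drug names to (invented) concept ids.
DRUG_CONCEPTS = {"aspirin": 1112807, "metformin": 1503297}

RX = re.compile(r"\b(aspirin|metformin)\b\s+(\d+)\s*mg", re.IGNORECASE)

def extract_prescriptions(note):
    """Stage 1: pull (drug, dose) mentions out of free text."""
    return [(d.lower(), int(mg)) for d, mg in RX.findall(note)]

def to_drug_exposure(person_id, prescriptions):
    """Stage 2: harmonise the mentions into OMOP CDM
    drug_exposure-like rows keyed by standard concept ids."""
    return [
        {"person_id": person_id,
         "drug_concept_id": DRUG_CONCEPTS[drug],
         "dose_mg": dose}
        for drug, dose in prescriptions
    ]

note = "Patient started on Metformin 500 mg bid; continue aspirin 100mg."
rows = to_drug_exposure(42, extract_prescriptions(note))
```

Once the rows land in the CDM schema, they become queryable with the same SQL the structured EHR data already supports, which is the complementarity the conclusion highlights.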
14.
Bioinformatics ; 37(Suppl_1): i84-i92, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252946

ABSTRACT

MOTIVATION: The process of placing new drugs into the market is time-consuming, expensive and complex. The application of computational methods for designing molecules with bespoke properties can contribute to saving resources throughout this process. However, the fundamental properties to be optimized are often not considered or conflict with each other. In this work, we propose a novel approach to consider both the biological property and the bioavailability of compounds through a deep reinforcement learning framework for the targeted generation of compounds. We aim to obtain a promising set of selective compounds for the adenosine A2A receptor that simultaneously have the necessary properties in terms of solubility and permeability across the blood-brain barrier to reach the site of action. The cornerstone of the framework is based on a recurrent neural network architecture, the Generator, which seeks to learn the building rules of valid molecules in order to sample new compounds. In addition, two Predictors are trained to estimate the properties of interest of the new molecules. Finally, the fine-tuning of the Generator was performed with reinforcement learning, integrated with multi-objective optimization and exploratory techniques to ensure that the Generator is adequately biased. RESULTS: The biased Generator can generate an interesting set of molecules, with approximately 85% having the two fundamental properties biased as desired. Thus, this approach has transformed a general molecule generator into a model focused on optimizing specific objectives. Furthermore, the molecules' synthesizability and drug-likeness demonstrate the potential applicability of the de novo drug design in medicinal chemistry. AVAILABILITY AND IMPLEMENTATION: All code is publicly available at https://github.com/larngroup/De-Novo-Drug-Design. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Blood-Brain Barrier , Drug Design , Biological Transport , Neural Networks, Computer
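One standard way to combine two Predictor outputs into a single RL reward is weighted scalarisation of per-objective desirability scores. A sketch of that idea only: the desirability ranges, weights, and function names below are invented, not the paper's actual reward design:

```python
def desirability(value, low, high):
    """Map a predicted property to [0, 1]: 1 inside the desired
    range, decaying linearly to 0 outside it."""
    if low <= value <= high:
        return 1.0
    gap = low - value if value < low else value - high
    return max(0.0, 1.0 - gap / (high - low))

def reward(affinity_pki, log_bb, w=(0.5, 0.5)):
    """Weighted sum of two objectives, A2A affinity and blood-brain
    barrier permeability (logBB), as seen by the RL fine-tuning loop.
    The ranges below are illustrative placeholders."""
    r_aff = desirability(affinity_pki, 6.5, 10.0)
    r_bbb = desirability(log_bb, -0.3, 1.0)
    return w[0] * r_aff + w[1] * r_bbb

good = reward(affinity_pki=7.2, log_bb=0.2)   # both properties in range
poor = reward(affinity_pki=4.0, log_bb=-2.0)  # both properties out of range
```

Scalarisation is only one of several multi-objective strategies (alternating objectives and Pareto-based selection are others); the key point is that the Generator is rewarded only when both properties land in their desired windows.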
15.
Stud Health Technol Inform ; 281: 327-331, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042759

ABSTRACT

The process of refining the research question in a medical study depends greatly on the current background of the investigated subject. The information found in prior works can directly impact several stages of the study, namely the cohort definition stage. Besides previously published methods, researchers could also leverage other materials, such as the output of cohort selection tools, to enrich and accelerate their own work. However, this kind of information is not always captured by search engines. In this paper, we present a methodology, based on a combination of content-based retrieval and text annotation techniques, to identify relevant scientific publications related to a research question and to the selected data sources.


Subjects
Information Storage and Retrieval , Search Engine , Cohort Studies
16.
J Integr Bioinform ; 18(2): 101-110, 2021 May 20.
Article in English | MEDLINE | ID: mdl-34013675

ABSTRACT

With the continuous increase in the use of social networks, social mining is steadily becoming a powerful component of digital phenotyping. In this paper we explore social mining for the classification of self-diagnosed depressed users of the Reddit social network. We conduct a cross-evaluation study based on two public datasets in order to understand the impact of transfer learning when the data source is virtually the same. We further complement these results with an experiment of transfer learning in post-partum depression classification, using a corpus we collected for this purpose. Our findings show that transfer learning in social mining might still be at an early stage in computational research, and we thoroughly discuss its implications.


Subjects
Data Mining , Machine Learning
17.
Comput Biol Med ; 130: 104180, 2021 03.
Article in English | MEDLINE | ID: mdl-33360272

ABSTRACT

Privacy issues limit the analysis and cross-exploration of most distributed and private biobanks, often raised by the multiple dimensionality and sensitivity of the data associated with access restrictions and policies. These characteristics prevent collaboration between entities, constituting a barrier to emergent personalized and public health challenges, namely the discovery of new druggable targets, identification of disease-causing genetic variants, or the study of rare diseases. In this paper, we propose a semi-automatic methodology for the analysis of distributed and private biobanks. The strategies involved in the proposed methodology efficiently enable the creation and execution of unified genomic studies using distributed repositories, without compromising the information present in the datasets. We apply the methodology to a case study in the current Covid-19 pandemic, ensuring the combination of diagnostics from multiple entities while maintaining privacy through an identical procedure at each entity. Moreover, we show that the methodology follows a simple, intuitive, and practical scheme.


Subjects
Biological Specimen Banks , COVID-19 , Public Health , SARS-CoV-2 , Humans
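The core privacy pattern implied above, each entity runs the same local computation and only aggregates cross the institutional boundary, can be sketched as follows. The record shape and diagnosis codes are invented; real federated analyses add further safeguards such as minimum-count thresholds:

```python
def local_aggregate(records):
    """Runs inside each entity: count diagnoses per code, so only
    aggregate numbers ever leave the private repository."""
    counts = {}
    for r in records:
        counts[r["code"]] = counts.get(r["code"], 0) + 1
    return counts

def merge(aggregates):
    """Coordinator: combine per-site counts into a unified study
    without ever seeing patient-level rows."""
    total = {}
    for agg in aggregates:
        for code, n in agg.items():
            total[code] = total.get(code, 0) + n
    return total

# Toy patient-level data that never leaves its site.
site_a = [{"code": "U07.1"}, {"code": "U07.1"}, {"code": "J12.8"}]
site_b = [{"code": "U07.1"}]
combined = merge([local_aggregate(site_a), local_aggregate(site_b)])
```

Because every site executes the identical `local_aggregate` procedure, the merged result is equivalent to analysing a pooled dataset, which is what makes the unified study possible without data sharing.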
18.
Biomed Res Int ; 2020: 3041498, 2020.
Article in English | MEDLINE | ID: mdl-32908882

ABSTRACT

The Semantic Web and Linked Data concepts and technologies have empowered the scientific community with solutions to take full advantage of the increasingly available distributed and heterogeneous data in distinct silos. Additionally, FAIR Data principles established guidelines for data to be Findable, Accessible, Interoperable, and Reusable, and they are gaining traction in data stewardship. However, to explore their full potential, we must be able to transform legacy solutions smoothly into the FAIR Data ecosystem. In this paper, we introduce SCALEUS-FD, a FAIR Data extension of a legacy semantic web tool successfully used for data integration and semantic annotation and enrichment. The core functionalities of the solution follow the Semantic Web and Linked Data principles, offering a FAIR REST API for machine-to-machine operations. We applied a set of metrics to evaluate its "FAIRness" and created an application scenario in the rare diseases domain.


Subjects
Semantic Web , Software , Big Data , Biological Ontologies , Biological Science Disciplines/statistics & numerical data , Computational Biology , Databases, Factual , Humans , Internet , Metadata , Semantics
19.
BioData Min ; 13: 8, 2020.
Article in English | MEDLINE | ID: mdl-32670412

ABSTRACT

BACKGROUND: Heart disease is the leading cause of death worldwide. Knowing a gene expression signature in heart disease can lead to the development of more efficient diagnosis and treatments that may prevent premature deaths. A large amount of microarray data is available in public repositories and can be used to identify differentially expressed genes. However, most of the microarray datasets are composed of a reduced number of samples and to obtain more reliable results, several datasets have to be merged, which is a challenging task. The identification of differentially expressed genes is commonly done using statistical methods. Nonetheless, these methods are based on the definition of an arbitrary threshold to select the differentially expressed genes and there is no consensus on the values that should be used. RESULTS: Nine publicly available microarray datasets from studies of different heart diseases were merged to form a dataset composed of 689 samples and 8354 features. Subsequently, the adjusted p-value and fold change were determined and by combining a set of adjusted p-values cutoffs with a list of different fold change thresholds, 12 sets of differentially expressed genes were obtained. To select the set of differentially expressed genes that has the best accuracy in classifying samples from patients with heart diseases and samples from patients with no heart condition, the random forest algorithm was used. A set of 62 differentially expressed genes having a classification accuracy of approximately 95% was identified. CONCLUSIONS: We identified a gene expression signature common to different cardiac diseases and supported our findings by showing their involvement in the pathophysiology of the heart. The approach used in this study is suitable for the identification of gene expression signatures, and can be extended to different diseases.

20.
Stud Health Technol Inform ; 270: 1183-1184, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570570

ABSTRACT

Aiming to better understand the genetic and environmental associations of Alzheimer's disease, many clinical trials and scientific studies have been conducted. However, these studies are often based on a small number of participants. To address this limitation, there is an increasing demand for multi-cohort studies, which can provide higher statistical power and clinical evidence. However, this data integration implies dealing with the diversity of cohort structures and the wide variability of concepts. Moreover, discovering similar cohorts to extend a running study is typically a demanding task. In this paper, we present a recommendation system that allows finding similar cohorts based on profile interests. The method uses collaborative filtering mixed with content-based retrieval techniques to find relevant cohorts in the scientific literature about Alzheimer's disease. The method was validated on a set of 62 cohorts.


Subjects
Algorithms , Alzheimer Disease , Humans
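The content-based side of the recommendation approach above can be sketched as cosine similarity between term-weight profiles of a researcher's interests and candidate cohorts. The vocabulary, weights, and cohort names below are invented for illustration; the actual system also mixes in collaborative-filtering signals:

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight profiles."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(profile, cohorts, k=2):
    """Rank candidate cohorts by similarity to the researcher's
    profile of interests and return the top k."""
    scored = sorted(
        cohorts.items(), key=lambda kv: cosine(profile, kv[1]), reverse=True
    )
    return [name for name, _ in scored[:k]]

# Toy profiles built from (hypothetical) literature term weights.
profile = {"alzheimer": 1.0, "memory": 0.5}
cohorts = {
    "ADNI-like": {"alzheimer": 0.9, "imaging": 0.4},
    "CardioCohort": {"heart": 1.0, "failure": 0.8},
    "MemoryStudy": {"memory": 1.0, "alzheimer": 0.2},
}
top = recommend(profile, cohorts)
```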