Results 1 - 20 of 79
1.
Infect Dis Model ; 10(1): 110-128, 2025 Mar.
Article in English | MEDLINE | ID: mdl-39376223

ABSTRACT

The level of surveillance and preparedness against epidemics varies across countries, resulting in different responses to outbreaks. An in-depth analysis of infection dynamics must therefore account for the substantial heterogeneity across countries. However, many commonly used statistical model specifications lack the flexibility needed for sound and accurate analysis and prediction in such contexts. Nonlinear mixed effects models (NLMMs) are a statistical tool that can overcome these challenges. While compartmental models are well established in infectious disease modeling and have seen significant advancements, NLMMs offer a flexible approach for handling heterogeneous and unbalanced repeated measures data, often with less computational effort than some individual-level compartmental modeling techniques. This study provides an overview of their current use and a foundation for developing guidelines to improve their implementation in real-world situations. Relevant scientific databases in the Research4life Access initiative programs were searched for papers dealing with key aspects of NLMMs in infectious disease modeling (IDM). From an initial list of 3641 papers, 124 were included in this systematic and critical review spanning the last two decades, conducted following the PRISMA guidelines. NLMMs have evolved rapidly in the last decade, especially in IDM, with most publications dating from 2017 to 2021 (83.33%). The routine normality assumption appeared inappropriate for IDM, leading to a wealth of literature on NLMMs with non-normal errors and random effects under various estimation methods. We noticed that NLMMs have attracted much attention for the latest known epidemics worldwide (COVID-19, Ebola, Dengue, and Lassa), owing to the robustness and reliability gained by relaxing the normality assumption. A case study applying these models to COVID-19 data highlights the performance of NLMMs in modeling infectious diseases. From this review, estimation methods, distributional assumptions, and the specification of random terms emerge as key aspects requiring particular attention when applying NLMMs in IDM.
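The review reports no code, but the central idea, nonlinear growth curves whose parameters vary across countries, can be sketched with a simple two-stage approximation to an NLMM: fit a logistic curve per country, then summarize the between-country spread of the fitted parameters. All data and parameter values below are simulated for illustration; this is a crude stand-in, not a full mixed-effects estimator.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Logistic growth curve: K = final size, r = growth rate, t0 = midpoint."""
    return K / (1.0 + np.exp(-r * (t - t0)))

rng = np.random.default_rng(0)
t = np.linspace(0, 60, 61)

# Stage 1: simulate 5 "countries" whose true parameters vary
# (the country-level random effects), and fit each country separately.
true_K = rng.normal(1000, 150, size=5)
true_r = rng.normal(0.25, 0.03, size=5)
fits = []
for K, r in zip(true_K, true_r):
    y = logistic(t, K, r, 30) + rng.normal(0, 10, size=t.size)
    popt, _ = curve_fit(logistic, t, y, p0=[800.0, 0.2, 25.0], maxfev=10000)
    fits.append(popt)
fits = np.array(fits)

# Stage 2: population-level summary of the per-country estimates,
# a rough analogue of the fixed effect and random-effect spread.
print("K: mean", fits[:, 0].mean(), "sd", fits[:, 0].std(ddof=1))
```

A genuine NLMM would estimate the population parameters and the random-effect variances jointly rather than in two stages, which is exactly where the estimation-method choices discussed in the review come in.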

2.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39256197

ABSTRACT

Unraveling the intricate network of associations among microRNAs (miRNAs), genes, and diseases is pivotal for deciphering molecular mechanisms, refining disease diagnosis, and crafting targeted therapies. Computational strategies, leveraging link prediction within biological graphs, present a cost-efficient alternative to high-cost empirical assays. However, while many methods excel at predicting specific associations, such as miRNA-disease associations (MDAs), miRNA-target interactions (MTIs), and disease-gene associations (DGAs), a holistic approach harnessing diverse data sources for multifaceted association prediction remains largely unexplored. The limited availability of high-quality data, since in vitro experiments to comprehensively confirm associations are often expensive and time-consuming, results in a sparse and noisy heterogeneous graph, hindering accurate prediction of these complex associations. To address this challenge, we propose a novel framework called Global-local aware Heterogeneous Graph Contrastive Learning (GlaHGCL). GlaHGCL combines global and local contrastive learning to improve node embeddings in the heterogeneous graph. In particular, global contrastive learning enhances the robustness of node embeddings against noise by aligning global representations of the original graph and its augmented counterpart. Local contrastive learning enforces representation consistency between functionally similar or connected nodes across diverse data sources, effectively leveraging data heterogeneity and mitigating the issue of data scarcity. The refined node representations are applied to downstream tasks, such as MDA, MTI, and DGA prediction. Experiments show GlaHGCL outperforming state-of-the-art methods, and case studies further demonstrate its ability to accurately uncover new associations among miRNAs, genes, and diseases. We have made the datasets and source code publicly available at https://github.com/Sue-syx/GlaHGCL.


Subjects
Computational Biology , Gene Regulatory Networks , MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Machine Learning , Algorithms , Genetic Predisposition to Disease
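The abstract describes global contrastive learning as aligning representations of the original graph and an augmented counterpart. A minimal sketch of the generic InfoNCE objective that underlies such methods (not the authors' GlaHGCL implementation; the embeddings below are random stand-ins) is:

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """InfoNCE loss between two views: row i of z1 should match row i of z2.

    z1, z2: (n, d) L2-normalised node embeddings of the same nodes under
    two graph augmentations; positives sit on the diagonal."""
    sim = z1 @ z2.T / tau                      # pairwise cosine similarities
    sim -= sim.max(axis=1, keepdims=True)      # numerical stability
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))

rng = np.random.default_rng(1)
z = rng.normal(size=(8, 16))
z /= np.linalg.norm(z, axis=1, keepdims=True)
noise = rng.normal(scale=0.05, size=z.shape)   # mimic a mild augmentation
z_aug = z + noise
z_aug /= np.linalg.norm(z_aug, axis=1, keepdims=True)

aligned = info_nce(z, z_aug)           # matched views -> small loss
shuffled = info_nce(z, z_aug[::-1])    # mismatched views -> large loss
print(aligned, shuffled)
```

Minimising this loss pulls the two views of each node together while pushing apart views of different nodes, which is the noise-robustness mechanism the abstract alludes to.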
3.
Entropy (Basel) ; 26(9)2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39330088

ABSTRACT

In recent years, urban floods have occurred frequently in China. Therefore, there is an urgent need to strengthen urban flood resilience. This paper proposes a hybrid multi-criteria group decision-making method to assess urban flood resilience based on heterogeneous data, group decision-making methodologies, the pressure-state-response model, and social-economic-natural complex ecosystem theory (PSR-SENCE model). A qualitative and quantitative indicator system is formulated using the PSR-SENCE model. Additionally, a new weighting method for indicators, called the synthesis weighting-group analytic hierarchy process (SW-GAHP), is proposed by considering both intrapersonal consistency and interpersonal consistency of decision-makers. Furthermore, an extensional group decision-making technology (EGDMT) based on heterogeneous data is proposed to evaluate qualitative indicators. The flexible parameterized mapping function (FPMF) is introduced for the evaluation of quantitative indicators. The normal cloud model is employed to handle various uncertainties associated with heterogeneous data. The evaluations for Beijing from 2017 to 2021 reveal a consistent annual improvement in urban flood resilience, with a 14.1% increase. Subsequently, optimization recommendations are presented not only for favorable indicators such as regional economic status, drainability, and public transportation service capacity but also for unfavorable indicators like flood risk and population density. This provides a theoretical foundation and a guide for making decisions about the improvement of urban flood resilience. Finally, comparative and sensitivity analyses demonstrate the superiority and robustness of the proposed method.
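The SW-GAHP scheme itself is not specified in the abstract; its standard building block, analytic-hierarchy-process weights taken from the principal eigenvector of a pairwise comparison matrix, can be sketched as follows. The comparison values are hypothetical, and the consistency index shown is the raw CI, not the full consistency ratio.

```python
import numpy as np

def ahp_weights(A):
    """Indicator weights from a pairwise comparison matrix A (classic AHP).

    A[i, j] > 1 means indicator i is judged more important than j;
    A must be positive and reciprocal (A[j, i] == 1 / A[i, j])."""
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)                  # principal eigenvalue
    w = np.abs(vecs[:, k].real)
    w /= w.sum()                              # normalise to sum to 1
    n = A.shape[0]
    ci = (vals.real[k] - n) / (n - 1)         # consistency index
    return w, ci

# One decision-maker's judgements over three flood-resilience indicators
# (hypothetical: drainability vs. economic status vs. population density).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w, ci = ahp_weights(A)
print(w, ci)
```

A group version would aggregate several such matrices, which is where the intrapersonal/interpersonal consistency considerations of SW-GAHP would enter.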

4.
J Transl Med ; 22(1): 873, 2024 Sep 28.
Article in English | MEDLINE | ID: mdl-39342319

ABSTRACT

BACKGROUND: In the management of complex diseases, the strategic adoption of combination therapy has gained considerable prominence. Combination therapy not only holds the potential to enhance treatment efficacy but also to alleviate the side effects caused by excessive use of a single drug. Presently, the exploration of combination therapy encounters significant challenges due to the vast spectrum of potential drug combinations, necessitating the development of efficient screening strategies. METHODS: In this study, we propose a prediction scoring method that integrates heterogeneous data using a weighted Bayesian method for drug combination prediction. Heterogeneous data refers to different types of data related to drugs, such as chemical, pharmacological, and target profiles. By constructing a multiplex drug similarity network, we formulate new features for drug pairs and propose a novel Bayesian-based integration scheme with the introduction of weights to integrate information from various sources. This method yields support strength scores for drug combinations to assess their potential effectiveness. RESULTS: Upon comprehensive comparison with other methods, our method shows superior performance across multiple metrics, including the Area Under the Receiver Operating Characteristic Curve, accuracy, precision, and recall. Furthermore, literature validation shows that many top-ranked drug combinations based on the support strength score, such as goserelin and letrozole, have been experimentally or clinically validated for their effectiveness. CONCLUSIONS: Our findings have significant clinical and practical implications. This new method enhances the performance of drug combination predictions, enabling effective pre-screening for trials and, thereby, benefiting clinical treatments. Future research should focus on developing new methods for application in various scenarios and for integrating diverse data sources.


Subjects
Bayes Theorem , Humans , Drug Combinations , ROC Curve , Reproducibility of Results , Drug Therapy, Combination
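The abstract does not give the scoring formula; one common weighted Bayesian-style integration sums weighted log-likelihood ratios across evidence sources. The sketch below assumes hypothetical likelihood ratios and weights, and is not the paper's exact scheme.

```python
import numpy as np

def support_score(evidence, weights):
    """Weighted log-odds integration of heterogeneous evidence.

    evidence: dict source -> likelihood ratio, i.e. P(observed similarity |
    combination effective) / P(observed similarity | not effective).
    weights: dict source -> reliability weight for that evidence source.
    A naive-Bayes-style sketch: independent sources, weighted in log space."""
    return sum(weights[s] * np.log(lr) for s, lr in evidence.items())

# Hypothetical drug pair scored from three drug-similarity networks.
evidence = {"chemical": 4.0, "pharmacological": 2.5, "target": 1.2}
weights = {"chemical": 0.5, "pharmacological": 0.3, "target": 0.2}
score = support_score(evidence, weights)
print(round(score, 3))
```

A positive score indicates that, on balance, the weighted evidence favours the combination being effective; ranking candidate pairs by this score gives the pre-screening list the abstract describes.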
5.
Biometrics ; 80(3)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39248121

ABSTRACT

Recent years have witnessed a rise in the popularity of information integration without sharing of raw data. By leveraging and incorporating summary information from external sources, internal studies can achieve enhanced estimation efficiency and prediction accuracy. However, a noteworthy challenge in utilizing summary-level information is accommodating the inherent heterogeneity across diverse data sources. In this study, we delve into the issue of prior probability shift between two cohorts, wherein the difference between the two data distributions depends on the outcome. We introduce a novel semi-parametric constrained optimization-based approach to integrate information within this framework, which has not been extensively explored in the existing literature. Our proposed method tackles the prior probability shift by introducing an outcome-dependent selection function and effectively addresses the estimation uncertainty associated with summary information from the external source. Our approach facilitates valid inference even in the absence of a known variance-covariance estimate from the external source. Through extensive simulation studies, we observe the superiority of our method over existing ones, showcasing minimal estimation bias and reduced variance for both binary and continuous outcomes. We further demonstrate the utility of our method through its application in investigating risk factors related to essential hypertension, where reduced estimation variability is observed after integrating summary information from an external data source.


Subjects
Computer Simulation , Essential Hypertension , Probability , Humans , Models, Statistical , Risk Factors , Hypertension , Data Interpretation, Statistical , Biometry/methods
6.
Biom J ; 66(6): e202300198, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39162085

ABSTRACT

Lesion-symptom mapping studies provide insight into what areas of the brain are involved in different aspects of cognition. This is commonly done via behavioral testing in patients with a naturally occurring brain injury or lesions (e.g., strokes or brain tumors). This results in high-dimensional observational data where lesion status (present/absent) is nonuniformly distributed, with some voxels having lesions in very few (or no) subjects. In this situation, mass univariate hypothesis tests have severe power heterogeneity where many tests are known a priori to have little to no power. Recent advancements in multiple testing methodologies allow researchers to weigh hypotheses according to side information (e.g., information on power heterogeneity). In this paper, we propose the use of p-value weighting for voxel-based lesion-symptom mapping studies. The weights are created using the distribution of lesion status and spatial information to estimate different non-null prior probabilities for each hypothesis test through some common approaches. We provide a monotone minimum weight criterion, which requires minimum a priori power information. Our methods are demonstrated on dependent simulated data and an aphasia study investigating which regions of the brain are associated with the severity of language impairment among stroke survivors. The results demonstrate that the proposed methods have robust error control and can increase power. Further, we showcase how weights can be used to identify regions that are inconclusive due to lack of power.


Subjects
Biometry , Humans , Biometry/methods , Aphasia/physiopathology , Brain/diagnostic imaging , Brain Mapping/methods , False Positive Reactions
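The generic mechanism the paper builds on, p-value weighting, is easy to state: with nonnegative weights averaging 1, rejecting hypothesis i when p_i ≤ w_i·α/m still controls the family-wise error rate at level α. A minimal weighted-Bonferroni sketch with toy p-values and weights (the paper's weights instead come from lesion-distribution and spatial information):

```python
import numpy as np

def weighted_bonferroni(pvals, weights, alpha=0.05):
    """Weighted Bonferroni: reject H_i when p_i <= w_i * alpha / m.

    weights must be nonnegative with mean 1, so the procedure still
    controls the family-wise error rate at level alpha."""
    pvals = np.asarray(pvals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    assert abs(weights.mean() - 1.0) < 1e-9, "weights must average 1"
    m = pvals.size
    return pvals <= weights * alpha / m

# Four voxels: the first two are judged a priori well-powered (weight 1.8),
# the last two poorly powered (weight 0.2).
p = [0.010, 0.030, 0.012, 0.200]
w = [1.8, 1.8, 0.2, 0.2]
print(weighted_bonferroni(p, w))
```

Note that voxel 3 (p = 0.012) would be rejected by unweighted Bonferroni (threshold 0.0125) but not here: down-weighting low-power voxels spends the error budget where power exists, which is the trade-off the paper exploits.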
7.
Stud Health Technol Inform ; 316: 1385-1389, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176639

ABSTRACT

Interoperability is crucial to overcoming various challenges of data integration in the healthcare domain. While OMOP and FHIR data standards handle syntactic heterogeneity among heterogeneous data sources, ontologies support semantic interoperability to overcome the complexity and disparity of healthcare data. This study proposes an ontological approach in the context of the EUCAIM project to support semantic interoperability among distributed big data repositories that have applied heterogeneous cancer image data models using a semantically well-founded Hyperontology for the oncology domain.


Subjects
Semantics , Humans , Biological Ontologies , Health Information Interoperability , Medical Oncology , Neoplasms , Big Data
8.
Health Inf Sci Syst ; 12(1): 37, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38974364

ABSTRACT

Obtaining high-quality data sets from raw data is a key step before data exploration and analysis. Nowadays, in the medical domain, a large amount of data needs quality improvement before being used to analyze the health condition of patients. There has been extensive research on data extraction, data cleaning, and data imputation individually. However, few frameworks integrate all three techniques, leaving datasets lacking in accuracy, consistency, and integrity. In this paper, a multi-source heterogeneous data enhancement framework based on a lakehouse, MHDP, is proposed, which comprises three steps: data extraction, data cleaning, and data imputation. In the data extraction step, a data fusion technique is offered to handle multi-modal and multi-source heterogeneous data. In the data cleaning step, we propose HoloCleanX, which provides a convenient interactive procedure. In the data imputation step, multiple imputation (MI) and the state-of-the-art algorithm SAITS are applied for different situations. We evaluate our framework via three tasks: clustering, classification, and strategy prediction. The experimental results prove the effectiveness of our data enhancement framework.
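As an illustration of the multiple-imputation step only (the HoloCleanX and SAITS components are not reproduced here), scikit-learn's IterativeImputer can draw several plausible completions whose spread reflects imputation uncertainty. This assumes scikit-learn is installed; the patient table is a toy example.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy patient table: rows = patients, columns = two lab measurements.
X = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, np.nan],
              [4.0, 8.2],
              [np.nan, 6.0]])

# Multiple imputation: m completed datasets sampled from the posterior
# predictive distribution; their variation reflects imputation uncertainty.
completed = [
    IterativeImputer(sample_posterior=True, random_state=i).fit_transform(X)
    for i in range(5)
]
pooled = np.mean(completed, axis=0)   # pooled point estimate across draws
print(pooled)
```

In a full MI analysis one would fit the downstream model to each completed dataset and pool estimates with Rubin's rules rather than averaging the data, but the mechanism of drawing multiple completions is the same.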

9.
Comput Methods Programs Biomed ; 254: 108294, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38943984

ABSTRACT

BACKGROUND AND OBJECTIVE: Recent advancements in brain-computer interface (BCI) technology have seen a significant shift towards incorporating complex decoding models such as deep neural networks (DNNs) to enhance performance. These models are particularly crucial for sophisticated tasks such as regression for decoding arbitrary movements. However, BCI models trained and tested on individual data often face challenges with limited performance and generalizability across different subjects. This limitation is primarily due to the tremendous number of parameters of DNN models. Training complex models demands extensive datasets. Nevertheless, group data from many subjects may not produce sufficient decoding performance because of inherent variability in neural signals both across individuals and over time. METHODS: To address these challenges, this study proposed a transfer learning approach that could effectively adapt to subject-specific variability in cortical regions. Our method involved training two separate movement decoding models: one on individual data and another on pooled group data. We then created a salience map for each cortical region from the individual model, which helped us identify the input's contribution variance across subjects. Based on the contribution variance, we combined individual and group models using a modified knowledge distillation framework. This approach allowed the group model to be universally applicable by assigning greater weights to input data, while the individual model was fine-tuned to focus on areas with significant individual variance. RESULTS: Our combined model effectively encapsulated individual variability. We validated this approach with nine subjects performing arm-reaching tasks, with our method outperforming (mean correlation coefficient, r = 0.75) both individual (r = 0.70) and group models (r = 0.40) in decoding performance. In particular, there were notable improvements in cases where individual models showed low performance (e.g., r = 0.50 in the individual decoder to r = 0.61 in the proposed decoder). CONCLUSIONS: These results not only demonstrate the potential of our method for robust BCI, but also underscore its ability to generalize individual data for broader applicability.


Subjects
Brain-Computer Interfaces , Humans , Neural Networks, Computer , Electroencephalography , Movement/physiology , Algorithms , Brain/physiology , Machine Learning , Male , Adult
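The modified knowledge-distillation framework is not fully specified in the abstract. One plausible reading, blending individual and group decoder weights per feature according to how much that feature's salience varies across subjects, can be sketched as follows; all weights and salience values are simulated stand-ins for trained models, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(2)
n_feat = 6

# Linear decoder weights learned on individual vs. pooled group data
# (hypothetical numbers standing in for trained DNN decoders).
w_indiv = rng.normal(size=n_feat)
w_group = rng.normal(size=n_feat)

# Salience variance of each input feature across subjects: high variance
# means the feature's contribution is subject-specific.
sal_var = np.array([0.9, 0.1, 0.5, 0.05, 0.7, 0.2])
lam = sal_var / sal_var.max()            # gate in [0, 1], high = individual

# Combined decoder: subject-specific where salience varies strongly,
# group-level where the feature behaves consistently across subjects.
w_comb = lam * w_indiv + (1 - lam) * w_group

x = rng.normal(size=n_feat)              # one EEG feature vector
print(w_comb @ x)                        # decoded movement value
```

The gating direction matches the abstract's description: the group model dominates where inputs generalize, while the individual model is trusted in regions of significant individual variance.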
10.
Comput Biol Med ; 178: 108742, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38875908

ABSTRACT

In recent years, there has been a significant improvement in the accuracy of the classification of pigmented skin lesions using artificial intelligence algorithms. Intelligent analysis and classification systems are significantly superior to visual diagnostic methods used by dermatologists and oncologists. However, the application of such systems in clinical practice is severely limited due to a lack of generalizability and risks of potential misclassification. Successful implementation of artificial intelligence-based tools into clinicopathological practice requires a comprehensive study of the effectiveness and performance of existing models, as well as further promising areas for potential research development. The purpose of this systematic review is to investigate and evaluate the accuracy of artificial intelligence technologies for detecting malignant forms of pigmented skin lesions. For the study, 10,589 scientific research and review articles were selected from electronic scientific publishers, of which 171 articles were included in the presented systematic review. All selected scientific articles are distributed according to the proposed neural network algorithms from machine learning to multimodal intelligent architectures and are described in the corresponding sections of the manuscript. This research aims to explore automated skin cancer recognition systems, from simple machine learning algorithms to multimodal ensemble systems based on advanced encoder-decoder models, visual transformers (ViT), and generative and spiking neural networks. In addition, as a result of the analysis, future directions of research, prospects, and potential for further development of automated neural network systems for classifying pigmented skin lesions are discussed.


Subjects
Artificial Intelligence , Neural Networks, Computer , Skin Neoplasms , Humans , Skin Neoplasms/classification , Skin Neoplasms/diagnosis , Skin Neoplasms/pathology , Diagnosis, Computer-Assisted/methods , Algorithms , Machine Learning
11.
J Neurol Sci ; 462: 123091, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-38870732

ABSTRACT

Sex differences affect Parkinson's disease (PD) development and manifestation. Yet, current PD identification and treatments underuse these distinctions. Sex-focused PD literature often prioritizes prevalence rates over feature importance analysis. However, underlying aspects could make a feature significant for predicting PD, despite its score. Interactions between features require consideration, as do distinctions between scoring disparities and actual feature importance. For instance, a higher score in males for a certain feature doesn't necessarily mean it's less important for characterizing PD in females. This article proposes an explainable Machine Learning (ML) model to elucidate these underlying factors, emphasizing the importance of features. This insight could be critical for personalized medicine, suggesting the need to tailor data collection and analysis for males and females. The model identifies sex-specific differences in PD, aiding in predicting outcomes as "Healthy" or "Pathological". It adopts a system-level approach, integrating heterogeneous data - clinical, imaging, genetics, and demographics - to study new biomarkers for diagnosis. The explainable ML approach aids non-ML experts in understanding model decisions, fostering trust and facilitating interpretation of complex ML outcomes, thus enhancing usability and translational research. The ML model identifies muscle rigidity, autonomic and cognitive assessments, and family history as key contributors to PD diagnosis, with sex differences noted. The genetic variant SNCA-rs356181 may be more significant in characterizing PD in males. Interaction analysis reveals a greater occurrence of feature interplay among males compared to females. These disparities offer insights into PD pathophysiology and could guide the development of sex-specific diagnostic and therapeutic approaches.


Subjects
Machine Learning , Parkinson Disease , Female , Humans , Male , Parkinson Disease/genetics , Parkinson Disease/diagnosis , Parkinson Disease/epidemiology , Parkinson Disease/physiopathology , Sex Factors
12.
Environ Monit Assess ; 196(7): 594, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38833077

ABSTRACT

For the suitability assessment of forest land resources, a consistent fuzzy assessment method for heterogeneous information is proposed. Firstly, formulas for transforming large-scale real data and interval data into fuzzy numbers are provided. To derive a unified representation of multi-granularity linguistic assessment information, a fuzzy quantitative transformation for multi-granularity uncertain linguistic information is proposed. Proofs of the desirable properties and normalization formulas for the trapezoidal fuzzy numbers are presented as well. Next, the objective weight of each assessment indicator is determined by calculating the Jaccard-Cosine similarity between the trapezoidal fuzzy numbers. Moreover, the trapezoidal fuzzy numbers corresponding to the comprehensive assessment values of each alternative are obtained. The alternatives are effectively ranked according to the distance from the centroid of the trapezoidal fuzzy number to the origin. Finally, based on the proposed consistent fuzzy assessment method, the suitability assessment of forest land resources is achieved under a multi-source heterogeneous data setting.


Subjects
Conservation of Natural Resources , Environmental Monitoring , Forests , Fuzzy Logic , Environmental Monitoring/methods , Conservation of Natural Resources/methods
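The abstract's final ranking step, distance from the centroid of a trapezoidal fuzzy number to the origin, can be sketched with the standard centroid formulas for a trapezoidal fuzzy number (a, b, c, d) with support [a, d] and core [b, c]. The alternatives below are hypothetical; the paper's own aggregation and weighting steps are not reproduced.

```python
import numpy as np

def centroid(a, b, c, d):
    """Centroid (x, y) of a trapezoidal fuzzy number a <= b <= c <= d
    (membership rises on [a, b], is 1 on [b, c], falls on [c, d])."""
    if d == a:                        # degenerate crisp number
        return a, 0.5
    x = ((c**2 + d**2 + c * d) - (a**2 + b**2 + a * b)) \
        / (3.0 * ((c + d) - (a + b)))
    y = (2 * (c - b) + (d - a)) / (3.0 * ((c - b) + (d - a)))
    return x, y

def rank_alternatives(tfns):
    """Rank by distance of the centroid from the origin (larger = better)."""
    d = [np.hypot(*centroid(*t)) for t in tfns]
    return sorted(range(len(tfns)), key=lambda i: -d[i]), d

# Hypothetical comprehensive assessments of three forest-land alternatives.
alts = [(0.2, 0.3, 0.4, 0.5), (0.5, 0.6, 0.7, 0.8), (0.1, 0.4, 0.6, 0.9)]
order, dist = rank_alternatives(alts)
print(order, [round(v, 3) for v in dist])
```

Sanity check: for a symmetric trapezoid such as (0.2, 0.3, 0.4, 0.5) the centroid's x-coordinate is the midpoint 0.35, as the formula requires.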
13.
Comput Biol Med ; 170: 107937, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38217975

ABSTRACT

Heterogeneous data, especially mixtures of numerical and categorical data, are widespread in bioinformatics. Most existing work focuses on defining new distance metrics rather than learning discriminative metrics for mixed data. Here, we create a new support vector heterogeneous metric learning framework for mixed data. A heterogeneous sample-pair kernel is defined for mixed data, and metric learning is then converted into a sample-pair classification problem. The suggested approach lends itself well to effective resolution through conventional support vector machine solvers. Empirical assessments conducted on mixed-data benchmarks and cancer datasets affirm the exceptional efficacy of the proposed modeling technique.


Subjects
Algorithms , Computational Biology , Support Vector Machine
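The heterogeneous sample-pair kernel is not given explicitly in the abstract. A minimal stand-in multiplies an RBF kernel on the numeric part by a simple matching kernel on the categorical part, then symmetrizes over the two orderings of a pair, so that metric learning becomes pair classification ("same class" vs. "different class") solvable by a standard SVM. Everything below, including the kernel form and the toy data, is an illustrative assumption, not the paper's construction.

```python
import numpy as np
from sklearn.svm import SVC

def mixed_kernel(u, v, gamma=1.0):
    """Similarity of two mixed samples u, v = (numeric_tuple, categorical_tuple):
    RBF on the numeric parts times simple matching on the categorical parts."""
    (nu, cu), (nv, cv) = u, v
    rbf = np.exp(-gamma * np.sum((np.asarray(nu) - np.asarray(nv)) ** 2))
    match = np.mean([a == b for a, b in zip(cu, cv)])
    return rbf * (0.5 + 0.5 * match)

def pair_kernel(p, q, gamma=1.0):
    """Kernel between sample *pairs*, symmetric in the order within each pair."""
    return (mixed_kernel(p[0], q[0], gamma) * mixed_kernel(p[1], q[1], gamma)
            + mixed_kernel(p[0], q[1], gamma) * mixed_kernel(p[1], q[0], gamma))

# Tiny toy set: one numeric feature and one categorical feature per sample.
samples = [((0.0,), ("A",)), ((0.1,), ("A",)), ((5.0,), ("B",)), ((5.2,), ("B",))]
labels = [0, 0, 1, 1]
pairs = [(samples[i], samples[j]) for i in range(4) for j in range(i + 1, 4)]
y = [int(labels[i] == labels[j]) for i in range(4) for j in range(i + 1, 4)]

# Precomputed Gram matrix over pairs -> conventional SVM solver.
G = np.array([[pair_kernel(p, q) for q in pairs] for p in pairs])
clf = SVC(kernel="precomputed").fit(G, y)
print(clf.predict(G))
```

The symmetrized product keeps the pair kernel positive semidefinite whenever the base kernel is, which is what lets an off-the-shelf SVM solver do the heavy lifting.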
14.
Article in Chinese | WPRIM (Western Pacific) | ID: wpr-1026846

ABSTRACT

Objective: To explore the effects of different acupoints, different target organs, and different interventions on acupoint efficacy based on the ACU&MOX-DATA platform, and to illustrate and visualize whether these factors show the characteristics of a "specific effect" or a "common effect" of acupoint efficacy. Methods: Multi-source heterogeneous data were integrated from original omics data and public omics data. After standardization, differential gene analysis, disease pathology network analysis, and enrichment analysis were performed using the Batch Search and Stimulation Mode modules of the ACU&MOX-DATA platform under the conditions of different acupoints, different target organs, and different interventions. Results: Under the same disease state and the same intervention, there were differences in effects among different acupoints; under the same disease state, acupoint, and intervention, the responses produced by different target organs were not completely consistent; and under the same disease state and acupoint, there were differences in effects among different intervention measures. Conclusion: Based on the analysis of the ACU&MOX-DATA platform, it is preliminarily clear that acupoints, target organs, and interventions are key factors affecting acupoint efficacy. The results also indicate that acupoint efficacy has specific or common regulatory characteristics. Applying the ACU&MOX-DATA platform to analyze and visualize critical scientific problems in the field of acupuncture and moxibustion can provide references for deepening acupoint cognition, guiding clinical acupoint selection, and improving clinical efficacy.

15.
J Biomed Inform ; 149: 104579, 2024 01.
Article in English | MEDLINE | ID: mdl-38135173

ABSTRACT

With the emergence of health data warehouses and major initiatives to collect and analyze multi-modal and multisource data, data organization becomes central. In the PACIFIC-PRESERVED (PhenomApping, ClassIFication, and Innovation for Cardiac Dysfunction - Heart Failure with PRESERVED LVEF Study, NCT04189029) study, a data-driven research project aiming at redefining and profiling Heart Failure with preserved Ejection Fraction (HFpEF), an ontology was developed by data experts in cardiology to enable better data management in a complex study context (multisource, multiformat, multimodality, multipartner). The PACIFIC ontology provides a cardiac data management framework for the phenomapping of patients. It was built upon the BMS-LM (Biomedical Study - Lifecycle Management) core ontology and framework, proposed in a previous work to ensure data organization and provenance throughout the study lifecycle (specification, acquisition, analysis, publication). The BMS-LM design pattern was applied to the PACIFIC multisource variables. In addition, data were structured using a subset of MeSH headings for diseases, technical procedures, and biological processes, and using the Uberon ontology for anatomical entities. A total of 1372 variables were organized and enriched with annotations and descriptions from existing ontologies and taxonomies such as LOINC to enable later semantic interoperability. Both the data structuring using the BMS-LM framework and its mapping to published standards foster the interoperability of multimodal cardiac phenomapping datasets.


Subjects
Biological Ontologies , Cardiology , Heart Failure , Humans , Data Management , Heart Failure/therapy , Palliative Care , Semantics , Stroke Volume , Clinical Studies as Topic
16.
J Med Internet Res ; 25: e45225, 2023 10 20.
Article in English | MEDLINE | ID: mdl-37862061

ABSTRACT

BACKGROUND: The global pandemics of severe acute respiratory syndrome, Middle East respiratory syndrome, and COVID-19 have caused unprecedented crises for public health. Coronaviruses are constantly evolving, and it is unknown which new coronavirus will emerge and when the next coronavirus will sweep across the world. Knowledge graphs are expected to help discover the pathogenicity and transmission mechanism of viruses. OBJECTIVE: The aim of this study was to discover potential targets and candidate drugs to repurpose for coronaviruses through a knowledge graph-based approach. METHODS: We propose a computational and evidence-based knowledge discovery approach to identify potential targets and candidate drugs for coronaviruses from biomedical literature and well-known knowledge bases. To organize the semantic triples extracted automatically from biomedical literature, a semantic conversion model was designed. The literature knowledge was associated and integrated with existing drug and gene knowledge through semantic mapping, and the coronavirus knowledge graph (CovKG) was constructed. We adopted both the knowledge graph embedding model and the semantic reasoning mechanism to discover unrecorded mechanisms of drug action as well as potential targets and drug candidates. Furthermore, we have provided evidence-based support with a scoring and backtracking mechanism. RESULTS: The constructed CovKG contains 17,369,620 triples, of which 641,195 were extracted from biomedical literature, covering 13,065 concept unique identifiers, 209 semantic types, and 97 semantic relations of the Unified Medical Language System. Through multi-source knowledge integration, 475 drugs and 262 targets were mapped to existing knowledge, and 41 new drug mechanisms of action were found by semantic reasoning, which were not recorded in the existing knowledge base. Among the knowledge graph embedding models, TransR outperformed others (mean reciprocal rank=0.2510, Hits@10=0.3505). 
A total of 33 potential targets and 18 drug candidates were identified for coronaviruses. Among them, 7 novel drugs (ie, quinine, nelfinavir, ivermectin, asunaprevir, tylophorine, Artemisia annua extract, and resveratrol) and 3 highly ranked targets (ie, angiotensin converting enzyme 2, transmembrane serine protease 2, and M protein) were further discussed. CONCLUSIONS: We showed the effectiveness of a knowledge graph-based approach in potential target discovery and drug repurposing for coronaviruses. Our approach can be extended to other viruses or diseases for biomedical knowledge discovery and relevant applications.


Subjects
COVID-19 , Drug Repositioning , Humans , Pattern Recognition, Automated , Knowledge Bases , Unified Medical Language System
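The reported figures (mean reciprocal rank = 0.2510, Hits@10 = 0.3505) use standard link-prediction metrics, computed from the rank of the true entity among all candidates for each test triple. A sketch with toy ranks (not the paper's data):

```python
import numpy as np

def mrr_hits(ranks, k=10):
    """Link-prediction metrics from the rank of each true entity.

    ranks: 1-based rank of the correct tail (or head) entity among all
    candidate entities, one entry per test triple."""
    ranks = np.asarray(ranks, dtype=float)
    mrr = np.mean(1.0 / ranks)        # mean reciprocal rank
    hits_k = np.mean(ranks <= k)      # fraction ranked in the top k
    return mrr, hits_k

# Hypothetical ranks of the true entity for six test triples.
ranks = [1, 3, 8, 25, 50, 2]
mrr, hits10 = mrr_hits(ranks)
print(round(mrr, 4), round(hits10, 4))
```

Both metrics are averaged over the test set, so a model such as TransR is preferred when it pushes true entities toward rank 1 consistently, not just occasionally.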
17.
Front Big Data ; 6: 1278153, 2023.
Article in English | MEDLINE | ID: mdl-37841897

ABSTRACT

The knowledge graph is one of the essential infrastructures of artificial intelligence. It is a challenge for knowledge engineering to construct a high-quality domain knowledge graph from multi-source heterogeneous data. We propose a complete process framework for constructing a knowledge graph that combines structured data and unstructured data, which includes data processing, information extraction, knowledge fusion, data storage, and update strategies, aiming to improve the quality of the knowledge graph and extend its life cycle. Specifically, we take the construction process of an enterprise knowledge graph as an example and integrate enterprise register information, litigation-related information, and enterprise announcement information to enrich the enterprise knowledge graph. For the unstructured text, we improve an existing model to extract triples, and the F1-score of our model reaches 72.77%. The numbers of nodes and edges in our constructed enterprise knowledge graph reach 1,430,000 and 3,170,000, respectively. Furthermore, for each type of multi-source heterogeneous data, we apply corresponding methods and strategies for information extraction and data storage and carry out a detailed comparative analysis of graph databases. From the perspective of practical use, the informative enterprise knowledge graph and its timely updates can serve many actual business needs. Our proposed enterprise knowledge graph has been deployed in HuaRong RongTong (Beijing) Technology Co., Ltd. and is used by the staff as a powerful tool for corporate due diligence. The key features are reported and analyzed in the case study. Overall, this paper provides an easy-to-follow solution and practice for domain knowledge graph construction, as well as demonstrating its application in corporate due diligence.

18.
Sensors (Basel) ; 23(17)2023 Aug 23.
Article in English | MEDLINE | ID: mdl-37687804

ABSTRACT

The safety of flight operations depends on the cognitive abilities of pilots. In recent years, there has been growing concern about potential accidents caused by declining pilot mental states. We have developed a novel multimodal approach for mental state detection in pilots using electroencephalography (EEG) signals. Our approach includes an advanced automated preprocessing pipeline to remove artefacts from the EEG data, a feature extraction method based on Riemannian geometry analysis of the cleaned EEG data, and a hybrid ensemble learning technique that combines the results of several machine learning classifiers. The proposed approach improves on existing methods, achieving an accuracy of 86% when tested on cleaned EEG data. The EEG dataset was collected from 18 pilots who participated in flight experiments and was publicly released on NASA's open portal. This study presents a reliable and efficient solution for detecting mental states in pilots and highlights the potential of EEG signals and ensemble learning algorithms for developing cognitive cockpit systems. The automated preprocessing pipeline, the Riemannian-geometry-based feature extraction method, and the hybrid ensemble learning technique set this work apart from previous efforts in the field and demonstrate the innovative nature of the proposed approach.
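Riemannian-geometry feature extraction for EEG typically means treating each trial's channel covariance matrix as a point on the SPD manifold and log-mapping it into a flat tangent space where ordinary classifiers apply. A minimal sketch of that idea, using the identity as the reference point (the paper's pipeline may use the Riemannian mean instead; the trial shapes are invented):

```python
import numpy as np

def spd_logm(C):
    """Matrix logarithm of a symmetric positive-definite matrix
    via eigendecomposition."""
    w, v = np.linalg.eigh(C)
    return v @ np.diag(np.log(w)) @ v.T

def tangent_features(trials, eps=1e-6):
    """Map each EEG trial (channels x samples) to a tangent-space vector:
    regularized covariance -> matrix log -> upper-triangle flattening."""
    feats = []
    for x in trials:
        c = np.cov(x) + eps * np.eye(x.shape[0])  # keep it strictly SPD
        l = spd_logm(c)
        feats.append(l[np.triu_indices_from(l)])  # symmetric: upper half
    return np.array(feats)

rng = np.random.default_rng(0)
trials = rng.standard_normal((4, 8, 256))  # 4 trials, 8 channels, 256 samples
f = tangent_features(trials)
print(f.shape)  # (4, 36): 8*9/2 entries per trial
```

Vectors like these could then feed the kind of ensemble of classifiers the abstract describes.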


Subjects
Algorithms , Artifacts , Cognition , Electroencephalography , Machine Learning
19.
Biostatistics ; 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37494883

ABSTRACT

Radionuclide imaging plays a critical role in the diagnosis and management of kidney obstruction. However, most practicing radiologists in US hospitals have insufficient time and resources to acquire the training and experience needed to interpret radionuclide images, leading to increased diagnostic errors. To tackle this problem, Emory University embarked on a study that aims to develop a computer-assisted diagnostic (CAD) tool for kidney obstruction by mining and analyzing patient data comprising renogram curves, ordinal expert ratings on the obstruction status, pharmacokinetic variables, and demographic information. The major challenges here are the heterogeneity in data modes and the lack of a gold standard for determining kidney obstruction. In this article, we develop a statistically principled CAD tool based on an integrative latent class model that leverages the heterogeneous data modalities available for each patient to provide accurate prediction of kidney obstruction. Our integrative model consists of three sub-models (a multilevel functional latent factor regression model, a probit scalar-on-function regression model, and a Gaussian mixture model), each of which is tailored to a specific data mode and depends on the unknown obstruction status (latent class). An efficient MCMC algorithm is developed to train the model and predict kidney obstruction with associated uncertainty. Extensive simulations are conducted to evaluate the performance of the proposed method. An application to an Emory renal study demonstrates the usefulness of our model as a CAD tool for kidney obstruction.
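One of the three sub-models is a Gaussian mixture over continuous variables, with the mixture component playing the role of the unknown latent class. As a toy stand-in for that idea only, here is a two-component 1-D Gaussian mixture fit by EM on synthetic data; the paper trains its full integrative model jointly by MCMC, not by this simple EM.

```python
import numpy as np

def em_gmm_1d(x, iters=50):
    """Fit a two-component 1-D Gaussian mixture by EM.
    Returns mixing weights, means, and variances."""
    mu = np.array([x.min(), x.max()], float)   # spread initial means
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: posterior responsibility of each component per point.
        d = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = pi * d
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted parameter updates.
        n = r.sum(axis=0)
        pi = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n
    return pi, mu, var

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1, 300)])
pi, mu, var = em_gmm_1d(x)
print(np.round(np.sort(mu), 1))  # close to [0. 5.]
```

In the paper's setting the component indicator is shared with the other two sub-models, which is what makes the latent class integrative rather than fit per modality.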

20.
Med Image Anal ; 89: 102906, 2023 10.
Article in English | MEDLINE | ID: mdl-37499333

ABSTRACT

Automatic vertebral body contour extraction (AVBCE) from heterogeneous spinal MRI is indispensable for the comprehensive diagnosis and treatment of spinal diseases. However, AVBCE is challenging due to data heterogeneity, image characteristics complexity, and vertebral body morphology variations, which may cause morphology errors in semantic segmentation. Deep active contour-based (deep ACM-based) methods provide a promising complement for tackling morphology errors by directly parameterizing the contour coordinates. Extending the target contours' capture range and providing morphology-aware parameter maps are crucial for deep ACM-based methods. For this purpose, we propose a novel Attractive Deep Morphology-aware actIve contouR nEtwork (ADMIRE) that embeds an elaborated contour attraction term (CAT) and a comprehensive contour quality (CCQ) loss into the deep ACM-based framework. The CAT adaptively extends the target contours' capture range through an all-to-all force field that lets the target contours' energy contribute to farther locations. Furthermore, the CCQ loss is carefully designed to generate morphology-aware active contour parameters by simultaneously supervising the contour shape, tension, and smoothness. These designs, in cooperation with the deep ACM-based framework, enable robustness to data heterogeneity, image characteristics complexity, and target contour morphology variations. Furthermore, the deep ACM-based ADMIRE cooperates well with semi-supervised strategies such as mean teacher, which enables its use in semi-supervised scenarios. ADMIRE is trained and evaluated on four challenging datasets, including three spinal datasets with more than 1000 heterogeneous images and more than 10000 vertebral bodies, as well as a cardiac dataset with both normal and pathological cases. Results show ADMIRE achieves state-of-the-art performance on all datasets, demonstrating its accuracy, robustness, and generalization ability.
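Active contour methods of this kind penalize a contour's tension and smoothness via first and second differences of its point sequence. A minimal sketch of those two classical energy terms on a closed contour, shown only to illustrate the kind of quantities a CCQ-style loss supervises; this is not ADMIRE's actual loss:

```python
import numpy as np

def contour_quality_terms(pts):
    """Tension (first-difference) and smoothness (second-difference)
    energies for a closed contour given as an (N, 2) point array."""
    d1 = np.roll(pts, -1, axis=0) - pts                          # neighbors
    d2 = np.roll(pts, -1, axis=0) - 2 * pts + np.roll(pts, 1, axis=0)
    tension = (d1 ** 2).sum(axis=1).mean()
    smoothness = (d2 ** 2).sum(axis=1).mean()
    return tension, smoothness

# A smooth circle versus the same circle with alternating jitter.
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
jagged = circle + 0.05 * (-1) ** np.arange(64)[:, None]

t_c, s_c = contour_quality_terms(circle)
t_j, s_j = contour_quality_terms(jagged)
print(s_j > s_c)  # True: jitter inflates the smoothness energy
```

In a deep ACM framework, differentiable terms like these let the network learn per-location contour parameters rather than hand-tuned global weights.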


Subjects
Image Processing, Computer-Assisted , Vertebral Body , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging