Results 1 - 20 of 82
1.
Article in English | MEDLINE | ID: mdl-38985556

ABSTRACT

Congenital heart disease (CHD) is the most common congenital disability affecting healthy development and growth, even resulting in pregnancy termination or fetal death. Recently, deep learning techniques have made remarkable progress in assisting CHD diagnosis. One very popular method is directly classifying fetal ultrasound images as abnormal or normal, which tends to focus more on global features and neglects semantic knowledge of anatomical structures. The other approach is segmentation-based diagnosis, which requires a large number of pixel-level annotation masks for training. However, detailed pixel-level segmentation annotation is costly or even unavailable. Based on the above analysis, we propose SKGC, a universal framework to identify normal or abnormal four-chamber heart (4CH) images, guided by a few annotation masks, while improving accuracy remarkably. SKGC consists of a semantic-level knowledge extraction module (SKEM), a multi-knowledge fusion module (MFM), and a classification module (CM). SKEM is responsible for obtaining high-level semantic knowledge, serving as an abstract representation of the anatomical structures that obstetricians focus on. MFM is a lightweight but efficient module that fuses semantic-level knowledge with the original specific knowledge in ultrasound images. CM classifies the fused knowledge and can be replaced by any advanced classifier. Moreover, we design a new loss function that enhances the constraint between the foreground and background predictions, improving the quality of the semantic-level knowledge. Experimental results on the collected real-world NA-4CH and the publicly available FEST datasets show that SKGC achieves impressive performance with the best accuracy of 99.68% and 95.40%, respectively. Notably, the accuracy improves from 74.68% to 88.14% using only 10 labeled masks.

2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581415

ABSTRACT

Discovering hit molecules with desired biological activity in a directed manner is a promising but challenging task in computer-aided drug discovery. Inspired by recent generative AI approaches, particularly Diffusion Models (DM), we propose the Graph Latent Diffusion Model (GLDM), a latent DM that preserves both the effectiveness of autoencoders in compressing complex chemical data and the DM's capability to generate novel molecules. Specifically, we first develop an autoencoder to encode the molecular data into low-dimensional latent representations and then train the DM on the latent space to generate molecules inducing targeted biological activity defined by gene expression profiles. Manipulating the DM in the latent space rather than the input space avoids complicated operations to map molecule decomposition and reconstruction to diffusion processes, and thus improves training efficiency. Experiments show that GLDM not only achieves outstanding performance on molecular generation benchmarks, but also generates samples with optimal chemical properties and the potential to induce desired biological activity.


Subjects
Benchmarking , Drug Discovery , Diffusion
3.
IEEE J Transl Eng Health Med ; 12: 371-381, 2024.
Article in English | MEDLINE | ID: mdl-38633564

ABSTRACT

Brain state classification by applying deep learning techniques on neuroimaging data has become a recent topic of research. However, unlike domains where the data is low dimensional or there is a large number of available training samples, neuroimaging data is high dimensional and has few training samples. To tackle these issues, we present a sparse feedforward deep neural architecture for encoding and decoding the structural connectome of the human brain. We use a sparsely connected element-wise multiplication as the first hidden layer and a fixed transform layer as the output layer. The number of trainable parameters and the training time are significantly reduced compared to feedforward networks. We demonstrate superior performance of this architecture in encoding the structural connectome implicated in Alzheimer's disease (AD) and Parkinson's disease (PD) from DTI brain scans. For decoding, we propose a recursive feature elimination (RFE) algorithm based on DeepLIFT, layer-wise relevance propagation (LRP), and Integrated Gradients (IG) algorithms to remove irrelevant features and thereby identify key biomarkers associated with AD and PD. We show that the proposed architecture reduces the number of trainable parameters by 45.1% and 47.1% compared to a feedforward DNN, with an increase in accuracy of 2.6% and 3.1% for cognitively normal (CN) vs AD and CN vs PD classification, respectively. We also show that the proposed RFE method leads to a further increase in accuracy of 2.1% and 4% for CN vs AD and CN vs PD classification, while removing approximately 90% to 95% irrelevant features. Furthermore, we argue that the biomarkers (i.e., key brain regions and connections) identified are consistent with previous literature. We show that relevancy score-based methods can yield high discriminative power and are suitable for brain decoding. We also show that the proposed approach led to a reduction in the number of trainable network parameters, an increase in classification accuracy, and a detection of brain connections and regions that were consistent with earlier studies.
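The recursive feature elimination idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `drop_fraction` parameter, and the fixed toy scores are all assumptions made for illustration. In practice the `relevance` callable would retrain a network on the remaining features and return attribution scores (for example from DeepLIFT, LRP, or Integrated Gradients).

```python
from typing import Callable, List, Sequence


def recursive_feature_elimination(
    features: Sequence[str],
    relevance: Callable[[List[str]], List[float]],
    keep: int,
    drop_fraction: float = 0.5,
) -> List[str]:
    """Repeatedly drop the least relevant features until `keep` remain."""
    remaining = list(features)
    while len(remaining) > keep:
        scores = relevance(remaining)  # one score per remaining feature
        # Drop a fraction per round, at least one, never going below `keep`.
        n_drop = max(1, int(len(remaining) * drop_fraction))
        n_drop = min(n_drop, len(remaining) - keep)
        # Rank feature indices from most to least relevant.
        order = sorted(range(len(remaining)), key=lambda i: scores[i], reverse=True)
        survivors = sorted(order[: len(remaining) - n_drop])  # keep original order
        remaining = [remaining[i] for i in survivors]
    return remaining


# Hypothetical fixed scores standing in for attributions from a trained model.
TOY_SCORES = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.05, "e": 0.7}


def toy_relevance(names: List[str]) -> List[float]:
    return [TOY_SCORES[n] for n in names]


selected = recursive_feature_elimination(list(TOY_SCORES), toy_relevance, keep=2)
print(selected)
```

Passing the scorer as a callable keeps the elimination loop independent of the attribution method, so different relevance algorithms can be swapped in without changing the loop.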


Subjects
Alzheimer Disease , Connectome , Humans , Magnetic Resonance Imaging/methods , Connectome/methods , Neural Networks, Computer , Neuroimaging/methods , Biomarkers
4.
Heliyon ; 9(12): e22412, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38046150

ABSTRACT

A supervised deep learning network like the UNet has performed well in segmenting brain anomalies such as lesions and tumours. However, such methods were proposed to perform on single-modality or multi-modality images. We use the Hybrid UNet Transformer (HUT) to improve performance in single-modality lesion segmentation and multi-modality brain tumour segmentation. The HUT consists of two pipelines running in parallel, one of which is UNet-based and the other is Transformer-based. The Transformer-based pipeline relies on feature maps in the intermediate layers of the UNet decoder during training. The HUT network takes in the available modalities of 3D brain volumes and embeds the brain volumes into voxel patches. The transformers in the system improve global attention and long-range correlation between the voxel patches. In addition, we introduce a self-supervised training approach in the HUT framework to enhance the overall segmentation performance. We demonstrate that HUT performs better than the state-of-the-art network SPiN in single-modality segmentation on the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset by 4.84% in the Dice score and a significant 41% in the Hausdorff Distance score. HUT also performed well on brain scans in the Brain Tumour Segmentation (BraTS20) dataset and demonstrated an improvement over the state-of-the-art network nnUnet by 0.96% in the Dice score and 4.1% in the Hausdorff Distance score.
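For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and a Hausdorff distance can be computed for two segmentations represented as sets of voxel coordinates. It is illustrative only: the function names are assumptions, the brute-force search is quadratic, and published evaluations often use surface voxels or a percentile variant of the Hausdorff distance, which this sketch does not implement.

```python
import math
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]


def dice_score(pred: Set[Voxel], truth: Set[Voxel]) -> float:
    """Dice = 2|A n B| / (|A| + |B|); defined here as 1.0 when both are empty."""
    if not pred and not truth:
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def directed_hausdorff(a: Iterable[Voxel], b: Iterable[Voxel]) -> float:
    """Largest distance from any point in `a` to its nearest point in `b`."""
    b_points = list(b)
    return max(min(math.dist(p, q) for q in b_points) for p in a)


def hausdorff_distance(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy segmentations: two shared voxels, one stray voxel on each side.
pred = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
truth = {(1, 0, 0), (2, 0, 0), (5, 0, 0)}

dice = dice_score(pred, truth)
hd = hausdorff_distance(pred, truth)
print(dice, hd)
```

A higher Dice score means more overlap, while a lower Hausdorff distance means the worst-case boundary error is smaller, which is why the two metrics move in opposite directions when segmentation improves.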

5.
Front Neurosci ; 17: 1298514, 2023.
Article in English | MEDLINE | ID: mdl-38105927

ABSTRACT

A hybrid UNet and Transformer (HUT) network is introduced to combine the merits of the UNet and Transformer architectures, improving brain lesion segmentation from MRI and CT scans. The HUT overcomes the limitations of conventional approaches by utilizing two parallel stages: one based on UNet and the other on Transformers. The Transformer-based stage captures global dependencies and long-range correlations. It uses intermediate feature vectors from the UNet decoder and improves segmentation accuracy by enhancing the attention and relationship modeling between voxel patches derived from the 3D brain volumes. In addition, HUT incorporates self-supervised learning on the transformer network. This allows the transformer network to learn by maintaining consistency between the classification layers of the different resolutions of patches and augmentations. This improves both the rate of convergence of the training and the overall segmentation capability. Experimental results on benchmark datasets, including ATLAS and ISLES2018, demonstrate HUT's advantage over the state-of-the-art methods. HUT achieves higher Dice scores and reduced Hausdorff Distance scores in single-modality and multi-modality lesion segmentation. HUT outperforms the state-of-the-art network SPiN in single-modality MRI segmentation on the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset by 4.84% in the Dice score and a large margin of 40.7% in the Hausdorff Distance score. HUT also performed well on CT perfusion brain scans in the Ischemic Stroke Lesion Segmentation (ISLES2018) dataset and demonstrated an improvement over the recent state-of-the-art network USSLNet by 3.3% in the Dice score and 12.5% in the Hausdorff Distance score. With the analysis of both single- and multi-modality datasets (ATLASR12 and ISLES2018), we show that HUT can perform and generalize well on different datasets. Code is available at: https://github.com/vicsohntu/HUT_CT.

6.
Sci Rep ; 13(1): 21047, 2023 11 29.
Article in English | MEDLINE | ID: mdl-38030699

ABSTRACT

Schizophrenia is a highly heterogeneous disorder and salient functional connectivity (FC) features have been observed to vary across study sites, warranting the need for methods that can differentiate between site-invariant FC biomarkers and site-specific salient FC features. We propose a technique named Semi-supervised learning with data HaRmonisation via Encoder-Decoder-classifier (SHRED) to examine these features from resting state functional magnetic resonance imaging scans gathered from four sites. Our approach involves an encoder-decoder-classifier architecture that simultaneously performs data harmonisation and semi-supervised learning (SSL) to deal with site differences and labelling inconsistencies across sites respectively. The minimisation of reconstruction loss from SSL was shown to improve model performance even within small datasets whilst data harmonisation often led to lower model generalisability, which was unaffected using the SHRED technique. We show that our proposed model produces site-invariant biomarkers, most notably the connection between transverse temporal gyrus and paracentral lobule. Site-specific salient FC features were also elucidated, especially implicating the paracentral lobule for our local dataset. Our examination of these salient FC features demonstrates how site-specific features and site-invariant biomarkers can be differentiated, which can deepen our understanding of the neurobiology of schizophrenia.


Subjects
Schizophrenia , Humans , Brain/pathology , Magnetic Resonance Imaging/methods , Frontal Lobe , Neural Networks, Computer , Brain Mapping/methods
7.
Comput Biol Med ; 164: 107328, 2023 09.
Article in English | MEDLINE | ID: mdl-37573721

ABSTRACT

In recent years, deep learning models have been applied to neuroimaging data for early diagnosis of Alzheimer's disease (AD). Structural magnetic resonance imaging (sMRI) and positron emission tomography (PET) images provide structural and functional information about the brain, respectively. Combining these features leads to better performance than using a single modality alone in building predictive models for AD diagnosis. However, current multi-modal approaches in deep learning, based on sMRI and PET, are mostly limited to convolutional neural networks, which do not facilitate integration of both image and phenotypic information of subjects. We propose to use graph neural networks (GNN) that are designed to deal with problems in non-Euclidean domains. In this study, we demonstrate how brain networks are created from sMRI or PET images and can be used in a population graph framework that combines phenotypic information with imaging features of the brain networks. Then, we present a multi-modal GNN framework where each modality has its own branch of GNN and a technique that combines the multi-modal data at both the level of node vectors and adjacency matrices. Finally, we perform late fusion to combine the preliminary decisions made in each branch and produce a final prediction. As multi-modality data becomes available, multi-source, multi-modal approaches are becoming the trend in AD diagnosis. We conducted explorative experiments based on multi-modal imaging data combined with non-imaging phenotypic information for AD diagnosis and analyzed the impact of phenotypic information on diagnostic performance. Results from experiments demonstrated that our proposed multi-modal approach improves performance for AD diagnosis. Our study also provides a technical reference and supports the need for multivariate multi-modal diagnosis methods.
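The late-fusion step mentioned above, where preliminary decisions from each branch are combined into a final prediction, can be sketched as follows. The abstract does not say how the branch outputs are combined, so the weighted averaging used here is an assumption, as are the function name and the example probabilities.

```python
from typing import List, Optional, Sequence


def late_fusion(
    branch_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the weighted average of per-branch class probabilities."""
    if weights is None:
        weights = [1.0] * len(branch_probs)  # equal trust in every branch
    total = sum(weights)
    n_classes = len(branch_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, branch_probs)) / total
        for c in range(n_classes)
    ]


# Hypothetical outputs of two modality branches for one subject,
# ordered as [P(control), P(AD)].
smri_probs = [0.3, 0.7]
pet_probs = [0.6, 0.4]

fused = late_fusion([smri_probs, pet_probs])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted_class)
```

Averaging probabilities is only one option; majority voting or a small learned combiner over the branch outputs are common alternatives.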


Subjects
Alzheimer Disease , Humans , Alzheimer Disease/diagnostic imaging , Magnetic Resonance Imaging/methods , Neural Networks, Computer , Positron-Emission Tomography/methods , Neuroimaging/methods , Early Diagnosis
8.
IEEE J Biomed Health Inform ; 27(9): 4591-4600, 2023 09.
Article in English | MEDLINE | ID: mdl-37307177

ABSTRACT

With the development of biotechnology, a large amount of multi-omics data has been collected for precision medicine. Multiple sources of graph-based prior biological knowledge about omics data exist, such as gene-gene interaction networks. Recently, there has been an increasing interest in introducing graph neural networks (GNNs) into multi-omics learning. However, existing methods have not fully exploited these graphical priors since none have been able to integrate knowledge from multiple sources simultaneously. To solve this problem, we propose a multi-omics data analysis framework that incorporates multiple prior knowledge into a graph neural network (MPK-GNN). To the best of our knowledge, this is the first attempt to introduce multiple prior graphs into multi-omics data analysis. Specifically, the proposed method contains four parts: (1) a feature-level learning module to aggregate information from prior graphs; (2) a projection module to maximize the agreement among prior networks by optimizing a contrastive loss; (3) a sample-level module to learn a global representation from input multi-omics features; (4) a task-specific module to flexibly extend MPK-GNN for various downstream multi-omics analysis tasks. Finally, we verify the effectiveness of the proposed multi-omics learning algorithm on the cancer molecular subtype classification task. Experimental results show that MPK-GNN outperforms other state-of-the-art algorithms, including multi-view learning methods and multi-omics integrative approaches.


Subjects
Multiomics , Neural Networks, Computer , Humans , Algorithms , Biotechnology , Data Analysis
9.
BMC Bioinformatics ; 22(Suppl 10): 632, 2022 Nov 28.
Article in English | MEDLINE | ID: mdl-36443676

ABSTRACT

BACKGROUND: Cancers are genetically heterogeneous, so anticancer drugs show varying degrees of effectiveness on patients due to their differing genetic profiles. Knowing patients' responses to numerous cancer drugs is needed for personalized cancer treatment. By using molecular profiles of cancer cell lines available from the Cancer Cell Line Encyclopedia (CCLE) and anticancer drug responses available in the Genomics of Drug Sensitivity in Cancer (GDSC), we build computational models to predict anticancer drug responses from molecular features. RESULTS: We propose a novel deep neural network model that integrates multi-omics data available as gene expressions, copy number variations, gene mutations, reverse phase protein array expressions, and metabolomics expressions, in order to predict cellular responses to known anti-cancer drugs. We employ a novel graph embedding layer that incorporates interactome data as prior information for prediction. Moreover, we propose a novel attention layer that effectively combines different omics features, taking their interactions into account. The network outperformed feedforward neural networks and reported 0.90 for [Formula: see text] values for prediction of drug responses from cancer cell lines data available in CCLE and GDSC. CONCLUSION: The outstanding results of our experiments demonstrate that the proposed method is capable of capturing the interactions of genes and proteins, and integrating multi-omics features effectively. Furthermore, both the results of ablation studies and the investigations of the attention layer imply that gene mutation has a greater influence on the prediction of drug responses than other omics data types. Therefore, we conclude that our approach not only predicts the anti-cancer drug response precisely but also provides insights into reaction mechanisms of cancer cell lines and drugs.


Subjects
Deep Learning , Neoplasms , Humans , DNA Copy Number Variations , Neoplasms/drug therapy , Neoplasms/genetics , Mutation , Genomics
10.
Sci Rep ; 12(1): 15425, 2022 09 14.
Article in English | MEDLINE | ID: mdl-36104347

ABSTRACT

Multi-omics data are increasingly being gathered for investigations of complex diseases such as cancer. However, high dimensionality, small sample size, and heterogeneity of different omics types pose huge challenges to integrated analysis. In this paper, we evaluate two network-based approaches for integration of multi-omics data in an application of clinical outcome prediction of neuroblastoma. We derive Patient Similarity Networks (PSN) as the first step for individual omics data by computing distances among patients from omics features. The fusion of different omics can be investigated in two ways: the network-level fusion is achieved using the Similarity Network Fusion algorithm for fusing the PSNs derived for individual omics types; and the feature-level fusion is achieved by fusing the network features obtained from individual PSNs. We demonstrate our methods on two high-risk neuroblastoma datasets from the SEQC and TARGET projects. We propose Deep Neural Network and Machine Learning methods with Recursive Feature Elimination as the predictor of survival status of neuroblastoma patients. Our results indicate that network-level fusion outperformed feature-level fusion for integration of different omics data, whereas feature-level fusion is more suitable for incorporating different feature types derived from the same omics type. We conclude that the network-based methods are capable of handling heterogeneity and high dimensionality well in the integration of multi-omics.


Subjects
Neuroblastoma , Algorithms , Humans , Machine Learning , Neural Networks, Computer , Neuroblastoma/genetics , Prognosis
11.
Bioinformatics ; 38(Suppl_2): ii113-ii119, 2022 09 16.
Article in English | MEDLINE | ID: mdl-36124784

ABSTRACT

MOTIVATION: While it has been well established that drugs affect and help patients differently, personalized drug response predictions remain challenging. Solutions based on single omics measurements have been proposed, and networks provide means to incorporate molecular interactions into reasoning. However, how to integrate the wealth of information contained in multiple omics layers still poses a complex problem. RESULTS: We present DrDimont, Drug response prediction from Differential analysis of multi-omics networks. It allows for comparative conclusions between two conditions and translates them into differential drug response predictions. DrDimont focuses on molecular interactions. It establishes condition-specific networks from correlation within an omics layer that are then reduced and combined into heterogeneous, multi-omics molecular networks. A novel semi-local, path-based integration step ensures integrative conclusions. Differential predictions are derived from comparing the condition-specific integrated networks. DrDimont's predictions are explainable, i.e. molecular differences that are the source of high differential drug scores can be retrieved. We predict differential drug response in breast cancer using transcriptomics, proteomics, phosphosite and metabolomics measurements and contrast estrogen receptor positive and receptor negative patients. DrDimont performs better than drug prediction based on differential protein expression or PageRank when evaluating it on ground truth data from cancer cell lines. We find proteomic and phosphosite layers to carry most information for distinguishing drug response. AVAILABILITY AND IMPLEMENTATION: DrDimont is available on CRAN: https://cran.r-project.org/package=DrDimont. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Breast Neoplasms , Software , Breast Neoplasms/drug therapy , Female , Humans , Proteomics , Receptors, Estrogen , Transcriptome
12.
Front Neurosci ; 16: 866666, 2022.
Article in English | MEDLINE | ID: mdl-35677355

ABSTRACT

Both neuroimaging and genomics datasets are often gathered for the detection of neurodegenerative diseases. Huge dimensionalities of neuroimaging data as well as omics data pose a tremendous challenge for methods integrating multiple modalities. There are few existing solutions that can combine both multi-modal imaging and multi-omics datasets to derive neurological insights. We propose a deep neural network architecture that combines both structural and functional connectome data with multi-omics data for disease classification. A graph convolution layer is used to model functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI) data simultaneously to learn compact representations of the connectome. A separate set of graph convolution layers is then used to model multi-omics datasets, expressed in the form of population graphs, and combine them with latent representations of the connectome. An attention mechanism is used to fuse these outputs and provide insights on which omics data contributed most to the model's classification decision. We demonstrate our methods for Parkinson's disease (PD) classification by using datasets from the Parkinson's Progression Markers Initiative (PPMI). PD has been shown to be associated with changes in the human connectome and it is also known to be influenced by genetic factors. We combine DTI and fMRI data with multi-omics data from RNA Expression, Single Nucleotide Polymorphism (SNP), DNA Methylation and non-coding RNA experiments. A Matthews Correlation Coefficient of greater than 0.8 over many combinations of multi-modal imaging data and multi-omics data was achieved with our proposed architecture. To address the paucity of paired multi-modal imaging data and the problem of imbalanced data in the PPMI dataset, we compared the use of oversampling against using CycleGAN on structural and functional connectomes to generate missing imaging modalities. Furthermore, we performed ablation studies that offer insights into the importance of each imaging and omics modality for the prediction of PD. Analysis of the generated attention matrices revealed that DNA Methylation and SNP data were the most important omics modalities out of all the omics datasets considered. Our work motivates further research into imaging genetics and the creation of more multi-modal imaging and multi-omics datasets to study PD and other complex neurodegenerative diseases.
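The Matthews Correlation Coefficient reported above is useful for imbalanced datasets because it uses all four cells of the confusion matrix. A minimal sketch of the standard binary formula is given below; the function name and the example counts are assumptions chosen for illustration and are not results from the study.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when any marginal is empty, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only.
perfect = mcc_from_counts(tp=5, tn=5, fp=0, fn=0)
inverted = mcc_from_counts(tp=0, tn=0, fp=5, fn=5)
mixed = mcc_from_counts(tp=45, tn=40, fp=5, fn=10)
print(perfect, inverted, round(mixed, 3))
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right).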

13.
Hum Brain Mapp ; 43(9): 2801-2816, 2022 06 15.
Article in English | MEDLINE | ID: mdl-35224817

ABSTRACT

Functional magnetic resonance imaging (fMRI) is used to capture complex and dynamic interactions between brain regions while performing tasks. Task related alterations in the brain have been classified as task specific and task general, depending on whether they are particular to a task or common across multiple tasks. Using recent attempts in interpreting deep learning models, we propose an approach to determine both task specific and task general architectures of the functional brain. We demonstrate our methods with a reference-based decoder on deep learning classifiers trained on 12,500 rest and task fMRI samples from the Human Connectome Project (HCP). The decoded task general and task specific motor and language architectures were validated with findings from previous studies. We found that unlike intersubject variability that is characteristic of functional pathology of neurological diseases, a small set of connections are sufficient to delineate the rest and task states. The nodes and connections in the task general architecture could serve as potential disease biomarkers as alterations in task general brain modulations are known to be implicated in several neuropsychiatric disorders.


Subjects
Connectome , Brain/diagnostic imaging , Connectome/methods , Humans , Language , Magnetic Resonance Imaging/methods , Nerve Net , Rest
14.
BMC Bioinformatics ; 21(Suppl 16): 560, 2020 Dec 16.
Article in English | MEDLINE | ID: mdl-33323115

ABSTRACT

BACKGROUND: Protein-protein interaction (PPI) prediction is an important task towards the understanding of many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. However, many previous PPI prediction studies do not consider missing and spurious interactions inherent in PPI networks. To address these two issues, we define two corresponding tasks, namely missing PPI prediction and spurious PPI prediction, and propose a method that employs graph embeddings that learn vector representations from constructed Gene Ontology Annotation (GOA) graphs and then uses the embedded vectors to achieve the two tasks. Our method leverages information from both term-term relations among GO terms and term-protein annotations between GO terms and proteins, and preserves properties of both local and global structural information of the GO annotation graph. RESULTS: We compare our method with methods that are based on information content (IC) and one method that is based on word embeddings, with experiments on three PPI datasets from the STRING database. Experimental results demonstrate that our method is more effective than the compared methods. CONCLUSION: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GOA graphs for our defined missing and spurious PPI tasks.


Subjects
Gene Ontology , Molecular Sequence Annotation , Protein Interaction Mapping/methods , Animals , Area Under Curve , Computational Biology/methods , Humans , Mice , ROC Curve , Saccharomyces cerevisiae/genetics , Task Performance and Analysis
15.
Sci Rep ; 10(1): 7590, 2020 05 05.
Article in English | MEDLINE | ID: mdl-32371990

ABSTRACT

Specialized processing in the brain is performed by multiple groups of brain regions organized as functional modules. Although in vivo studies of brain functional modules involve multiple functional Magnetic Resonance Imaging (fMRI) scans, the methods used to derive functional modules from functional networks of the brain ignore individual differences in the functional architecture and use incomplete functional connectivity information. To correct this, we propose an Iterative Consensus Spectral Clustering (ICSC) algorithm that detects the most representative modules from individual dense weighted connectivity matrices derived from multiple scans. The ICSC algorithm derives group-level modules from modules of multiple individuals by iteratively minimizing the consensus-cost between the two. We demonstrate that the ICSC algorithm can be used to derive biologically plausible group-level (for multiple subjects) and subject-level (for multiple subject scans) brain modules, using resting-state fMRI scans of 589 subjects from the Human Connectome Project. We employed a multipronged strategy to show the validity of the modularizations obtained from the ICSC algorithm. We show a heterogeneous variability in the modular structure across subjects where modules involved in visual and motor processing were highly stable across subjects. Conversely, we found a lower variability across scans of the same subject. The performance of our algorithm was compared with existing functional brain modularization methods and we show that our method detects group-level modules that are more representative of the modules of multiple individuals. Finally, the experiments on synthetic images quantitatively demonstrate that the ICSC algorithm detects group-level and subject-level modules accurately under varied conditions. Therefore, besides identifying functional modules for a population of subjects, the proposed method can be used for applications in personalized neuroscience. The ICSC implementation is available at https://github.com/SCSE-Biomedical-Computing-Group/ICSC.

16.
Neuroimage Clin ; 25: 102186, 2020.
Article in English | MEDLINE | ID: mdl-32000101

ABSTRACT

Functional modules in the human brain support its drive for specialization whereas brain hubs act as focal points for information integration. Brain hubs are brain regions that have a large number of both within and between module connections. We argue that weak connections in brain functional networks lead to misclassification of brain regions as hubs. In order to resolve this, we propose a new measure called ambivert degree that considers the node's degree as well as connection weights in order to identify nodes with both high degree and high connection weights as hubs. Using resting-state functional MRI scans from the Human Connectome Project, we show that ambivert degree identifies brain hubs that are not only crucial but also invariable across subjects. We hypothesize that nodal measures based on ambivert degree can be effectively used to classify patients from healthy controls for diseases that are known to have widespread hub disruption. Using patient data for Alzheimer's Disease and Autism Spectrum Disorder, we show that the hubs in the patient and healthy groups are very different for both the diseases and deep feedforward neural networks trained on nodal hub features lead to a significantly higher classification accuracy with significantly fewer trainable weights compared to using functional connectivity features. Thus, the ambivert degree improves identification of crucial brain hubs in healthy subjects and can be used as a diagnostic feature to detect neurological diseases characterized by hub disruption.


Subjects
Alzheimer Disease/diagnostic imaging , Alzheimer Disease/physiopathology , Autism Spectrum Disorder/diagnostic imaging , Autism Spectrum Disorder/physiopathology , Cerebral Cortex/diagnostic imaging , Connectome/methods , Deep Learning , Nerve Net/diagnostic imaging , Adolescent , Adult , Aged , Cerebral Cortex/physiopathology , Child , Humans , Magnetic Resonance Imaging , Nerve Net/physiopathology , Young Adult
17.
BMC Genomics ; 20(Suppl 9): 918, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874639

ABSTRACT

BACKGROUND: Semantic similarity between Gene Ontology (GO) terms is a fundamental measure for many bioinformatics applications, such as determining functional similarity between genes or proteins. Most previous research exploited information content to estimate the semantic similarity between GO terms; recently some research exploited word embeddings to learn vector representations for GO terms from a large-scale corpus. In this paper, we propose a novel method, named GO2Vec, that exploits graph embeddings to learn vector representations for GO terms from the GO graph. GO2Vec combines the information from both the GO graph and GO annotations, and its learned vectors can be applied to a variety of bioinformatics applications, such as calculating functional similarity between proteins and predicting protein-protein interactions. RESULTS: We conducted two kinds of experiments to evaluate the quality of GO2Vec: (1) functional similarity between proteins on the Collaborative Evaluation of GO-based Semantic Similarity Measures (CESSM) dataset and (2) prediction of protein-protein interactions on the Yeast and Human datasets from the STRING database. Experimental results demonstrate the effectiveness of GO2Vec over the information content-based measures and the word embedding-based measures. CONCLUSION: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GO and GOA graphs. Our results also demonstrate that GO annotations provide useful information for computing the similarity between GO terms and between proteins.


Subjects
Gene Ontology , Protein Interaction Mapping/methods , Humans , Saccharomyces cerevisiae Proteins/metabolism
18.
BMC Genomics ; 20(Suppl 9): 901, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874644

ABSTRACT

BACKGROUND: Module detection algorithms relying on modularity maximization suffer from an inherent resolution limit that hinders detection of small topological modules, especially in molecular networks where most biological processes are believed to form small and compact communities. We propose a novel modular refinement approach that helps find functionally significant modules in molecular networks. RESULTS: The module refinement algorithm improves the quality of topological modules in protein-protein interaction networks by finding functionally significant modules. The algorithm rests on the observation that functional modules in biology do not necessarily coincide with the modules obtained at maximum modularity. Larger modules corresponding to maximal modularity are incrementally re-modularized under specific constraints so that smaller yet topologically and biologically valid modules are recovered. We show improvement in quality and functional coverage of modules using experiments on synthetic and real protein-protein interaction networks. We also compare our results with six existing methods available for clustering biological networks. CONCLUSION: The proposed algorithm finds smaller but functionally relevant modules that are undetected by classical quality maximization approaches for modular detection. The refinement procedure helps to detect more functionally enriched modules in protein-protein interaction networks, which are also more coherent with functionally characterised gene sets.
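
The resolution limit mentioned in this abstract can be illustrated with a small, self-contained sketch. It is not the authors' refinement algorithm: it only computes Newman-Girvan modularity on a toy ring of triangles (a standard textbook construction, assumed here for illustration) and compares two partitions. The function names `ring_of_cliques` and `modularity` are hypothetical helpers introduced for this example.

```python
from itertools import combinations


def ring_of_cliques(n_cliques, clique_size):
    """Return an edge list: complete cliques joined in a ring by single edges."""
    edges = []
    for c in range(n_cliques):
        first = c * clique_size
        nodes = range(first, first + clique_size)
        edges.extend(combinations(nodes, 2))  # all edges inside the clique
        # One bridge from the last node of this clique to the first node of the next.
        next_first = ((c + 1) % n_cliques) * clique_size
        edges.append((first + clique_size - 1, next_first))
    return edges


def modularity(edges, module_of):
    """Newman-Girvan modularity: Q = sum_c [ l_c / L - (d_c / (2L))^2 ]."""
    total = len(edges)
    internal = {}  # l_c: number of edges with both endpoints in module c
    degree = {}    # d_c: sum of node degrees in module c
    for u, v in edges:
        cu, cv = module_of[u], module_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / total - (d / (2 * total)) ** 2
        for c, d in degree.items()
    )


n_cliques, clique_size = 30, 3
edges = ring_of_cliques(n_cliques, clique_size)
n_nodes = n_cliques * clique_size

# Partition A: every triangle is its own module (the "natural" small modules).
one_per_clique = {v: v // clique_size for v in range(n_nodes)}
# Partition B: adjacent triangles are merged pairwise into larger modules.
merged_pairs = {v: v // (2 * clique_size) for v in range(n_nodes)}

q_small = modularity(edges, one_per_clique)
q_merged = modularity(edges, merged_pairs)
print(f"Q (one module per triangle): {q_small:.4f}")
print(f"Q (triangles merged in pairs): {q_merged:.4f}")
```

In this toy network the merged partition scores a higher modularity than the partition that keeps each triangle separate, so a pure modularity maximizer would prefer the larger modules. That is the kind of situation in which re-modularizing large modules, as the abstract describes, could recover the smaller ones.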


Subjects
Algorithms , Protein Interaction Mapping/methods , Cluster Analysis , Humans
19.
BMC Med Genomics ; 12(Suppl 8): 178, 2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31856829

ABSTRACT

BACKGROUND: The availability of high-throughput omics datasets from large patient cohorts has allowed the development of methods that aim at predicting patient clinical outcomes, such as survival and disease recurrence. Such methods are also important to better understand the biological mechanisms underlying disease etiology and development, as well as treatment responses. Recently, different predictive models, relying on distinct algorithms (including Support Vector Machines and Random Forests), have been investigated. In this context, deep learning strategies are of special interest due to their demonstrated superior performance over a wide range of problems and datasets. One of the main challenges of such strategies is the "small n large p" problem. Indeed, omics datasets typically consist of small numbers of samples and large numbers of features relative to typical deep learning datasets. Neural networks usually tackle this problem through feature selection or by including additional constraints during the learning process. METHODS: We propose to tackle this problem with a novel strategy that relies on a graph-based method for feature extraction, coupled with a deep neural network for clinical outcome prediction. The omics data are first represented as graphs whose nodes represent patients, and edges represent correlations between the patients' omics profiles. Topological features, such as centralities, are then extracted from these graphs for every node. Lastly, these features are used as input to train and test various classifiers. RESULTS: We apply this strategy to four neuroblastoma datasets and observe that models based on neural networks are more accurate than state-of-the-art models (DNN: 85%-87%, SVM/RF: 75%-82%). We explore how different parameters and configurations are selected in order to overcome the effects of the small data problem as well as the curse of dimensionality. CONCLUSIONS: Our results indicate that the deep neural networks capture complex features in the data that help predict patient clinical outcomes.
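
The feature-extraction step described in METHODS (patients as nodes, correlations as edges, per-node topological features) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the authors' pipeline: the correlation threshold of 0.5, the use of absolute Pearson correlation, and the two features chosen (degree centrality and normalized strength) are all assumptions, and `patient_graph_features` is a hypothetical helper name. The toy data are synthetic.

```python
import numpy as np


def patient_graph_features(profiles, threshold=0.5):
    """Turn an (n_patients, n_features) omics matrix into per-patient graph features.

    Returns an (n_patients, 2) array: [degree centrality, normalized strength].
    The threshold and the choice of features are illustrative assumptions.
    """
    n = profiles.shape[0]
    corr = np.corrcoef(profiles)      # patient-by-patient Pearson correlation
    np.fill_diagonal(corr, 0.0)       # no self-loops
    weights = np.abs(corr)
    adjacency = weights >= threshold  # keep only strong correlations as edges
    degree = adjacency.sum(axis=1) / (n - 1)
    strength = (weights * adjacency).sum(axis=1) / (n - 1)
    return np.column_stack([degree, strength])


# Synthetic "small n, large p" data: 6 patients, 200 features, two latent groups.
rng = np.random.default_rng(0)
pattern_a = rng.normal(size=200)
pattern_b = rng.normal(size=200)
profiles = np.vstack(
    [pattern_a + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [pattern_b + 0.1 * rng.normal(size=200) for _ in range(3)]
)

features = patient_graph_features(profiles)
print(features.shape)  # 200 omics features per patient reduced to 2 graph features
```

The resulting low-dimensional feature matrix is what would then be passed to a classifier; the abstract names deep neural networks, SVMs, and Random Forests, none of which are reproduced here.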


Subjects
Computational Biology/methods , Deep Learning , Neuroblastoma/diagnosis , Gene Expression Profiling , Humans , Neuroblastoma/genetics , Prognosis
20.
F1000Res ; 8: 465, 2019.
Article in English | MEDLINE | ID: mdl-31559017

ABSTRACT

Background: Biological entities such as genes, promoters, mRNA, metabolites or proteins do not act alone, but in concert in their network context. Modules, i.e., groups of nodes with similar topological properties in these networks, characterize important biological functions of the underlying biomolecular system. Edges in such molecular networks represent regulatory and physical interactions, and comparing them between conditions provides valuable information on differential molecular mechanisms. However, biological data is inherently noisy, and network reduction techniques can propagate errors, particularly to the level of edges. We aim to improve the analysis of networks of biological molecules by deriving modules together with edge relevance estimations that are based on global network characteristics. Methods: We propose to fit the networks to stochastic block models (SBM), a method that has not yet been investigated for the analysis of biomolecular networks. This procedure both delivers modules of the networks and enables the derivation of edge confidence scores. We apply it to correlation-based networks of breast cancer data originating from high-throughput measurements of diverse molecular layers such as transcriptomics, proteomics, and metabolomics. The networks were reduced by thresholding for correlation significance or by requirements on scale-freeness. Results and discussion: We find that the networks are best represented by the hierarchical version of the SBM, and many of the predicted blocks have a biological meaning according to functional annotation. The edge confidence scores are overall in concordance with the biological evidence given by the measurements. As they are based on global network connectivity characteristics and potential hierarchies within the biomolecular networks are taken into account, they could be used as additional, integrated features in network-based data comparisons. Their tight relationship to edge existence probabilities can be exploited to predict missing or spurious edges in order to improve the network representation of the underlying biological system.
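
The link between block structure and edge existence probabilities can be sketched as follows. This is a simplified illustration, not the hierarchical SBM inference used in the study: it assumes the block assignment is already known and uses the maximum-likelihood estimate of a plain Bernoulli SBM (observed edges divided by possible node pairs for each pair of blocks). The helper name `block_edge_probabilities` and the six-node toy network are hypothetical.

```python
import numpy as np


def block_edge_probabilities(adjacency, blocks):
    """Estimate edge probabilities between blocks for a fixed partition.

    adjacency: symmetric 0/1 matrix with a zero diagonal (undirected, no self-loops).
    blocks: integer block label per node, labels 0..k-1.
    Returns a (k, k) matrix of observed edges / possible node pairs.
    """
    blocks = np.asarray(blocks)
    k = int(blocks.max()) + 1
    membership = np.eye(k)[blocks]  # one-hot (n, k) membership matrix
    sizes = membership.sum(axis=0)
    # Off-diagonal entries count edges between two blocks once;
    # diagonal entries count each within-block edge twice.
    edge_counts = membership.T @ adjacency @ membership
    pair_counts = np.outer(sizes, sizes)
    # Within a block, use ordered pairs n_r * (n_r - 1) to match the doubled count.
    pair_counts[np.diag_indices(k)] = sizes * (sizes - 1)
    return np.divide(
        edge_counts, pair_counts,
        out=np.zeros_like(edge_counts), where=pair_counts > 0,
    )


# Toy network: block 0 is a full triangle, block 1 lacks edge (3, 5),
# and a single edge (2, 3) crosses between the blocks.
edge_list = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5), (2, 3)]
adjacency = np.zeros((6, 6))
for u, v in edge_list:
    adjacency[u, v] = adjacency[v, u] = 1.0
blocks = np.array([0, 0, 0, 1, 1, 1])

probs = block_edge_probabilities(adjacency, blocks)
# Per-node-pair confidence: the probability assigned to the pair's two blocks.
confidence = probs[blocks[:, None], blocks[None, :]]
print(probs)
print("observed edge (2, 3):", confidence[2, 3])
print("absent edge (3, 5):", confidence[3, 5])
```

Under this toy model the observed cross-block edge receives a low score and the absent within-block edge a comparatively high one, which is the sense in which such scores could flag candidate spurious or missing edges.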


Subjects
Computational Biology , Proteomics , Metabolomics , Proteins