Results 1 - 12 of 12
1.
BMC Bioinformatics ; 23(Suppl 2): 154, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36510125

ABSTRACT

BACKGROUND: Cis-regulatory regions (CRRs) are non-coding regions of the DNA that finely control the spatio-temporal pattern of transcription; they are involved in a wide range of pivotal processes such as the development of specific cell-lines/tissues and the dynamic cell response to physiological stimuli. Recent studies showed that genetic variants occurring in CRRs are strongly correlated with pathogenicity or deleteriousness. Considering the central role of CRRs in the regulation of physiological and pathological conditions, the correct identification of CRRs and of their tissue-specific activity status through Machine Learning methods plays a major role in dissecting the impact of genetic variants on human diseases. Unfortunately, the problem is still open, though some promising results have already been reported by (deep) machine-learning based methods that predict active promoters and enhancers in specific tissues or cell lines by encoding epigenetic or spectral features directly extracted from DNA sequences. RESULTS: We present the experiments we performed to compare two Deep Neural Networks, a Feed-Forward Neural Network model working on epigenomic features, and a Convolutional Neural Network model working only on genomic sequence, targeted to the identification of enhancer- and promoter-activity in specific cell lines. While performing experiments to understand how the experimental setup influences the prediction performance of the methods, we particularly focused on (1) automatic model selection performed by Bayesian optimization and (2) exploring different data rebalancing setups for reducing negative unbalancing effects.
CONCLUSIONS: Results show that (1) automatic model selection by Bayesian optimization improves the quality of the learner; (2) data rebalancing considerably impacts the prediction performance of the models; test set rebalancing may provide over-optimistic results, and should therefore be cautiously applied; (3) despite working on sequence data, convolutional models obtain performance close to that of feed-forward models working on epigenomic information, which suggests that sequence data also carries informative content for CRR-activity prediction. We therefore suggest combining both models/data types in future works.
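
The warning about test set rebalancing can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the class sizes and the true/false positive rates are illustrative assumptions, and `precision` is a hypothetical helper. It shows that the same classifier reports a much higher precision once negatives are removed from the test set.

```python
def precision(n_pos, n_neg, tpr, fpr):
    """Expected precision of a classifier with fixed true/false positive rates."""
    true_pos = tpr * n_pos    # positives correctly flagged
    false_pos = fpr * n_neg   # negatives wrongly flagged
    return true_pos / (true_pos + false_pos)


# Assumed, illustrative numbers: 100 active regions among 10,000 inactive ones.
original = precision(n_pos=100, n_neg=10_000, tpr=0.8, fpr=0.1)

# Same classifier, but the test set was rebalanced to a 1:1 ratio.
rebalanced = precision(n_pos=100, n_neg=100, tpr=0.8, fpr=0.1)

print(f"imbalanced test set: {original:.3f}")    # about 0.074
print(f"rebalanced test set: {rebalanced:.3f}")  # about 0.889
```

Under these assumptions the classifier is unchanged; only the test distribution differs, which is why metrics measured on a rebalanced test set may not reflect performance at the real class ratio.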


Subjects
Deep Learning , Humans , Bayes Theorem , Regulatory Sequences, Nucleic Acid , Neural Networks, Computer , Machine Learning
2.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35679533

ABSTRACT

Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.
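
As a rough illustration of how a PSN can be built from several data views, the sketch below computes a cosine similarity per view, fuses the views by simple averaging, and keeps edges above a threshold. All of these choices (cosine similarity, averaging, the threshold value, the toy patients, and the function names) are assumptions made for the example; the review covers many other similarity measures and fusion strategies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(view):
    """Pairwise similarities for one view: {patient_id: feature_vector}."""
    ids = sorted(view)
    return {(p, q): cosine(view[p], view[q]) for p in ids for q in ids if p < q}


def build_psn(views, threshold=0.5):
    """Fuse per-view similarities by averaging, then keep edges >= threshold."""
    matrices = [similarity_matrix(v) for v in views]
    fused = {pair: sum(m[pair] for m in matrices) / len(matrices)
             for pair in matrices[0]}
    return {pair: w for pair, w in fused.items() if w >= threshold}


# Toy data (hypothetical): two views of the same three patients.
omics = {"p1": [1.0, 0.0], "p2": [1.0, 0.1], "p3": [0.0, 1.0]}
clinical = {"p1": [1.0, 1.0], "p2": [1.0, 0.9], "p3": [-1.0, 1.0]}

psn = build_psn([omics, clinical])
print(psn)  # only the p1-p2 edge survives the threshold
```

The result is a weighted edge list that a graph library or a transductive learner could consume; averaging is the simplest fusion rule and treats all views as equally informative.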


Subjects
Algorithms , Machine Learning
3.
PLoS One ; 17(1): e0263183, 2022.
Article in English | MEDLINE | ID: mdl-35085372

ABSTRACT

The focus of this study is to design an automated image processing pipeline for handling uncontrolled acquisition conditions of images acquired in the field. The pipeline has been tested on the automated identification and count of uncapped brood cells in honeybee (Apis mellifera) comb images to reduce the workload of beekeepers during the study of the hygienic behavior of honeybee colonies. The images used to develop and test the model were acquired by beekeepers on different days and hours in summer 2020 and under uncontrolled conditions. This resulted in images differing in background noise, illumination, color, comb tilts, scaling, and comb sizes. All 127 available images were manually cropped to approximately include the comb area. To obtain an unbiased evaluation, the cropped images were randomly split into a training image set (50 images), which was used to develop and tune the proposed model, and a test image set (77 images), which was solely used to test the model. To reduce the effects of varied illuminations or exposures, three image enhancement algorithms were tested and compared, followed by the Hough Transform, which allowed individual cells to be identified and automatically counted. All the algorithm parameters were automatically chosen on the training set by grid search. When applied to the 77 test images the model obtained a correlation of 0.819 between the automated counts and the experts' counts. To assess our model on publicly available images acquired with different equipment and under different acquisition conditions, we randomly extracted 100 images from a comb image dataset made available by a recent literature work. Although this dataset was acquired under controlled exposure, its images have varied illuminations; nevertheless, our pipeline obtains a correlation between automatic and manual counts equal to 0.997.
In conclusion, our tests on the automatic count of uncapped honey bee comb cells in images acquired in the field and in images extracted from a publicly available dataset suggest that the proposed pipeline successfully handles varied noise artifacts, illumination, and exposure conditions, thereby allowing our method to generalize to different acquisition settings. Results further improve when the acquisition conditions are controlled.


Subjects
Bees/physiology , Behavior, Animal/physiology , Hygiene , Image Processing, Computer-Assisted/methods , Algorithms , Animals , Image Enhancement/methods , Seasons
4.
J Imaging ; 7(12)2021 Dec 01.
Article in English | MEDLINE | ID: mdl-34940725

ABSTRACT

The aim of this retrospective study is to assess any association between abdominal CT findings and the radiological stage of COVID-19 pneumonia, pulmonary embolism and patient outcomes. We included 158 adult hospitalized COVID-19 patients between 1 March 2020 and 1 March 2021 who underwent 206 abdominal CTs. Two radiologists reviewed all CT images. Pathological findings were classified as acute or not. A subset of patients with inflammatory pathology in ACE2 organs (bowel, biliary tract, pancreas, urinary system) was identified. The radiological stage of COVID pneumonia, pulmonary embolism, overall days of hospitalization, ICU admission and outcome were registered. Univariate statistical analysis coupled with explainable artificial intelligence (AI) techniques was used to discover associations between variables. The most frequent acute findings were bowel abnormalities (n = 58), abdominal fluid (n = 42), hematomas (n = 28) and acute urologic conditions (n = 8). According to univariate statistical analysis, pneumonia stage > 2 was significantly associated with increased frequency of hematomas, active bleeding and fluid-filled colon. The presence of at least one hepatobiliary finding was associated with all the COVID-19 stages > 0. Free abdominal fluid, acute pathologies in ACE2 organs and fluid-filled colon were associated with ICU admission; free fluid was also associated with poor patient outcomes. Hematomas and active bleeding were associated with at least a progressive stage of COVID pneumonia. The explainable AI techniques found no strong relationship between variables.

5.
Sci Rep ; 11(1): 22587, 2021 11 19.
Article in English | MEDLINE | ID: mdl-34799624

ABSTRACT

Concrete conceptual knowledge is supported by a distributed neural network representing different semantic features according to the neuroanatomy of sensory and motor systems. If and how this framework applies to abstract knowledge is currently debated. Here we investigated the specific brain correlates of different abstract categories. After a systematic a priori selection of brain regions involved in semantic cognition, i.e. responsible for, respectively, semantic representations and cognitive control, we used an fMRI-adaptation paradigm with a passive reading task, in order to modulate the neural response to abstract (emotions, cognitions, attitudes, human actions) and concrete (biological entities, artefacts) categories. Different portions of the left anterior temporal lobe responded selectively to abstract and concrete concepts. Emotions and attitudes adapted the left middle temporal gyrus, whereas concrete items adapted the left fusiform gyrus. Our results suggest that, similarly to concrete concepts, some categories of abstract knowledge have specific brain correlates corresponding to the prevalent semantic dimensions involved in their representation.


Subjects
Brain/diagnostic imaging , Language , Magnetic Resonance Imaging/methods , Adult , Brain Mapping , Cognition , Concept Formation , Female , Humans , Italy , Knowledge , Male , Reading , Reproducibility of Results , Semantics , Temporal Lobe/physiology , Young Adult
6.
Bioinformatics ; 37(23): 4526-4533, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34240108

ABSTRACT

MOTIVATION: Automated protein function prediction is a complex multi-class, multi-label, structured classification problem in which protein functions are organized in a controlled vocabulary, according to the Gene Ontology (GO). 'Hierarchy-unaware' classifiers, also known as 'flat' methods, predict GO terms without exploiting the inherent structure of the ontology, potentially violating the True-Path-Rule (TPR) that governs the GO, while 'hierarchy-aware' approaches, even if they obey the TPR, do not always show clear improvements with respect to flat methods, or do not scale well when applied to the full GO. RESULTS: To overcome these limitations, we propose Hierarchical Ensemble Methods for Directed Acyclic Graphs (HEMDAG), a family of highly modular hierarchical ensembles of classifiers, able to build upon any flat method and to provide 'TPR-safe' predictions, by leveraging a combination of isotonic regression and TPR learning strategies. Extensive experiments on synthetic and real data across several organisms firstly show that HEMDAG can be used as a general tool to improve the predictions of flat classifiers, and secondly that HEMDAG is competitive versus state-of-the-art hierarchy-aware learning methods proposed in the last CAFA international challenges. AVAILABILITY AND IMPLEMENTATION: Fully tested R code freely available at https://anaconda.org/bioconda/r-hemdag. Tutorial and documentation at https://hemdag.readthedocs.io. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
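
To make the True-Path-Rule constraint concrete, the sketch below applies a simple top-down correction to flat scores on a toy DAG: each term's score is capped by the corrected scores of its parents, so no term ends up more likely than any ancestor. This is a minimal illustration under stated assumptions, not the HEMDAG implementation; it does not reproduce the isotonic regression or TPR ensemble strategies mentioned in the abstract, and the term names, scores, and the function `top_down_correction` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def top_down_correction(parents, flat_scores):
    """Cap every term's score by its parents' corrected scores.

    parents: {term: [parent terms]} describing a DAG (roots have no parents).
    flat_scores: {term: score} produced by any 'flat' classifier.
    """
    corrected = {}
    # static_order() yields parents before their children.
    for term in TopologicalSorter(parents).static_order():
        score = flat_scores[term]
        for parent in parents.get(term, ()):
            score = min(score, corrected[parent])
        corrected[term] = score
    return corrected


# Toy ontology (hypothetical): C has two parents, A and B.
parents = {"root": [], "A": ["root"], "B": ["root"], "C": ["A", "B"]}
flat = {"root": 0.9, "A": 0.4, "B": 0.7, "C": 0.8}

corrected = top_down_correction(parents, flat)
print(corrected)  # C is lowered from 0.8 to 0.4, the score of its parent A
```

After the correction, thresholding the scores at any value yields a set of predicted terms that is closed under the ancestor relation, which is the property the abstract calls 'TPR-safe'.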


Subjects
Algorithms , Computational Biology , Gene Ontology , Computational Biology/methods , Proteins/metabolism
7.
Hum Brain Mapp ; 42(5): 1268-1286, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33274823

ABSTRACT

Along-tract statistics analysis enables the extraction of quantitative diffusion metrics along specific white matter fiber tracts. Besides quantitative metrics derived from classical diffusion tensor imaging (DTI), such as fractional anisotropy and diffusivities, new parameters reflecting the relative contribution of different diffusion compartments in the tissue can be estimated through advanced diffusion MRI methods such as neurite orientation dispersion and density imaging (NODDI), leading to a more specific microstructural characterization. In this study, we extracted both DTI- and NODDI-derived quantitative microstructural diffusion metrics along the most eloquent fiber tracts in 15 healthy subjects and in 22 patients with brain tumors. We obtained a robust intraprotocol reference database of normative along-tract microstructural metrics, and their corresponding plots, from healthy fiber tracts. Each diffusion metric of each patient's fiber tract was then plotted and statistically compared to the normative profile of the corresponding metric from the healthy fiber tracts. NODDI-derived metrics appeared to account for the pathological microstructural changes of the peritumoral tissue more accurately than DTI-derived ones. This approach may be useful for future studies comparing healthy subjects to patients diagnosed with other pathological conditions.


Subjects
Brain Neoplasms/pathology , Diffusion Magnetic Resonance Imaging/standards , Neurites/pathology , White Matter/pathology , Adult , Aged , Brain Neoplasms/diagnostic imaging , Diffusion Magnetic Resonance Imaging/methods , Diffusion Tensor Imaging/methods , Diffusion Tensor Imaging/standards , Female , Humans , Male , Middle Aged , White Matter/diagnostic imaging , Young Adult
8.
Gigascience ; 9(5)2020 05 01.
Article in English | MEDLINE | ID: mdl-32444882

ABSTRACT

BACKGROUND: Several prediction problems in computational biology and genomic medicine are characterized by both big data and a high imbalance between examples to be learned, whereby positive examples can represent a tiny minority with respect to negative examples. For instance, deleterious or pathogenic variants are overwhelmed by the sea of neutral variants in the non-coding regions of the genome: thus, the prediction of deleterious variants is a challenging, highly imbalanced classification problem, and classical prediction tools fail to detect the rare pathogenic examples among the huge amount of neutral variants or undergo severe restrictions in managing big genomic data. RESULTS: To overcome these limitations we propose parSMURF, a method that adopts a hyper-ensemble approach and oversampling and undersampling techniques to deal with imbalanced data, and parallel computational techniques to both manage big genomic data and substantially speed up the computation. The synergy between Bayesian optimization techniques and the parallel nature of parSMURF enables efficient and user-friendly automatic tuning of the hyper-parameters of the algorithm, and allows specific learning problems in genomic medicine to be easily fit. Moreover, by using MPI parallel and machine learning ensemble techniques, parSMURF can manage big data by partitioning them across the nodes of a high-performance computing cluster. Results with synthetic data and with single-nucleotide variants associated with Mendelian diseases and with genome-wide association study hits in the non-coding regions of the human genome, involving millions of examples, show that parSMURF achieves state-of-the-art results and an 80-fold speed-up with respect to the sequential version.
CONCLUSIONS: parSMURF is a parallel machine learning tool that can be trained to learn different genomic problems, and its multiple levels of parallelization and high scalability allow us to efficiently fit problems characterized by big and imbalanced genomic data. The C++ OpenMP multi-core version tailored to a single workstation and the C++ MPI/OpenMP hybrid multi-core and multi-node parSMURF version tailored to a High Performance Computing cluster are both available at https://github.com/AnacletoLAB/parSMURF.
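
The partition-and-undersample idea behind a hyper-ensemble for imbalanced data can be sketched as follows. This is only an illustration under stated assumptions, not parSMURF's code: the function `balanced_partitions`, its parameters, and the toy data are hypothetical, and the oversampling step, the base learners, and the MPI/OpenMP parallelism described in the abstract are omitted.

```python
import random


def balanced_partitions(positives, negatives, n_parts, ratio, seed=0):
    """Split the majority class into n_parts disjoint chunks and pair each
    chunk, undersampled to at most ratio * len(positives) examples, with
    all the positives. Each pair can then train one base learner."""
    rng = random.Random(seed)
    shuffled = negatives[:]
    rng.shuffle(shuffled)
    chunks = [shuffled[i::n_parts] for i in range(n_parts)]  # disjoint chunks
    training_sets = []
    for chunk in chunks:
        k = min(len(chunk), ratio * len(positives))
        training_sets.append((positives, rng.sample(chunk, k)))
    return training_sets


# Toy data (hypothetical): 5 pathogenic vs. 1000 neutral variant IDs.
positives = list(range(5))
negatives = list(range(100, 1100))

sets = balanced_partitions(positives, negatives, n_parts=4, ratio=3)
for pos, neg in sets:
    print(len(pos), len(neg))  # each training set has 5 positives, 15 negatives
```

Because the chunks are disjoint, the resulting training sets are independent of each other and could be dispatched to separate cores or cluster nodes, with the base learners' predictions averaged afterwards.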


Subjects
Computational Biology/methods , Genetic Predisposition to Disease , Genetic Variation , Genome-Wide Association Study/methods , Software , Algorithms , Databases, Genetic , Genomics/methods , Humans , Machine Learning , Reproducibility of Results
9.
Sci Rep ; 10(1): 3612, 2020 02 27.
Article in English | MEDLINE | ID: mdl-32107391

ABSTRACT

Methods for phenotype and outcome prediction are largely based on inductive supervised models that use selected biomarkers to make predictions, without explicitly considering the functional relationships between individuals. We introduce a novel network-based approach named Patient-Net (P-Net) in which biomolecular profiles of patients are modeled in a graph-structured space that represents gene expression relationships between patients. Then a kernel-based semi-supervised transductive algorithm is applied to the graph to explore the overall topology of the graph and to predict the phenotype/clinical outcome of patients. Experimental tests involving several publicly available datasets of patients afflicted with pancreatic, breast, colon and colorectal cancer show that our proposed method is competitive with state-of-the-art supervised and semi-supervised predictive systems. Importantly, P-Net also provides interpretable models that can be easily visualized to gain clues about the relationships between patients, and to formulate hypotheses about their stratification.


Subjects
Breast Neoplasms/diagnosis , Colorectal Neoplasms/diagnosis , Gene Regulatory Networks , Neural Networks, Computer , Pancreatic Neoplasms/diagnosis , Algorithms , Artificial Intelligence , Breast Neoplasms/epidemiology , Colorectal Neoplasms/epidemiology , Computational Biology/methods , Datasets as Topic , Female , Humans , Individuality , Male , Pancreatic Neoplasms/epidemiology , Phenotype , Prognosis , Transcriptome , Treatment Outcome
10.
Cereb Cortex Commun ; 1(1): tgaa008, 2020.
Article in English | MEDLINE | ID: mdl-34296089

ABSTRACT

Recent evidence has shown that patterns of cortico-cortical functional synchronization are consistently traceable by the end of the third trimester of pregnancy. The involvement of subcortical structures in early functional and cognitive development has never been explicitly investigated, notwithstanding their pivotal role in different cognitive processes. We address this issue by exploring subcortico-cortical functional connectivity at rest in a group of normally developing fetuses between the 25th and 32nd weeks of gestation. Results show significant functional coupling between subcortical nuclei and cortical networks related to: (i) sensorimotor processing, (ii) decision making, and (iii) learning capabilities. This functional maturation framework unearths a Cognitive Development Blueprint, according to which grounding cognitive skills are planned to develop with higher ontogenetic priority. Specifically, our evidence suggests that a newborn already possesses the ability to: (i) perceive the world and interact with it, (ii) create salient representations for the selection of adaptive behaviors, and (iii) store, retrieve, and evaluate the outcomes of interactions, in order to gradually improve adaptation to the extrauterine environment.

11.
BMC Bioinformatics ; 20(1): 422, 2019 Aug 14.
Article in English | MEDLINE | ID: mdl-31412768

ABSTRACT

BACKGROUND: One of the main issues in the automated protein function prediction (AFP) problem is the integration of multiple networked data sources. The UNIPred algorithm was proposed to efficiently integrate protein networks in a function-specific fashion, taking into account the imbalance that characterizes protein annotations, and to subsequently predict novel hypotheses about unannotated proteins. UNIPred is publicly available as R code, which may limit its use by non-expert users. Moreover, its application requires efforts in the acquisition and preparation of the networks to be integrated. Finally, the UNIPred source code does not handle the visualization of the resulting consensus network, whereas suitable views of the network topology are necessary to explore and interpret existing protein relationships. RESULTS: We address the aforementioned issues by proposing UNIPred-Web, a user-friendly Web tool for the application of the UNIPred algorithm to a variety of biomolecular networks, already supplied by the system, and for the visualization and exploration of protein networks. We support different organisms and different types of networks (e.g., co-expression, shared domains and physical interaction networks). Users are supported in the different phases of the process, ranging from the selection of the networks and the protein function to be predicted, to the navigation of the integrated network. The system also supports the upload of user-defined protein networks. The vertex-centric and highly interactive approach of UNIPred-Web allows a focused exploration of specific proteins, and an interactive analysis of large sub-networks with only a few mouse clicks. CONCLUSIONS: UNIPred-Web offers practical and intuitive (visual) guidance to biologists interested in gaining insights into protein biomolecular functions.
UNIPred-Web provides facilities for the integration of networks, and supplies a framework for the imbalance-aware protein network integration of nine organisms, the prediction of thousands of GO protein functions, and an easy-to-use graphical interface for the visual analysis, navigation and interpretation of the integrated networks and of the functional predictions.


Subjects
Computational Biology/methods , Internet , Protein Interaction Maps , Proteins/metabolism , Software , Algorithms , User-Computer Interface
12.
BMC Bioinformatics ; 19(Suppl 10): 353, 2018 Oct 15.
Article in English | MEDLINE | ID: mdl-30367594

ABSTRACT

BACKGROUND: Several problems in network biology and medicine can be cast into a framework where entities are represented through partially labeled networks, and the aim is inferring the labels (usually binary) of the unlabeled part. Connections represent functional or genetic similarity between entities, while the labellings are often highly unbalanced, that is, one class is largely under-represented: for instance, in the automated protein function prediction (AFP) problem, for most Gene Ontology terms only a few proteins are annotated, and in the disease-gene prioritization problem only a few genes are actually known to be involved in the etiology of a given disease. Imbalance-aware approaches to accurately predict node labels in biological networks are therefore required. Furthermore, such methods must be scalable, since input data can be large-sized as, for instance, in the context of multi-species protein networks. RESULTS: We propose a novel semi-supervised parallel enhancement of COSNET, an imbalance-aware algorithm built on the Hopfield neural model and recently suggested to solve the AFP problem. By adopting an efficient representation of the graph and assuming a sparse network topology, we empirically show that it can be efficiently applied to networks with millions of nodes. The key strategy to speed up the computations is to partition nodes into independent sets so as to process each set in parallel by exploiting the power of GPU accelerators. This parallel technique ensures the convergence to asymptotically stable attractors, while preserving the asynchronous dynamics of the original model. Detailed experiments on real data and artificial big instances of the problem highlight scalability and efficiency of the proposed method. CONCLUSIONS: By parallelizing COSNET we achieved on average a speed-up of 180x in solving the AFP problem in the S. cerevisiae, Mus musculus and Homo sapiens organisms, while lowering memory requirements.
In addition, to show the potential applicability of the method to huge biomolecular networks, we predicted node labels in artificially generated sparse networks involving hundreds of thousands to millions of nodes.
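
The idea of partitioning nodes into independent sets can be sketched with a greedy graph coloring: nodes that receive the same color share no edge, so their updates do not depend on each other and could run in parallel. The abstract does not say which partitioning algorithm the authors used, so greedy coloring is an assumption made here for illustration, as are the function name and the toy graph; the GPU execution itself is not shown.

```python
def greedy_independent_sets(adjacency):
    """Group nodes into independent sets via greedy coloring.

    adjacency: {node: [neighbouring nodes]} for an undirected graph.
    Returns a list of node lists; no two nodes in a list are adjacent.
    """
    color = {}
    for node in sorted(adjacency):
        taken = {color[n] for n in adjacency[node] if n in color}
        c = 0
        while c in taken:  # smallest color unused by already-colored neighbours
            c += 1
        color[node] = c
    groups = {}
    for node, c in color.items():
        groups.setdefault(c, []).append(node)
    return [groups[c] for c in sorted(groups)]


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 2.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

groups = greedy_independent_sets(adjacency)
print(groups)  # [[0, 3], [1], [2]]
```

Processing the groups one after another, with all nodes inside a group updated simultaneously, keeps each update consistent with its neighbours' latest states, which is the property that lets a parallel schedule mimic asynchronous dynamics. On sparse networks the number of groups tends to stay small relative to the number of nodes.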


Subjects
Algorithms , Computer Graphics , Gene Regulatory Networks , Animals , Gene Ontology , Humans , Mice , Protein Interaction Maps/genetics , Proteins/genetics , Saccharomyces cerevisiae/genetics , Time Factors