Search | BVS Regional Portal (test)

Using Deep Learning to Extrapolate Protein Expression Measurements.

Barzine, Mitra Parissa; Freivalds, Karlis; Wright, James C; Opmanis, Martins; Rituma, Darta; Ghavidel, Fatemeh Zamanzad; Jarnuczak, Andrew F; Celms, Edgars; Cerans, Karlis; Jonassen, Inge; Lace, Lelde; Vizcaíno, Juan Antonio; Choudhary, Jyoti Sharma; Brazma, Alvis; Viksna, Juris.

Proteomics ; 20(21-22): e2000009, 2020 Nov.

Article in English | MEDLINE | ID: mdl-32937025

ABSTRACT

Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein-coding genes. Computational methods for imputing the missing values using RNA expression data usually allow imputation only for proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed that uses deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. The method is tested on four datasets, including human cell lines and human and mouse tissues. It predicts protein expression values with average R² scores between 0.46 and 0.54, which is significantly better than predictions based on correlations using the RNA expression data alone. Moreover, it is demonstrated that the derived models can be "transferred" across experiments and species. For instance, the model derived from human tissues gave an R² of 0.51 when applied to mouse tissue data. It is concluded that protein abundances generated in label-free MS experiments can be computationally predicted using functional annotation attributes, and that the predictions can be used to highlight aberrant protein abundance values.
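
The R² scores quoted in this abstract can be read as coefficients of determination between observed and predicted protein abundances. The following is a minimal sketch of that calculation, assuming the common definition 1 - SS_res / SS_tot; the abstract does not state which R² variant the authors used, and the function name and toy values are hypothetical.

```python
def r2_score(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: this is the standard definition; the paper may use a
    different variant (for example, the squared Pearson correlation).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, not data from the paper.
observed = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
mean_only = [2.0, 2.0, 2.0]
print(r2_score(observed, perfect))
print(r2_score(observed, mean_only))
```

Under this definition, a score of 1 means the predictions match the observations exactly, and a score of 0 means they do no better than predicting the mean observed abundance for every protein.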

Subjects

Deep Learning, Animals, Mass Spectrometry, Mice, Molecular Sequence Annotation, Proteins, Proteomics

Network motif-based analysis of regulatory patterns in paralogous gene pairs.

Melkus, Gatis; Rucevskis, Peteris; Celms, Edgars; Cerans, Karlis; Freivalds, Karlis; Kikusts, Paulis; Lace, Lelde; Opmanis, Martins; Rituma, Darta; Viksna, Juris.

J Bioinform Comput Biol ; 18(3): 2040008, 2020 Jun.

Article in English | MEDLINE | ID: mdl-32698721

ABSTRACT

Current high-throughput experimental techniques make it feasible to infer gene regulatory interactions at the whole-genome level with reasonably good accuracy. Such experimentally inferred regulatory networks have become available for a number of simpler model organisms, such as S. cerevisiae and others. The availability of such networks provides an opportunity to compare gene regulatory processes at the whole-genome level and, in particular, to assess the similarity of regulatory interactions for homologous gene pairs from either the same or different species. We present here a new technique for analyzing the regulatory interaction neighborhoods of paralogous gene pairs. Our central focus is the analysis of S. cerevisiae gene interaction graphs, which are of particular interest because the ancestral whole-genome duplication (WGD) makes it possible to distinguish paralogous transcription factors that are traceable to this duplication event from other paralogues. A similar analysis is also applied to E. coli and C. elegans networks. We compare paralogous gene pairs according to the presence and size of bi-fan arrays, classically associated in the literature with gene duplication, within other network motifs. We further extend this framework beyond transcription factor comparison to obtain topology-based similarity metrics, based on the overlap of interaction neighborhoods, that are applicable to most genes in a given organism. We observe that our network divergence metrics show considerably greater similarity between paralogues, especially those traceable to WGD. This is the case for both yeast and C. elegans, but not for the E. coli regulatory network. While there is no obvious cross-species link between metrics, different classes of paralogues show notable differences in interaction overlap, with traceable duplications tending toward higher overlap than genes with shared protein families. Our findings indicate that divergence in paralogous interaction networks reflects a shared genetic origin, and that our approach may be useful for investigating structural similarity in the interaction networks of paralogous genes.
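
The idea of comparing two paralogues by the overlap of their interaction neighborhoods can be illustrated with a short sketch. It assumes a regulatory network stored as a mapping from each regulator to the set of its targets, and it uses the Jaccard index as the overlap measure. Both choices are assumptions made for illustration: the abstract does not specify the authors' metrics, and the gene names below are hypothetical.

```python
def shared_targets(network, gene_a, gene_b):
    """Targets regulated by both genes.

    Two regulators with k shared targets form a bi-fan-like array of
    size k in this simplified view (an illustrative reading, not the
    paper's exact motif definition).
    """
    return network.get(gene_a, set()) & network.get(gene_b, set())


def neighborhood_overlap(network, gene_a, gene_b):
    """Jaccard index of the two genes' target sets (assumed metric)."""
    targets_a = network.get(gene_a, set())
    targets_b = network.get(gene_b, set())
    union = targets_a | targets_b
    if not union:
        return 0.0  # neither gene has targets, so there is nothing to compare
    return len(targets_a & targets_b) / len(union)


# Hypothetical toy network: regulator -> set of regulated genes.
network = {
    "TF1": {"g1", "g2", "g3"},
    "TF2": {"g2", "g3", "g4"},
    "TF3": {"g5"},
}
print(sorted(shared_targets(network, "TF1", "TF2")))
print(neighborhood_overlap(network, "TF1", "TF2"))
print(neighborhood_overlap(network, "TF1", "TF3"))
```

In this toy network, TF1 and TF2 share two of their four combined targets, while TF1 and TF3 share none, so the first pair scores higher on overlap.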

Subjects

Caenorhabditis elegans/genetics, Computational Biology/methods, Escherichia coli/genetics, Gene Regulatory Networks, Saccharomyces cerevisiae/genetics, Animals, Molecular Evolution, Gene Duplication, Genome, Transcription Factors/genetics

Topological structure analysis of chromatin interaction networks.

Viksna, Juris; Melkus, Gatis; Celms, Edgars; Cerans, Karlis; Freivalds, Karlis; Kikusts, Paulis; Lace, Lelde; Opmanis, Martins; Rituma, Darta; Rucevskis, Peteris.

BMC Bioinformatics ; 20(Suppl 23): 618, 2019 Dec 27.

Article in English | MEDLINE | ID: mdl-31881819

ABSTRACT

BACKGROUND: Current Hi-C technologies for chromosome conformation capture make it possible to study a broad spectrum of functional interactions between genome elements. Although significant progress has been made in the analysis of Hi-C data to identify biologically significant features, many questions remain open, in particular regarding the potential biological significance of various topological features that are characteristic of chromatin interaction networks. RESULTS: It has been previously observed that promoter capture Hi-C (PCHi-C) interaction networks tend to separate easily into well-defined connected components that can be related to certain biological functionality. However, such evidence was limited and based on manual analysis. Here we present a novel method for the analysis of chromatin interaction networks, aimed at identifying characteristic topological features of interaction graphs and confirming their potential significance in chromatin architecture. Our method automatically identifies all connected components with an assigned significance score above a given threshold. These components can afterwards be subjected to different methods for assessing their biological role and/or significance. The method was applied to the largest PCHi-C data set available to date, which contains interactions for 17 haematopoietic cell types. The results provide strong evidence of a well-pronounced component structure in chromatin interaction networks and some characterisation of this component structure. We also performed an indicative assessment of the potential biological significance of the identified network components, with the results confirming that the network components can be related to specific biological functionality.
CONCLUSIONS: The obtained results show that the topological structure of chromatin interaction networks can be well described in terms of isolated connected components of the network, and that the formation of these components can often be explained by biological features of functionally related gene modules. The presented method allows automatic identification of all such components and evaluation of their significance in a PCHi-C dataset for 17 haematopoietic cell types. The method can be adapted for the exploration of other chromatin interaction data sets that include information about a sufficiently large number of different cell types and, in principle, also for the analysis of other kinds of cell type-specific networks.
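
The general pattern described in this abstract, finding connected components of an interaction graph and keeping those whose score exceeds a threshold, can be sketched as follows. This is a generic breadth-first search over an undirected edge list, not the authors' implementation. The abstract does not define the significance score, so the sketch takes the scoring function as a parameter and uses component size purely as a stand-in. All names and the toy edges are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the connected components of an undirected graph as sets."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def filter_components(components, score, threshold):
    """Keep components whose score is strictly above the threshold.

    `score` is a caller-supplied placeholder; the paper's actual
    significance score is not described in the abstract.
    """
    return [c for c in components if score(c) > threshold]


# Hypothetical toy interactions between genome fragments.
edges = [("a", "b"), ("b", "c"), ("d", "e")]
components = connected_components(edges)
# Stand-in score: number of nodes in the component.
print(filter_components(components, score=len, threshold=2))
```

Passing the score as a function keeps the graph traversal independent of whatever significance measure is plugged in.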

Subjects

Chromatin/chemistry, Gene Regulatory Networks, Algorithms, Gene Expression Regulation, Hematopoiesis/genetics, Humans, Genetic Promoter Regions
