Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 12 de 12
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
Artigo em Inglês | MEDLINE | ID: mdl-38145512

RESUMO

In this brief paper, we study the size and width of autoencoders consisting of Boolean threshold functions, where an autoencoder is a layered neural network whose structure can be viewed as consisting of an encoder, which compresses an input vector to a lower dimensional vector, and a decoder which transforms the low-dimensional vector back to the original input vector exactly (or approximately). We focus on the decoder part and show that [Formula: see text] and O(√{Dn}) nodes are required to transform n vectors in d -dimensional binary space to D -dimensional binary space. We also show that the width can be reduced if we allow small errors, where the error is defined as the average of the Hamming distance between each vector input to the encoder part and the resulting vector output by the decoder.

2.
IEEE Trans Neural Netw Learn Syst ; 34(2): 921-931, 2023 Feb.
Artigo em Inglês | MEDLINE | ID: mdl-34428155

RESUMO

An autoencoder is a layered neural network whose structure can be viewed as consisting of an encoder, which compresses an input vector to a lower dimensional vector, and a decoder, which transforms the low-dimensional vector back to the original input vector (or one that is very similar). In this article, we explore the compressive power of autoencoders that are Boolean threshold networks by studying the numbers of nodes and layers that are required to ensure that each vector in a given set of distinct input binary vectors is transformed back to its original. We show that for any set of n distinct vectors there exists a seven-layer autoencoder with the optimal compression ratio, (i.e., the size of the middle layer is logarithmic in n ), but that there is a set of n vectors for which there is no three-layer autoencoder with a middle layer of logarithmic size. In addition, we present a kind of tradeoff: if the compression ratio is allowed to be considerably larger than the optimal, then there is a five-layer autoencoder. We also study the numbers of nodes and layers required only for encoding, and the results suggest that the decoding part is the bottleneck of autoencoding. For example, there always is a three-layer Boolean threshold encoder that compresses n vectors into a dimension that is twice the logarithm of n .

3.
Neural Netw ; 126: 300-311, 2020 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-32278262

RESUMO

This paper presents two approaches to extracting rules from a trained neural network consisting of linear threshold functions. The first one leads to an algorithm that extracts rules in the form of Boolean functions. Compared with an existing one, this algorithm outputs much more concise rules if the threshold functions correspond to 1-decision lists, majority functions, or certain combinations of these. The second one extracts probabilistic rules representing relations between some of the input variables and the output using a dynamic programming algorithm. The algorithm runs in pseudo-polynomial time if each hidden layer has a constant number of neurons. We demonstrate the effectiveness of these two approaches by computational experiments.


Assuntos
Algoritmos , Redes Neurais de Computação , Probabilidade , Animais , Humanos , Neurônios/fisiologia
4.
IEEE Trans Neural Netw Learn Syst ; 30(8): 2383-2396, 2019 08.
Artigo em Inglês | MEDLINE | ID: mdl-30582556

RESUMO

We study the problem of identifying the structure of a probabilistic Boolean network (PBN), a probabilistic model of biological networks, from a given set of samples. This problem can be regarded as an identification of a set of Boolean functions from samples. Existing studies on the identification of the structure of a PBN only use information on the occurrences of samples. In this paper, we also make use of the frequencies of occurrences of subtuples, information that is obtainable from the samples. We show that under this model, it is possible to identify a PBN from among a class of PBNs, for much broader classes of PBNs. In particular, we prove that, under a reasonable assumption, the structure of a PBN can be identified from among the class of PBNs that have at most three functions assigned to each node, but that identification may be impossible if four or more functions are assigned to each node. We also analyze the sample complexity for exactly identifying the structure of a PBN, and present an efficient algorithm for the identification of a PBN consisting of threshold functions from samples.

5.
IEEE Trans Neural Netw Learn Syst ; 29(4): 869-881, 2018 04.
Artigo em Inglês | MEDLINE | ID: mdl-28129190

RESUMO

This paper studies the problem of exactly identifying the structure of a probabilistic Boolean network (PBN) from a given set of samples, where PBNs are probabilistic extensions of Boolean networks. Cheng et al. studied the problem while focusing on PBNs consisting of pairs of AND/OR functions. This paper considers PBNs consisting of Boolean threshold functions while focusing on those threshold functions that have unit coefficients. The treatment of Boolean threshold functions, and triplets and -tuplets of such functions, necessitates a deepening of the theoretical analyses. It is shown that wide classes of PBNs with such threshold functions can be exactly identified from samples under reasonable constraints, which include: 1) PBNs in which any number of threshold functions can be assigned provided that all have the same number of input variables and 2) PBNs consisting of pairs of threshold functions with different numbers of input variables. It is also shown that the problem of deciding the equivalence of two Boolean threshold functions is solvable in pseudopolynomial time but remains co-NP complete.

6.
J Comput Biol ; 20(12): 958-69, 2013 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-24073923

RESUMO

We study the problem of finding a 0-1 assignment to Boolean variables satisfying a given set of nested canalyzing functions, a class of Boolean functions that is known to be of interest in biology. For this problem, an extension of the satisfiability problem for a conjunctive normal form formula, an O(min(2(k), 2((k+m)/2))poly(m)) time algorithm has been known, where m and k are the number of nested canalyzing functions and variables, respectively. Here we present an improved O(min(2(k), 1.325(k+m), 2(m))poly(m)) time algorithm for this problem. We also study the problem of finding a singleton attractor of a Boolean network consisting of n nested canalyzing functions. Although an O(1.799(n)) time algorithm was proposed in a previous study, it was implicitly assumed that the network does not contain any positive self-loops. By utilizing the improved satisfiability algorithm for nested canalyzing functions, while allowing for the presence of positive self-loops, we show that the general case can be solved in O(1.871(n)) time.


Assuntos
Algoritmos , Redes Reguladoras de Genes , Modelos Estatísticos , Animais , Simulação por Computador , Interpretação Estatística de Dados , Regulação da Expressão Gênica , Humanos , Modelos Genéticos , Biologia de Sistemas
7.
Artigo em Inglês | MEDLINE | ID: mdl-22689081

RESUMO

In this paper, we study the problem of finding a periodic attractor of a Boolean network (BN), which arises in computational systems biology and is known to be NP-hard. Since a general case is quite hard to solve, we consider special but biologically important subclasses of BNs. For finding an attractor of period 2 of a BN consisting of n OR functions of positive literals, we present a polynomial time algorithm. For finding an attractor of period 2 of a BN consisting of n AND/OR functions of literals, we present an O(1:985(n)) time algorithm. For finding an attractor of a fixed period of a BN consisting of n nested canalyzing functions and having constant treewidth w, we present an O(n(2p(w+1))poly(n)) time algorithm.


Assuntos
Algoritmos , Biologia de Sistemas , Perfilação da Expressão Gênica
8.
J Comput Biol ; 18(10): 1275-90, 2011 Oct.
Artigo em Inglês | MEDLINE | ID: mdl-21554129

RESUMO

In this article, we study the problem of finding a singleton attractor for several biologically important subclasses of Boolean networks (BNs). The problem of finding a singleton attractor in a BN is known to be NP-hard in general. For BNs consisting of n nested canalyzing functions, we present an O(1.799(n)) time algorithm. The core part of this development is an O(min(2(k/2) · 2(m/2), 2(k)) · poly(k, m)) time algorithm for the satisfiability problem for m nested canalyzing functions over k variables. For BNs consisting of chain functions, a subclass of nested canalyzing functions, we present an O(1.619(n)) time algorithm and show that the problem remains NP-hard, even though the satisfiability problem for m chain functions over k variables is solvable in polynomial time. Finally, we present an o(2(n)) time algorithm for bounded degree BNs consisting of canalyzing functions.


Assuntos
Algoritmos , Modelos Genéticos , Biologia de Sistemas , Biologia Computacional , Simulação por Computador , Regulação da Expressão Gênica , Cinética , Dinâmica não Linear
9.
In Silico Biol ; 11(1-2): 11-7, 2011.
Artigo em Inglês | MEDLINE | ID: mdl-22475748

RESUMO

The low reproducibility of differential expression of individual genes in microarray experiments has led to the suggestion that experiments be analyzed in terms of gene characteristics, such as GO categories or pathways, in order to enhance the robustness of the results. An implicit assumption of this approach is that the different experiments in effect randomly sample the genes participating in an active process. We argue that by the same rationale it is possible to perform this higher-level analysis on the aggregation of genes that are differentially-expressed in different expression-based studies, even if the experiments used different platforms. The aggregation increases the reliability of the results, it has the potential for uncovering signals that are liable to escape detection in the individual experiments, and it enables a more thorough mining of the ever more plentiful microarray data. We present here a proof-of-concept study of these ideas, using ten studies describing the changes in expression profiles of human host genes in response to infection by Retroviridae or Herpesviridae viral families. We supply a tool (accessible at www.cs.bgu.ac.il/∼waytogo) which enables the user to learn about genes and processes of interest in this study.


Assuntos
Perfilação da Expressão Gênica , Herpesviridae/genética , Retroviridae/genética , Perfilação da Expressão Gênica/métodos , Herpesviridae/fisiologia , Infecções por Herpesviridae/genética , Humanos , Análise de Sequência com Séries de Oligonucleotídeos/métodos , Reprodutibilidade dos Testes , Retroviridae/fisiologia , Infecções por Retroviridae/genética
10.
Bioinformatics ; 25(14): 1789-95, 2009 Jul 15.
Artigo em Inglês | MEDLINE | ID: mdl-19497934

RESUMO

MOTIVATION: There is a growing interest in improving the cluster analysis of expression data by incorporating into it prior knowledge, such as the Gene Ontology (GO) annotations of genes, in order to improve the biological relevance of the clusters that are subjected to subsequent scrutiny. The structure of the GO is another source of background knowledge that can be exploited through the use of semantic similarity. RESULTS: We propose here a novel algorithm that integrates semantic similarities (derived from the ontology structure) into the procedure of deriving clusters from the dendrogram constructed during expression-based hierarchical clustering. Our approach can handle the multiple annotations, from different levels of the GO hierarchy, which most genes have. Moreover, it treats annotated and unannotated genes in a uniform manner. Consequently, the clusters obtained by our algorithm are characterized by significantly enriched annotations. In both cross-validation tests and when using an external index such as protein-protein interactions, our algorithm performs better than previous approaches. When applied to human cancer expression data, our algorithm identifies, among others, clusters of genes related to immune response and glucose metabolism. These clusters are also supported by protein-protein interaction data.


Assuntos
Biologia Computacional/métodos , Perfilação da Expressão Gênica/métodos , Algoritmos , Análise por Conglomerados , Bases de Dados de Proteínas , Humanos , Proteínas/química , Proteínas/metabolismo
11.
PLoS One ; 4(4): e5313, 2009.
Artigo em Inglês | MEDLINE | ID: mdl-19390589

RESUMO

BACKGROUND: The traditional approach to studying complex biological networks is based on the identification of interactions between internal components of signaling or metabolic pathways. By comparison, little is known about interactions between higher order biological systems, such as biological pathways and processes. We propose a methodology for gleaning patterns of interactions between biological processes by analyzing protein-protein interactions, transcriptional co-expression and genetic interactions. At the heart of the methodology are the concept of Linked Processes and the resultant network of biological processes, the Process Linkage Network (PLN). RESULTS: We construct, catalogue, and analyze different types of PLNs derived from different data sources and different species. When applied to the Gene Ontology, many of the resulting links connect processes that are distant from each other in the hierarchy, even though the connection makes eminent sense biologically. Some others, however, carry an element of surprise and may reflect mechanisms that are unique to the organism under investigation. In this aspect our method complements the link structure between processes inherent in the Gene Ontology, which by its very nature is species-independent. As a practical application of the linkage of processes we demonstrate that it can be effectively used in protein function prediction, having the power to increase both the coverage and the accuracy of predictions, when carefully integrated into prediction methods. CONCLUSIONS: Our approach constitutes a promising new direction towards understanding the higher levels of organization of the cell as a system which should help current efforts to re-engineer ontologies and improve our ability to predict which proteins are involved in specific biological processes.


Assuntos
Redes Reguladoras de Genes , Mapeamento de Interação de Proteínas , Algoritmos , Fenômenos Biológicos , Biologia Computacional , Bases de Dados de Proteínas , Proteínas/química , Saccharomyces cerevisiae/metabolismo
12.
Bioinformatics ; 23(24): 3335-42, 2007 Dec 15.
Artigo em Inglês | MEDLINE | ID: mdl-17989094

RESUMO

MOTIVATION: Hierarchical clustering is widely used to cluster genes into groups based on their expression similarity. This method first constructs a tree. Next this tree is partitioned into subtrees by cutting all edges at some level, thereby inducing a clustering. Unfortunately, the resulting clusters often do not exhibit significant functional coherence. RESULTS: To improve the biological significance of the clustering, we develop a new framework of partitioning by snipping--cutting selected edges at variable levels. The snipped edges are selected to induce clusters that are maximally consistent with partially available background knowledge such as functional classifications. Algorithms for two key applications are presented: functional prediction of genes, and discovery of functionally enriched clusters of co-expressed genes. Simulation results and cross-validation tests indicate that the algorithms perform well even when the actual number of clusters differs considerably from the requested number. Performance is improved compared with a previously proposed algorithm. AVAILABILITY: A java package is available at http://www.cs.bgu.ac.il/~dotna/ TreeSnipping


Assuntos
Algoritmos , Inteligência Artificial , Perfilação da Expressão Gênica/métodos , Família Multigênica/fisiologia , Análise de Sequência com Séries de Oligonucleotídeos/métodos , Reconhecimento Automatizado de Padrão/métodos , Proteoma/metabolismo
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...