Results 1 - 5 of 5
1.
Sci Rep ; 12(1): 11655, 2022 07 08.
Article in English | MEDLINE | ID: mdl-35803984

ABSTRACT

The function of most genes is unknown. The best results in automated function prediction are obtained with machine learning-based methods that combine multiple data sources, typically sequence-derived features, protein structure and interaction data. Even though there is ample evidence showing that a gene's function is not independent of its location, the few available examples of gene function prediction based on gene location rely on sequence identity between genes of different organisms and are thus subject to the limitations of the relationship between sequence and function. Here we predict thousands of gene functions in five model eukaryotes (Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens) using machine learning models trained exclusively with features derived from the location of genes in the genomes to which they belong. Our aim was not to obtain the best-performing method for automated function prediction but to explore the extent to which a gene's location can predict its function in eukaryotes. We found that our models outperform BLAST when predicting terms from the Biological Process and Cellular Component ontologies, showing that, at least in some cases, gene location alone can be more useful than sequence to infer gene function.


Subject(s)
Drosophila melanogaster , Machine Learning , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Drosophila melanogaster/genetics , Genome , Mice , Phenotype , Saccharomyces cerevisiae/genetics
2.
J Theor Biol ; 534: 110942, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34717934

ABSTRACT

In this paper we introduce random proliferation models on graphs. We consider two types of particles: type-1/mutant/invader/red particles proliferate in a population of type-2/wild-type/resident/blue particles. Unlike the well-known Moran model on graphs, as introduced in Lieberman et al. (2005), type-1 particles can occupy, in a single iteration, several neighbouring sites previously occupied by type-2 particles. Two variants are considered, depending on the random distribution governing the proliferation mechanism: Bernoulli and binomial proliferation. By comparison with the fixation probability of type-1 particles in the Moran process, critical parameters are introduced. Properties of proliferation are studied and some particular cases are solved analytically. Finally, by updating the parameters that drive the processes through a density-dependent mechanism, it is possible to capture additional relevant features, such as fluctuating waves of type-1 particles over long periods of time. In fact, the models can be adapted to tackle more general, complex and realistic situations.


Subject(s)
Biological Evolution , Cell Proliferation , Probability
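
For reference, the Moran-process fixation probability that the abstract above uses as its baseline has a classical closed form in the well-mixed case (a complete graph): a single mutant with relative fitness r in a population of size N fixes with probability (1 - 1/r) / (1 - 1/r^N), or 1/N when r = 1. The sketch below only evaluates that textbook formula; the function name is an assumption made for illustration, and it does not implement the Bernoulli or binomial proliferation models of the paper, whose details the abstract does not give.

```python
def moran_fixation_probability(r: float, n: int) -> float:
    """Fixation probability of one mutant with relative fitness r in a
    well-mixed Moran population of size n (classical closed form).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if n < 1:
        raise ValueError("population size must be at least 1")
    if r <= 0:
        raise ValueError("relative fitness must be positive")
    if r == 1.0:
        # Neutral mutant: every individual is equally likely to take over.
        return 1.0 / n
    if r > 1.0:
        # r ** -n underflows harmlessly to 0.0 for large n.
        return (1.0 - 1.0 / r) / (1.0 - r ** (-n))
    # Same formula multiplied through by r ** n, to avoid overflow when r < 1.
    return (r ** (n - 1) - r ** n) / (1.0 - r ** n)


if __name__ == "__main__":
    for fitness in (0.5, 1.0, 2.0):
        print(fitness, moran_fixation_probability(fitness, 10))
```

A proliferation model could then be compared against this value by simulation, for example by searching for the parameter at which the simulated fixation frequency crosses it.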
3.
BMC Genomics ; 20(1): 1011, 2019 Dec 23.
Article in English | MEDLINE | ID: mdl-31870293

ABSTRACT

BACKGROUND: Assembly and function of neuronal synapses require the coordinated expression of a yet undetermined set of genes. Previously, we had trained an ensemble machine learning model to assign a probability of having synaptic function to every protein-coding gene in Drosophila melanogaster. This approach resulted in the publication of a catalogue of 893 genes which we postulated to be highly enriched in genes with a still undocumented synaptic function. Since then, the scientific community has experimentally identified 79 new synaptic genes. Here we use these new empirical data to evaluate our original prediction. We also implement a series of changes to the training scheme of our model and, using the new data, demonstrate that this improves its predictive power. Finally, we added the new synaptic genes to the training set and trained a new model, obtaining a new, enhanced catalogue of putative synaptic genes. RESULTS: The retrospective analysis demonstrates that our original catalogue was significantly enriched in new synaptic genes. When the changes to the training scheme were implemented using the original training set, we obtained even higher enrichment. Finally, applying the new training scheme with a training set including the 79 new synaptic genes resulted in an enhanced catalogue of putative synaptic genes. Here we present this new catalogue and announce that a regularly updated version will be available online at http://synapticgenes.bnd.edu.uy. CONCLUSIONS: We show that training an ensemble of machine learning classifiers solely with the whole-body temporal transcription profiles of known synaptic genes resulted in a catalogue with a significant enrichment in undiscovered synaptic genes. Using new empirical data provided by the scientific community, we validated our original approach, improved our model and obtained an arguably more precise prediction. This approach reduces the number of genes to be tested through hypothesis-driven experimentation and will facilitate our understanding of neuronal function. AVAILABILITY: http://synapticgenes.bnd.edu.uy.


Subject(s)
Drosophila melanogaster/genetics , Gene Expression Profiling , Machine Learning , Synapses/genetics , Transcription, Genetic , Animals , Gene Ontology
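
The abstract above reports that the original catalogue was "significantly enriched" in newly identified synaptic genes without naming the test used. One common way to quantify such enrichment is a one-sided hypergeometric test, sketched below under that assumption. The catalogue size (893) and the number of new synaptic genes (79) come from the abstract; the background size and the overlap count in the usage example are placeholders, not reported values, and the function name is hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, successes: int,
                       draws: int, observed: int) -> float:
    """One-sided hypergeometric tail P(X >= observed).

    population: number of candidate genes in the background
    successes:  number of those that are newly confirmed synaptic genes
    draws:      size of the predicted catalogue
    observed:   newly confirmed genes that fall inside the catalogue
    """
    if not 0 <= successes <= population:
        raise ValueError("successes must lie between 0 and population")
    if not 0 <= draws <= population:
        raise ValueError("draws must lie between 0 and population")
    lower = max(observed, 0)
    upper = min(successes, draws)
    # Exact integer arithmetic; comb(n, k) is 0 whenever k > n.
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(lower, upper + 1))
    return tail / comb(population, draws)


if __name__ == "__main__":
    # 893 and 79 are taken from the abstract; 13900 (background size) and
    # 20 (overlap) are hypothetical placeholders chosen only for illustration.
    print(enrichment_p_value(population=13900, successes=79,
                             draws=893, observed=20))
```

In a real analysis the background should contain only the genes that were eligible for prediction, that is, excluding genes already used as known synaptic genes during training.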
4.
BMC Genomics ; 16: 694, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26370122

ABSTRACT

BACKGROUND: Assembly and function of neuronal synapses require the coordinated expression of a yet undetermined set of genes. Although roughly a thousand genes are expected to be important for this function in Drosophila melanogaster, only a few hundred of them are known so far. RESULTS: In this work we trained three learning algorithms to predict a "synaptic function" for genes of Drosophila using data from a whole-body developmental transcriptome published by others. Using statistical and biological criteria to analyze and combine the predictions, we obtained a gene catalogue that is highly enriched in genes that are relevant to Drosophila synapse assembly and function but not yet recognized as such. CONCLUSIONS: The utility of our approach is that it reduces the number of genes to be tested through hypothesis-driven experimentation.


Subject(s)
Drosophila/embryology , Drosophila/genetics , Gene Expression Regulation, Developmental , Machine Learning , Synapses/genetics , Transcriptome , Algorithms , Animals , Computational Biology , Datasets as Topic , Gene Expression Profiling , Humans , Models, Biological , Organ Specificity/genetics , Rats , Synapses/metabolism
5.
J Theor Biol ; 380: 489-98, 2015 Sep 07.
Article in English | MEDLINE | ID: mdl-26116367

ABSTRACT

We present a novel model that describes the within-host evolutionary dynamics of parasites undergoing antigenic variation. The approach uses a multi-type branching process with two types of entities defined according to their relationship with the immune system: clans of resistant parasitic cells (i.e. groups of cells sharing the same antigen not yet recognized by the immune system) that may become sensitive, and individual sensitive cells that can acquire a new resistance, thus giving rise to a new clan. The simplicity of the model allows analytical treatment to determine the subcritical and supercritical regimes in the space of parameters. By incorporating a density-dependent mechanism, the model is able to capture additional relevant features observed in experimental data, such as the characteristic parasitemia waves. In summary, our approach provides a new general framework to address the dynamics of antigenic variation, which can be easily adapted to cope with broader and more complex situations.


Subject(s)
Antigenic Variation , Microscopy/methods , Stochastic Processes , Probability
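
The abstract above mentions determining subcritical and supercritical regimes of a two-type branching process. In the standard theory of discrete-time multi-type Galton-Watson processes with a positively regular mean matrix M, the regime is set by the largest eigenvalue of M: below 1 is subcritical, above 1 is supercritical. The sketch below applies that generic criterion to a 2x2 mean matrix. It is not the paper's parametrization, which the abstract does not give; the function names and the example matrices are assumptions for illustration only.

```python
from math import sqrt


def perron_root_2x2(m):
    """Largest eigenvalue of a non-negative 2x2 matrix [[a, b], [c, d]].

    m[i][j] is read as the mean number of type-j offspring produced by one
    type-i individual (hypothetical convention for this sketch).
    """
    (a, b), (c, d) = m
    if min(a, b, c, d) < 0:
        raise ValueError("mean offspring numbers must be non-negative")
    # For non-negative entries the discriminant (a - d)^2 + 4bc is >= 0.
    return ((a + d) + sqrt((a - d) ** 2 + 4.0 * b * c)) / 2.0


def regime(m, tol=1e-12):
    """Classify the process by comparing the largest eigenvalue with 1."""
    rho = perron_root_2x2(m)
    if rho > 1.0 + tol:
        return "supercritical"
    if rho < 1.0 - tol:
        return "subcritical"
    return "critical"


if __name__ == "__main__":
    # Illustrative mean matrices, not values from the paper.
    print(regime([[0.5, 0.2], [0.1, 0.4]]))
    print(regime([[1.2, 0.5], [0.3, 0.9]]))
```

Scanning such a classification over a grid of parameter values is one simple way to map the boundary between the two regimes numerically.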