1.
Biosystems ; 241: 105231, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38754621

ABSTRACT

OBJECTIVE: Dynamic cerebral autoregulation (dCA) has been addressed through different approaches for discriminating between normal and impaired conditions based on spontaneous fluctuations in arterial blood pressure (ABP) and cerebral blood flow (CF). This work presents a novel multi-objective optimisation (MO) approach for finding good configurations of a cerebrovascular resistance-compliance model. METHODS: Data from twenty-nine subjects under normocapnic and hypercapnic (5% CO2 in air) conditions were used. Cerebrovascular resistance and vessel compliance models with ABP as input and CF velocity as output were fitted using a MO approach, considering both the Pearson's correlation and the error of the fit. RESULTS: The MO approach finds better model configurations than the single-objective (SO) approach, especially for hypercapnic conditions. In addition, the Pareto-optimal front from the multi-objective approach provides new information on dCA, reflecting a higher contribution of the myogenic mechanism in explaining dCA impairment.


Subject(s)
Cerebrovascular Circulation , Homeostasis , Humans , Cerebrovascular Circulation/physiology , Homeostasis/physiology , Linear Models , Male , Adult , Blood Pressure/physiology , Brain/physiology , Models, Cardiovascular , Hypercapnia/physiopathology , Female , Vascular Resistance/physiology
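The Pareto-optimal front mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each candidate model configuration is scored by two objectives, a correlation to maximise and an error to minimise; the candidate values and the function names are hypothetical and are not taken from the article.

```python
from typing import List, Tuple

# Each candidate is (correlation, error): hypothetical scores, not data from the study.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b in both objectives and strictly better in one.

    Correlation is maximised and error is minimised.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates = [(0.90, 0.30), (0.85, 0.20), (0.80, 0.25), (0.95, 0.40), (0.70, 0.50)]
front = pareto_front(candidates)
print(front)
```

A single-objective approach would return one configuration, whereas the front keeps every trade-off between correlation and error that is not beaten on both criteria at once.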
2.
Biosystems ; 213: 104606, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35033628

ABSTRACT

The analysis of evolutionary data allows uncovering information about organisms and how they have adapted and evolved. This information could provide us with new insights about the specialisation of organisms (or parts of them), how they adapt, and how similar they are to other species, among others. Unfortunately, this evolutionary history can only be estimated, and for that, several computational methods exist. Among these, optimisation methods are one of the main approaches to deal with this problem, with multiobjective optimisation producing promising results. In this paper, we deal with multiobjective phylogenetic inference, using a multi-modal metaheuristic approach that exploits the decision space in the multiobjective formulation of the problem. In particular, we incorporate a new metric based on a topological tree distance. We compare the method with state-of-the-art algorithms in terms of performance. Additionally, we perform a thorough analysis of a case study on a yeast Saccharomyces cerevisiae dataset. Results show that our proposal is able to improve the diversity of solutions while improving or keeping the quality of solutions in terms of hypervolume.


Subject(s)
Algorithms , Biological Evolution , Computer Simulation , Phylogeny
3.
J Bioinform Comput Biol ; 18(6): 2050040, 2020 12.
Article in English | MEDLINE | ID: mdl-33155874

ABSTRACT

Phylogenetic inference proposes an evolutionary hypothesis for a group of taxa, which is usually represented as a phylogenetic tree. The use of several distinct sources of biological evidence has been shown to produce more resolved phylogenies than single-evidence approaches. Currently, two conflicting paradigms are applied to combine biological evidence: taxonomic congruence (TC) and total evidence (TE). Although the literature recommends the application of these paradigms depending on the congruence of the input data, the resultant evolutionary hypotheses could vary according to the strategy used to combine the biological evidence, biasing the resultant topologies of the trees. In this work, we evaluate the ability of different strategies associated with both paradigms to produce integrated evolutionary hypotheses by considering different features of the data: missing biological evidence, diversity among sequences, complexity, and congruence. Using datasets from the literature, we compare the resultant trees with reference hypotheses obtained by applying two inference criteria: maximum parsimony and likelihood. The results show that methods associated with the TE paradigm are more robust than TC methods, obtaining trees with topologies more similar to those of the reference trees. These results are obtained regardless of (1) the features of the data, (2) the estimated evolutionary rates, and (3) the criteria used to infer the reference evolutionary hypotheses.


Subject(s)
Biological Evolution , Phylogeny , Animals , Bayes Theorem , Classification/methods , Computational Biology , Consensus Sequence , Databases, Genetic/statistics & numerical data , Humans , Least-Squares Analysis , Likelihood Functions , Models, Genetic , Primates/classification , Primates/genetics , Software
4.
J Bioinform Comput Biol ; 18(6): 2050038, 2020 12.
Article in English | MEDLINE | ID: mdl-33148094

ABSTRACT

Using a priori biological knowledge of relationships and genetic functions for gene similarity, from repositories such as the Gene Ontology (GO), has shown good results in multi-objective gene clustering algorithms. In this scenario, and to obtain useful clustering results, it would be helpful to know which measure of biological similarity between genes should be employed to yield meaningful clusters that have both similar expression patterns (co-expression) and biological homogeneity. In this paper, we studied the influence of the four most used GO-based semantic similarity measures on the performance of a multi-objective gene clustering algorithm. We used four publicly available datasets and carried out comparative studies based on performance metrics from the multi-objective optimization field and clustering performance indexes. In most cases, the Jiang-Conrath and Wang similarities stand out in terms of multi-objective metrics. In clustering properties, the Resnik similarity achieves the best values of compactness and separation, and therefore of co-expression of groups of genes. Meanwhile, in biological homogeneity, the Wang similarity reports a greater number of significant GO terms. However, statistical, visual, and biological significance tests showed that none of the GO-based semantic similarity measures stands out above the rest in significantly improving the performance of the multi-objective gene clustering algorithm.


Subject(s)
Algorithms , Multigene Family , Cluster Analysis , Computational Biology , Databases, Genetic/statistics & numerical data , Gene Ontology/statistics & numerical data , Semantics , Transcriptome
5.
Microorganisms ; 8(1)2019 Dec 23.
Article in English | MEDLINE | ID: mdl-31877949

ABSTRACT

Massive sequencing projects executed in Saccharomyces cerevisiae have revealed its population structure in detail. The recent "1002 yeast genomes project" has become the most complete catalogue of yeast genetic diversity and a powerful resource to analyse the evolutionary history of genes affecting specific phenotypes. In this work, we selected 22 nitrogen-associated genes and analysed the sequence information from the 1011 strains of the "1002 yeast genomes project". We constructed a total evidence (TE) phylogenetic tree using concatenated information, which showed a 27% topology similarity with the reference (REF) tree of the "1002 yeast genomes project". We also generated individual phylogenetic trees for each gene and compared their topologies, identifying genes with similar topologies (suggesting a shared evolutionary history). Furthermore, we pruned the constructed phylogenetic trees to compare the REF tree topology with the TE tree and the individual gene trees, considering each phylogenetic cluster/subcluster within the population, and observed genes with cluster/subcluster topologies of high similarity to the REF tree. Finally, we used the pruned versions of the phylogenetic trees to compare four strains considered as representatives of S. cerevisiae clean lineages, observing for 15 genes that their cluster topologies match the REF tree 100%, supporting the view that these strains represent the main lineages of the yeast population. Altogether, our results showed the potential of tree topology comparison for exploring the evolutionary history of a specific group of genes.

6.
BioData Min ; 11: 16, 2018.
Article in English | MEDLINE | ID: mdl-30100924

ABSTRACT

BACKGROUND: Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. In order to analyse this data, clustering arises as one of the main techniques used, and it aims at finding groups of genes that have some criterion in common, such as a similar expression profile. However, the problem of finding groups is normally multi-dimensional, making it necessary to approach the clustering as a multi-objective problem where various cluster validity indexes are simultaneously optimised. They are usually based on criteria like compactness and separation, which may not be sufficient since they cannot guarantee the generation of clusters that have both similar expression patterns and biological coherence. METHOD: We propose a Multi-Objective Clustering algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) to find clusters of genes with high levels of co-expression, biological coherence, and also good compactness and separation. Cluster quality indexes are used to simultaneously optimise gene relationships at the expression level and biological functionality. Our proposal also includes intensification and diversification strategies to improve the search process. RESULTS: The effectiveness of the proposed algorithm is demonstrated on four publicly available datasets. Comparative studies of the use of different objective functions and other widely used microarray clustering techniques are reported. Statistical, visual and biological significance tests are carried out to show the superiority of the proposed algorithm. CONCLUSIONS: Integrating a-priori biological knowledge into a multi-objective approach and using intensification and diversification strategies allow the proposed algorithm to find solutions with higher quality than other microarray clustering techniques available in the literature in terms of co-expression, biological coherence, compactness and separation.

7.
Article in English | MEDLINE | ID: mdl-28029627

ABSTRACT

Memetic Algorithms are population-based metaheuristics intrinsically concerned with exploiting all available knowledge about the problem under study. The incorporation of problem domain knowledge is not an optional mechanism, but a fundamental feature of the Memetic Algorithms. In this paper, we present a Memetic Algorithm to tackle the three-dimensional protein structure prediction problem. The method uses a structured population and incorporates a Simulated Annealing algorithm as a local search strategy, as well as ad-hoc crossover and mutation operators to deal with the problem. It takes advantage of structural knowledge stored in the Protein Data Bank, by using an Angle Probability List that helps to reduce the search space and to guide the search strategy. The proposed algorithm was tested on nineteen protein sequences of amino acid residues, and the results show the ability of the algorithm to find native-like protein structures. Experimental results have revealed that the proposed algorithm can find good solutions regarding root-mean-square deviation and global distance total score test in comparison with the experimental protein structures. We also show that our results are comparable in terms of folding organization with state-of-the-art prediction methods, corroborating the effectiveness of our proposal.

8.
Article in English | MEDLINE | ID: mdl-27925594

ABSTRACT

Memetic Algorithms are population-based metaheuristics intrinsically concerned with exploiting all available knowledge about the problem under study. The incorporation of problem domain knowledge is not an optional mechanism, but a fundamental feature of the Memetic Algorithms. In this paper, we present a Memetic Algorithm to tackle the three-dimensional protein structure prediction problem. The method uses a structured population and incorporates a Simulated Annealing algorithm as a local search strategy, as well as ad-hoc crossover and mutation operators to deal with the problem. It takes advantage of structural knowledge stored in the Protein Data Bank, by using an Angle Probability List that helps to reduce the search space and to guide the search strategy. The proposed algorithm was tested on nineteen protein sequences of amino acid residues, and the results show the ability of the algorithm to find native-like protein structures. Experimental results have revealed that the proposed algorithm can find good solutions regarding root-mean-square deviation and global distance total score test in comparison with the experimental protein structures. We also show that our results are comparable in terms of folding organization with state-of-the-art prediction methods, corroborating the effectiveness of our proposal.

9.
Methods Mol Biol ; 1526: 271-297, 2017.
Article in English | MEDLINE | ID: mdl-27896748

ABSTRACT

In this chapter, we illustrate the use of an integrated mathematical method for joint clustering and visualization of large-scale datasets. In applying these clustering methodologies to biological datasets, we aim to identify differentially expressed genes according to cell type by building molecular signatures supported by statistical scores. In doing so, we also aim to find a global map of highly co-expressed clusters. Variations in these clusters may well indicate other pathological trends and changes.


Subject(s)
Computational Biology/methods , Transcriptome/genetics , Algorithms , Biomarkers , Models, Theoretical
10.
J Comput Biol ; 24(3): 255-265, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27494258

ABSTRACT

The exponential growth in the number of experimentally determined three-dimensional protein structures provides new and relevant knowledge about the conformation of amino acids in proteins. Only a few probability densities of amino acids are publicly available for use in structure validation and prediction methods. NIAS (Neighbors Influence of Amino acids and Secondary structures) is a web-based tool used to extract information about conformational preferences of amino acid residues and secondary structures in experimentally determined protein templates. This information is useful, for example, to characterize folds and local motifs in proteins and molecular folding, and can help in solving complex problems such as protein structure prediction and protein design, among others. The NIAS-Server and supplementary data are available at http://sbcb.inf.ufrgs.br/nias .


Subject(s)
Algorithms , Amino Acids/chemistry , Computational Biology/methods , Proteins/chemistry , Software , Amino Acid Motifs , Databases, Protein , Internet , Models, Molecular , Protein Folding , Protein Structure, Secondary
11.
Comput Biol Chem ; 59 Pt A: 142-57, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26495908

ABSTRACT

Tertiary protein structure prediction is one of the most challenging problems in structural bioinformatics. Despite the advances in algorithm development and computational strategies, predicting the folded structure of a protein only from its amino acid sequence remains an unsolved problem. We present a new computational approach to predict the native-like three-dimensional structure of proteins. Conformational preferences of amino acid residues and secondary structure information were obtained from protein templates stored in the Protein Data Bank and represented as an Angle Probability List. Two knowledge-based prediction methods based on Genetic Algorithms and Particle Swarm Optimization were developed using this information. The proposed method has been tested with twenty-six case studies selected to validate our approach with different classes of proteins and folding patterns. Stereochemical and structural analyses were performed for each predicted three-dimensional structure. The results achieved suggest that the Angle Probability List can improve the effectiveness of metaheuristics used to predict the three-dimensional structure of protein molecules by reducing the conformational search space.


Subject(s)
Algorithms , Computational Biology , Knowledge Bases , Proteins/chemistry , Protein Conformation , Protein Structure, Tertiary
12.
PLoS One ; 9(8): e105870, 2014.
Article in English | MEDLINE | ID: mdl-25171185

ABSTRACT

In microbiology, identification of all isolates by sequencing is still unfeasible in small research laboratories. Therefore, many yeast diversity studies follow a screening procedure consisting of clustering the yeast isolates using MSP-PCR fingerprinting, followed by identification of one or a few selected representatives of each cluster by sequencing. Although this procedure has been widely applied in the literature, it has not been properly validated. We evaluated a standardized protocol using MSP-PCR fingerprinting with the primers (GTG)5 and M13 for the discrimination of wine-associated yeasts in South Brazil. Two datasets were used: yeasts isolated from bottled wines and from vineyard environments. We compared the discriminatory power of both primers in a subset of 16 strains, choosing the primer (GTG)5 for further evaluation. Afterwards, we applied this technique to 245 strains, and compared the results with the identification obtained by partial sequencing of the LSU rRNA gene, considered as the gold standard. An array matrix was constructed for each dataset and used as input for clustering with two methods (hierarchical dendrograms and QAPGrid layout). For both yeast datasets, unrelated species were clustered in the same group. The sensitivity score of (GTG)5 MSP-PCR fingerprinting was high, but specificity was low. In conclusion, the yeast diversity inferred in several previous studies may have been underestimated and some isolates were probably misidentified due to adherence to this screening procedure.


Subject(s)
DNA Fingerprinting/methods , DNA Primers/genetics , Polymerase Chain Reaction/methods , Wine/microbiology , Yeasts/genetics , Cluster Analysis , DNA, Fungal/genetics , Genetic Variation , Phylogeny , RNA, Ribosomal/genetics , Sequence Analysis, DNA , Species Specificity , Yeasts/classification , Yeasts/isolation & purification
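The combination of high sensitivity and low specificity reported in the abstract above can be made concrete with a small sketch. It assumes a pairwise definition, which the abstract does not state: a pair of isolates of the same species placed in the same cluster is a true positive, and a pair of different species placed in the same cluster is a false positive. The isolate labels below are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Tuple


def pairwise_sensitivity_specificity(
    species: Sequence[str], clusters: Sequence[int]
) -> Tuple[float, float]:
    """Score a clustering against gold-standard species labels, pair by pair.

    Assumed definition (not from the abstract): a "positive" pair is two
    isolates of the same species; the clustering "predicts positive" when it
    puts both isolates in the same cluster.
    """
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(species)), 2):
        same_species = species[i] == species[j]
        same_cluster = clusters[i] == clusters[j]
        if same_species and same_cluster:
            tp += 1
        elif same_species:
            fn += 1
        elif same_cluster:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical isolates: species A and B are lumped into one fingerprint cluster.
species = ["A", "A", "B", "B", "C"]
clusters = [1, 1, 1, 1, 2]
sensitivity, specificity = pairwise_sensitivity_specificity(species, clusters)
print(sensitivity, specificity)
```

In this toy case every same-species pair is kept together (sensitivity 1.0), but half of the different-species pairs also share a cluster (specificity 0.5), which is the pattern that leads to underestimated diversity when only one representative per cluster is sequenced.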
13.
Genome Res ; 22(5): 885-98, 2012 May.
Article in English | MEDLINE | ID: mdl-22406755

ABSTRACT

Transcriptomic analyses have identified tens of thousands of intergenic, intronic, and cis-antisense long noncoding RNAs (lncRNAs) that are expressed from mammalian genomes. Despite progress in functional characterization, little is known about the post-transcriptional regulation of lncRNAs and their half-lives. Although many are easily detectable by a variety of techniques, it has been assumed that lncRNAs are generally unstable, but this has not been examined genome-wide. Utilizing a custom noncoding RNA array, we determined the half-lives of ∼800 lncRNAs and ∼12,000 mRNAs in the mouse Neuro-2a cell line. We find only a minority of lncRNAs are unstable. LncRNA half-lives vary over a wide range, comparable to, although on average less than, that of mRNAs, suggestive of complex metabolism and widespread functionality. Combining half-lives with comprehensive lncRNA annotations identified hundreds of unstable (half-life < 2 h) intergenic, cis-antisense, and intronic lncRNAs, as well as lncRNAs showing extreme stability (half-life > 16 h). Analysis of lncRNA features revealed that intergenic and cis-antisense RNAs are more stable than those derived from introns, as are spliced lncRNAs compared to unspliced (single exon) transcripts. Subcellular localization of lncRNAs indicated widespread trafficking to different cellular locations, with nuclear-localized lncRNAs more likely to be unstable. Surprisingly, one of the least stable lncRNAs is the well-characterized paraspeckle RNA Neat1, suggesting Neat1 instability contributes to the dynamic nature of this subnuclear domain. We have created an online interactive resource (http://stability.matticklab.com) that allows easy navigation of lncRNA and mRNA stability profiles and provides a comprehensive annotation of ~7200 mouse lncRNAs.


Subject(s)
Genome , Mice/genetics , RNA Stability , RNA, Untranslated/metabolism , Analysis of Variance , Animals , Cell Line, Tumor , Cluster Analysis , Gene Expression , Half-Life , Humans , Molecular Sequence Annotation , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Untranslated/genetics
14.
PLoS One ; 6(1): e14468, 2011 Jan 18.
Article in English | MEDLINE | ID: mdl-21267077

ABSTRACT

BACKGROUND: The visualization of large volumes of data is a computationally challenging task that often promises rewarding new insights. There is great potential in the application of new algorithms and models from combinatorial optimisation. Datasets often contain "hidden regularities" and a combined identification and visualization method should reveal these structures and present them in a way that helps analysis. While several methodologies exist, including those that use non-linear optimization algorithms, severe limitations exist even when working with only a few hundred objects. METHODOLOGY/PRINCIPAL FINDINGS: We present a new data visualization approach (QAPgrid) that reveals patterns of similarities and differences in large datasets of objects for which a similarity measure can be computed. Objects are assigned to positions on an underlying square grid in a two-dimensional space. We use the Quadratic Assignment Problem (QAP) as a mathematical model to provide an objective function for assignment of objects to positions on the grid. We employ a Memetic Algorithm (a powerful metaheuristic) to tackle the large instances of this NP-hard combinatorial optimization problem, and we show its performance on the visualization of real data sets. CONCLUSIONS/SIGNIFICANCE: Overall, the results show that the QAPgrid algorithm is able to produce a layout that represents the relationships between objects in the data set. Furthermore, it also represents the relationships between clusters that are fed into the algorithm. We apply QAPgrid to the 84 Indo-European languages instance, producing a near-optimal layout. Next, we produce a layout of 470 world universities with an observed high degree of correlation with the score used by the Academic Ranking of World Universities compiled by Shanghai Jiao Tong University, without the need for an ad hoc weighting of attributes. Finally, our Gene Ontology-based study on Saccharomyces cerevisiae fully demonstrates the scalability and precision of our method as a novel alternative tool for functional genomics.


Subject(s)
Algorithms , Computer Graphics , Databases, Factual , Models, Theoretical , Cluster Analysis , Genomics/methods , Methods , Saccharomyces cerevisiae/genetics
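The assignment of objects to grid positions described in the abstract above can be sketched as a standard QAP cost function. The sketch assumes that pairwise similarity plays the role of the QAP flow and that Manhattan distance on the grid plays the role of the QAP distance; both choices, the toy similarity matrix, and the brute-force search (standing in for the Memetic Algorithm, which is not reproduced here) are illustrative assumptions.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def grid_positions(side: int) -> List[Tuple[int, int]]:
    """Enumerate the (row, column) cells of a side x side square grid."""
    return [(r, c) for r in range(side) for c in range(side)]


def qap_cost(
    similarity: Sequence[Sequence[float]],
    assignment: Sequence[int],
    positions: Sequence[Tuple[int, int]],
) -> float:
    """Sum of similarity[i][j] * grid distance between the cells of i and j.

    assignment[i] is the index of the grid cell given to object i. Similar
    objects placed far apart are penalised, so low cost means that similar
    objects end up close together.
    """
    total = 0.0
    n = len(similarity)
    for i in range(n):
        for j in range(n):
            ri, ci = positions[assignment[i]]
            rj, cj = positions[assignment[j]]
            total += similarity[i][j] * (abs(ri - rj) + abs(ci - cj))
    return total


# Toy instance: objects 0 and 1 are similar, as are objects 2 and 3.
similarity = [
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
]
positions = grid_positions(2)

# Exhaustive search is only feasible for tiny instances; the problem is NP-hard.
best = min(permutations(range(4)), key=lambda a: qap_cost(similarity, a, positions))
print(best, qap_cost(similarity, best, positions))
```

Exhaustive enumeration grows factorially with the number of objects, which is why a metaheuristic is needed for instances with hundreds of objects.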
15.
PLoS One ; 5(12): e14176, 2010 Dec 01.
Article in English | MEDLINE | ID: mdl-21152067

ABSTRACT

BACKGROUND: Several lines of evidence suggest that transcription factors are involved in the pathogenesis of Multiple Sclerosis (MS) but complete mapping of the whole network has been elusive. One of the reasons is that there are several clinical subtypes of MS and transcription factors that may be involved in one subtype may not be in others. We investigate the possibility that this network could be mapped using microarray technologies and contemporary bioinformatics methods on a dataset derived from whole blood in 99 untreated MS patients (36 Relapse Remitting MS, 43 Primary Progressive MS, and 20 Secondary Progressive MS) and 45 age-matched healthy controls. METHODOLOGY/PRINCIPAL FINDINGS: We have used two different analytical methodologies: a non-standard differential expression analysis and a differential co-expression analysis, which have converged on a significant number of regulatory motifs that are statistically overrepresented in genes that are either differentially expressed (or differentially co-expressed) in cases and controls (e.g., V$KROX_Q6, p-value <3.31E-6; V$CREBP1_Q2, p-value <9.93E-6, V$YY1_02, p-value <1.65E-5). CONCLUSIONS/SIGNIFICANCE: Our analysis uncovered a network of transcription factors that potentially dysregulate several genes in MS or one or more of its disease subtypes. The most significant transcription factor motifs were for the Early Growth Response EGR/KROX family, ATF2, YY1 (Yin and Yang 1), E2F-1/DP-1 and E2F-4/DP-2 heterodimers, SOX5, and CREB and ATF families. These transcription factors are involved in early T-lymphocyte specification and commitment as well as in oligodendrocyte dedifferentiation and development, both pathways that have significant biological plausibility in MS causation.


Subject(s)
Gene Expression Profiling , Genome-Wide Association Study , Multiple Sclerosis/blood , RNA, Messenger/metabolism , Transcription Factors/metabolism , Adolescent , Adult , Aged , Aged, 80 and over , Case-Control Studies , Cohort Studies , Female , Humans , Male , Middle Aged , Multiple Sclerosis/metabolism , Oligodendroglia/cytology
16.
Radiother Oncol ; 90(3): 400-7, 2009 Mar.
Article in English | MEDLINE | ID: mdl-18952309

ABSTRACT

PURPOSE: We sought to categorize longitudinal radiation-induced rectal toxicity data obtained from men participating in a randomised controlled trial for locally advanced prostate cancer. MATERIALS AND METHODS: Data from self-assessed questionnaires of rectal symptoms and clinician recorded remedial interventions were collected during the TROG 96.01 trial. In this trial, volunteers were randomised to radiation with or without neoadjuvant androgen deprivation. Characterization of longitudinal variations in symptom intensity was achieved using prevalence data. An integrated visualization and clustering approach based on memetic algorithms was used to define the compositions of symptom clusters occurring before, during and after radiation. The utility of the CTC grading system as a means of identifying specific injury profiles was evaluated using concordance analyses. RESULTS: Seven well-defined clusters of rectal symptoms were present prior to treatment, 25 were seen immediately following radiation and 7 at years 1, 2 and 3 following radiation. CTC grading did not concord with the degree of rectal 'distress' and 'problems' at all time points. Concordance was not improved by adding urgency to the CTC scale. CONCLUSIONS: The CTC scale has serious shortcomings. A powerful new technique for non-hierarchical clustering may contribute to the categorization of rectal toxicity data for genomic profiling studies and detailed patho-physiological studies.


Subject(s)
Proctitis/etiology , Prostatic Neoplasms/radiotherapy , Radiation Injuries/etiology , Androgen Antagonists/therapeutic use , Humans , Male , Neoadjuvant Therapy , Prospective Studies , Radiotherapy/adverse effects , Syndrome