Search | VHL Regional Portal

1.

Testing extinction events and temporal shifts in diversification and fossilization rates through the skyline Fossilized Birth-Death (FBD) model: The example of some mid-Permian synapsid extinctions.

Didier, Gilles; Laurin, Michel.

Cladistics ; 40(3): 282-306, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38651531

ABSTRACT

In the last decade, the Fossilized Birth-Death (FBD) process has yielded interesting clues about the evolution of biodiversity through time. To facilitate such studies, we extend our method to compute the probability density of phylogenetic trees of extant and extinct taxa in which the only temporal information is provided by the fossil ages (i.e. without the divergence times) in order to deal with the piecewise constant FBD process, known as the "skyline FBD", which allows rates to change between pre-defined time intervals, as well as modelling extinction events at the bounds of these intervals. We develop approaches based on this method to assess hypotheses about the diversification process and to answer questions such as "Does a mass extinction occur at this time?" or "Is there a change in the fossilization rate between two given periods?". Our software can also yield Bayesian and maximum-likelihood estimates of the parameters of the skyline FBD model under various constraints. These approaches are applied to a simulated dataset in order to test their ability to answer the questions above. Finally, we study an updated dataset of Permo-Carboniferous synapsids to get additional insights into the dynamics of biodiversity change in three clades (Ophiacodontidae, Edaphosauridae and Sphenacodontidae) in the Pennsylvanian (Late Carboniferous) and Cisuralian (Early Permian), and to assess support for end-Sakmarian (or Artinskian) and end-Cisuralian mass extinction events discussed in previous studies.

Subject(s)

Biodiversity , Extinction, Biological , Fossils , Phylogeny , Animals , Biological Evolution , Bayes Theorem , Computer Simulation

2.

The Cauchy Process on Phylogenies: A Tractable Model for Pulsed Evolution.

Bastide, Paul; Didier, Gilles.

Syst Biol ; 72(6): 1296-1315, 2023 Dec 30.

Article in English | MEDLINE | ID: mdl-37603537

ABSTRACT

Phylogenetic comparative methods use random processes, such as the Brownian Motion, to model the evolution of continuous traits on phylogenetic trees. Growing evidence for non-gradual evolution motivated the development of complex models, often based on Lévy processes. However, their statistical inference is computationally intensive and currently relies on approximations, high-dimensional sampling, or numerical integration. We consider here the Cauchy Process (CP), a particular pure-jump Lévy process in which the trait increment along each branch follows a centered Cauchy distribution with a dispersion proportional to its length. In this work, we derive an exact algorithm to compute both the joint probability density of the tip trait values of a phylogeny under a CP and the ancestral trait values and branch increments posterior densities in quadratic time. A simulation study shows that the CP generates patterns in comparative data that are distinct from any Gaussian process, and that restricted maximum likelihood parameter estimates and root trait reconstruction are unbiased and accurate for trees with 200 tips or less. The CP has only two parameters but is rich enough to capture complex-pulsed evolution. It can reconstruct posterior ancestral trait distributions that are multimodal, reflecting the uncertainty associated with the inference of the evolutionary history of a trait from extant taxa only. Applied on empirical datasets taken from the Evolutionary Ecology and Virology literature, the CP suggests nuanced scenarios for the body size evolution of Greater Antilles Lizards and for the geographical spread of the West Nile Virus epidemics in North America, both consistent with previous studies using more complex models. The method is efficiently implemented in C with an R interface in package cauphy, which is open source and freely available online.

Subject(s)

Lizards , Animals , Phylogeny , Computer Simulation , Likelihood Functions , Phenotype , Lizards/genetics

3.

Editorial: Timetrees: Incorporating fossils and molecules.

Laurin, Michel; Didier, Gilles; Warnock, Rachel C M.

Front Genet ; 13: 937763, 2022.

Article in English | MEDLINE | ID: mdl-35846136

4.

Distributions of extinction times from fossil ages and tree topologies: the example of mid-Permian synapsid extinctions.

Didier, Gilles; Laurin, Michel.

PeerJ ; 9: e12577, 2021.

Article in English | MEDLINE | ID: mdl-34966586

ABSTRACT

Given a phylogenetic tree that includes only extinct, or a mix of extinct and extant taxa, where at least some fossil data are available, we present a method to compute the distribution of the extinction time of a given set of taxa under the Fossilized-Birth-Death model. Our approach differs from the previous ones in that it takes into account (i) the possibility that the taxa or the clade considered may diversify before going extinct and (ii) the whole phylogenetic tree to estimate extinction times, whilst previous methods do not consider the diversification process and deal with each branch independently. Because of this, our method can estimate extinction times of lineages represented by a single fossil, provided that they belong to a clade that includes other fossil occurrences. We assess and compare our new approach with a standard previous one using simulated data. Results show that our method provides more accurate confidence intervals. This new approach is applied to the study of the extinction time of three Permo-Carboniferous synapsid taxa (Ophiacodontidae, Edaphosauridae, and Sphenacodontidae) that are thought to have disappeared toward the end of the Cisuralian (early Permian), or possibly shortly thereafter. The timing of extinctions of these three taxa and of their component lineages supports the idea that the biological crisis in the late Kungurian/early Roadian consisted of a progressive decline in biodiversity throughout the Kungurian.

5.

Exact Distribution of Divergence Times from Fossil Ages and Tree Topologies.

Didier, Gilles; Laurin, Michel.

Syst Biol ; 69(6): 1068-1087, 2020 11 01.

Article in English | MEDLINE | ID: mdl-32191326

ABSTRACT

Being given a phylogenetic tree of both extant and extinct taxa in which the fossil ages are the only temporal information (namely, in which divergence times are considered unknown), we provide a method to compute the exact probability distribution of any divergence time of the tree with regard to any speciation (cladogenesis), extinction, and fossilization rates under the Fossilized Birth-Death model. We use this new method to obtain a probability distribution for the age of Amniota (the synapsid/sauropsid or bird/mammal divergence), one of the most-frequently used dating constraints. Our results suggest an older age (between about 322 and 340 Ma) than has been assumed by most studies that have used this constraint (which typically assumed a best estimate around 310-315 Ma) and provide, for the first time, a method to compute the shape of the probability density for this divergence time. [Divergence times; fossil ages; fossilized birth-death model; probability distribution.].

Subject(s)

Fossils , Phylogeny , Time , Animals , Extinction, Biological , Models, Biological

6.

Testing for correlation between traits under directional evolution.

Royer-Carenzi, Manuela; Didier, Gilles.

J Theor Biol ; 482: 109982, 2019 12 07.

Article in English | MEDLINE | ID: mdl-31446022

ABSTRACT

Being confounding factors, directional trends are likely to make two quantitative traits appear as spuriously correlated. By determining the probability distributions of independent contrasts when traits evolve following Brownian motions with linear trends, we show that the standard independent contrasts can not be used to test for correlation in this situation. We propose a multiple regression approach which corrects the bias caused by directional evolution. We show that our approach is equivalent to performing a Phylogenetic Generalized Least Squares (PGLS) analysis with tip times as covariables by providing a new and more general proof of the equivalence between PGLS and independent contrasts methods. Our approach is assessed and compared with three previous correlation tests on data simulated in various situations and overall outperforms all the other methods. The approach is next illustrated on a real dataset to test for correlation between hominin cranial capacity and body mass.

Subject(s)

Biological Evolution , Models, Genetic , Multifactorial Inheritance/physiology , Selection, Genetic/physiology , Algorithms , Animals , Body Weight/genetics , Computer Simulation , Humans , Least-Squares Analysis , Organ Size/genetics , Phenotype , Probability , Skull/anatomy & histology

7.

Parsimony-based test for identifying changes in evolutionary trends for quantitative characters: implications for the origin of the amniotic egg.

Didier, Gilles; Chabrol, Olivier; Laurin, Michel.

Cladistics ; 35(5): 576-599, 2019 Oct.

Article in English | MEDLINE | ID: mdl-34618939

ABSTRACT

The origin of the amniotic egg was a major event in vertebrate evolution and is thought to have contributed to the spectacular evolutionary radiation of amniotes. We test one of the most popular scenarios proposed by Carroll in 1970 to explain the origin of the amniotic egg using a novel method based on an asymmetric version of linear parsimony (aka Wagner parsimony) for identifying the most parsimonious split of a tree into two parts between which the evolution of the character is allowed to differ. The new method evaluates the cost of splitting a phylogenetic tree at a given node as the integral, over all pairs of asymmetry parameters, of the most parsimonious costs that can be achieved by using the first parameter on the subtree pending from this node and the second parameter elsewhere. By testing all the nodes, we then obtain the most parsimonious split of a tree with regard to the character values at its tips. Among the nine trees and two characters tested, our method yields a total of 517 parsimonious trend changes in Permo-Carboniferous stegocephalians, a single one of which occurs in a part of the tree (among stem-amniotes) where Carroll's scenario predicts that there should have been distinct changes in body size evolutionary trends. This refutes the scenario because the amniote stem does not appear to have elevated rates of evolutionary trend shifts. Our nodal body size estimates offer less discriminating power, but they likewise fail to find strong support for Carroll's scenario.

8.

Identifying communities from multiplex biological networks by randomized optimization of modularity.

Didier, Gilles; Valdeolivas, Alberto; Baudot, Anaïs.

F1000Res ; 7: 1042, 2018.

Article in English | MEDLINE | ID: mdl-30210790

ABSTRACT

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. We implemented here several extensions of the MolTi software that detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p -value threshold. However, the randomization procedure, as well as the consideration of weighted edges and layers generally increases the number of trait and disease community detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.

Subject(s)

Algorithms , Communicable Diseases/genetics , Quantitative Trait, Heritable , Software , Cluster Analysis , Humans , Random Allocation

9.

Time-Dependent-Asymmetric-Linear-Parsimonious Ancestral State Reconstruction.

Didier, Gilles.

Bull Math Biol ; 79(10): 2334-2355, 2017 Oct.

Article in English | MEDLINE | ID: mdl-28819749

ABSTRACT

The time-dependent-asymmetric-linear parsimony is an ancestral state reconstruction method which extends the standard linear parsimony (a.k.a. Wagner parsimony) approach by taking into account both branch lengths and asymmetric evolutionary costs for reconstructing quantitative characters (asymmetric costs amount to assuming an evolutionary trend toward the direction with the lowest cost). A formal study of the influence of the asymmetry parameter shows that the time-dependent-asymmetric-linear parsimony infers states which are all taken among the known states, except for some degenerate cases corresponding to special values of the asymmetry parameter. This remarkable property holds in particular for the Wagner parsimony. This study leads to a polynomial algorithm which determines, and provides a compact representation of, the parametric reconstruction of a phylogenetic tree, that is for all the unknown nodes, the set of all the possible reconstructed states associated with the asymmetry parameters leading to them. The time-dependent-asymmetric-linear parsimony is finally illustrated with the parametric reconstruction of the body size of cetaceans.

Subject(s)

Biological Evolution , Models, Biological , Algorithms , Animals , Body Size , Cetacea/anatomy & histology , Cetacea/classification , Linear Models , Mathematical Concepts , Phylogeny , Time Factors

10.

Likelihood of Tree Topologies with Fossils and Diversification Rate Estimation.

Didier, Gilles; Fau, Marine; Laurin, Michel.

Syst Biol ; 66(6): 964-987, 2017 Nov 01.

Article in English | MEDLINE | ID: mdl-28431159

ABSTRACT

Since the diversification process cannot be directly observed at the human scale, it has to be studied from the information available, namely the extant taxa and the fossil record. In this sense, phylogenetic trees including both extant taxa and fossils are the most complete representations of the diversification process that one can get. Such phylogenetic trees can be reconstructed from molecular and morphological data, to some extent. Among the temporal information of such phylogenetic trees, fossil ages are by far the most precisely known (divergence times are inferences calibrated mostly with fossils). We propose here a method to compute the likelihood of a phylogenetic tree with fossils in which the only considered time information is the fossil ages, and apply it to the estimation of the diversification rates from such data. Since it is required in our computation, we provide a method for determining the probability of a tree topology under the standard diversification model. Testing our approach on simulated data shows that the maximum likelihood rate estimates from the phylogenetic tree topology and the fossil dates are almost as accurate as those obtained by taking into account all the data, including the divergence times. Moreover, they are substantially more accurate than the estimates obtained only from the exact divergence times (without taking into account the fossil record). We also provide an empirical example composed of 50 Permo-Carboniferous eupelycosaur (early synapsid) taxa ranging in age from about 315 Ma (Late Carboniferous) to 270 Ma (shortly after the end of the Early Permian). Our analyses suggest a speciation (cladogenesis, or birth) rate of about 0.1 per lineage and per myr, a marginally lower extinction rate, and a considerable hidden paleobiodiversity of early synapsids. [Extinction rate; fossil ages; maximum likelihood estimation; speciation rate.].

Subject(s)

Fossils , Genetic Speciation , Phylogeny , Animals , Computer Simulation , Reptiles/classification , Time

11.

A comparison of ancestral state reconstruction methods for quantitative characters.

Royer-Carenzi, Manuela; Didier, Gilles.

J Theor Biol ; 404: 126-142, 2016 09 07.

Article in English | MEDLINE | ID: mdl-27234644

ABSTRACT

Choosing an ancestral state reconstruction method among the alternatives available for quantitative characters may be puzzling. We present here a comparison of seven of them, namely the maximum likelihood, restricted maximum likelihood, generalized least squares under Brownian, Brownian-with-trend and Ornstein-Uhlenbeck models, phylogenetic independent contrasts and squared parsimony methods. A review of the relations between these methods shows that the maximum likelihood, the restricted maximum likelihood and the generalized least squares under Brownian model infer the same ancestral states and can only be distinguished by the distributions accounting for the reconstruction uncertainty which they provide. The respective accuracy of the methods is assessed over character evolution simulated under a Brownian motion with (and without) directional or stabilizing selection. We give the general form of ancestral state distributions conditioned on leaf states under the simulation models. Ancestral distributions are used first, to give a theoretical lower bound of the expected reconstruction error, and second, to develop an original evaluation scheme which is more efficient than comparing the reconstructed and the simulated states. Our simulations show that: (i) the distributions of the reconstruction uncertainty provided by the methods generally make sense (some more than others); (ii) it is essential to detect the presence of an evolutionary trend and to choose a reconstruction method accordingly; (iii) all the methods show good performances on characters under stabilizing selection; (iv) without trend or stabilizing selection, the maximum likelihood method is generally the most accurate.

Subject(s)

Phylogeny , Computer Simulation , Models, Biological

12.

Identifying communities from multiplex biological networks.

Didier, Gilles; Brun, Christine; Baudot, Anaïs.

PeerJ ; 3: e1525, 2015.

Article in English | MEDLINE | ID: mdl-26713261

ABSTRACT

Various biological networks can be constructed, each featuring gene/protein relationships of different meanings (e.g., protein interactions or gene co-expression). However, this diversity is classically not considered and the different interaction categories are usually aggregated in a single network. The multiplex framework, where biological relationships are represented by different network layers reflecting the various nature of interactions, is expected to retain more information. Here we assessed aggregation, consensus and multiplex-modularity approaches to detect communities from multiple network sources. By simulating random networks, we demonstrated that the multiplex-modularity method outperforms the aggregation and consensus approaches when network layers are incomplete or heterogeneous in density. Application to a multiplex biological network containing 4 layers of physical or functional interactions allowed recovering communities more accurately annotated than their aggregated counterparts. Overall, taking into account the multiplexity of biological networks leads to better-defined functional modules. A user-friendly graphical software to detect communities from multiplex networks, and corresponding C source codes, are available at GitHub (https://github.com/gilles-didier/MolTi).

13.

Choosing the best ancestral character state reconstruction method.

Royer-Carenzi, Manuela; Pontarotti, Pierre; Didier, Gilles.

Math Biosci ; 242(1): 95-109, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23276531

ABSTRACT

Despite its intrinsic difficulty, ancestral character state reconstruction is an essential tool for testing evolutionary hypothesis. Two major classes of approaches to this question can be distinguished: parsimony- or likelihood-based approaches. We focus here on the second class of methods, more specifically on approaches based on continuous-time Markov modeling of character evolution. Among them, we consider the most-likely-ancestor reconstruction, the posterior-probability reconstruction, the likelihood-ratio method, and the Bayesian approach. We discuss and compare the above-mentioned methods over several phylogenetic trees, adding the maximum-parsimony method performance in the comparison. Under the assumption that the character evolves according a continuous-time Markov process, we compute and compare the expectations of success of each method for a broad range of model parameter values. Moreover, we show how the knowledge of the evolution model parameters allows to compute upper bounds of reconstruction performances, which are provided as references. The results of all these reconstruction methods are quite close one to another, and the expectations of success are not so far from their theoretical upper bounds. But the performance ranking heavily depends on the topology of the studied tree, on the ancestral node that is to be inferred and on the parameter values. Consequently, we propose a protocol providing for each parameter value the best method in terms of expectation of success, with regard to the phylogenetic tree and the ancestral node to infer.

Subject(s)

Biological Evolution , Likelihood Functions , Models, Genetic , Foraminifera/genetics , Markov Chains , Phylogeny

14.

The reconstructed evolutionary process with the fossil record.

Didier, Gilles; Royer-Carenzi, Manuela; Laurin, Michel.

J Theor Biol ; 315: 26-37, 2012 Dec 21.

Article in English | MEDLINE | ID: mdl-22982290

ABSTRACT

Using the fossil record yields more detailed reconstructions of the evolutionary process than what is obtained from contemporary lineages only. In this work, we present a stochastic process modeling not only speciation and extinction, but also fossil finds. Next, we derive an explicit formula for the likelihood of a reconstructed phylogeny with fossils, which can be used to estimate the speciation and extinction rates. Finally, we provide a comparative simulation-based evaluation of the accuracy of estimations of these rates from complete phylogenies (including extinct lineages), from reconstructions with contemporary lineages only and from reconstructions with contemporary lineages and the fossil record. Results show that taking the fossil record into account yields more accurate estimates of speciation and extinction rates than considering only contemporary lineages.

Subject(s)

Biological Evolution , Fossils , Computer Simulation , Extinction, Biological , Likelihood Functions , Phylogeny , Time Factors

15.

Mapping multivalued onto Boolean dynamics.

Didier, Gilles; Remy, Elisabeth; Chaouiya, Claudine.

J Theor Biol ; 270(1): 177-84, 2011 Feb 07.

Article in English | MEDLINE | ID: mdl-20868697

ABSTRACT

This paper deals with the generalized logical framework defined by René Thomas in the 70's to qualitatively represent the dynamics of regulatory networks. In this formalism, a regulatory network is represented as a graph, where nodes denote regulatory components (basically genes) and edges denote regulations between these components. Discrete variables are associated to regulatory components accounting for their levels of expression. In most cases, Boolean variables are enough, but some situations may require further values. Despite this fact, the majority of tools dedicated to the analysis of logical models are restricted to the Boolean case. A formal Boolean mapping of multivalued logical models is a natural way of extending the applicability of these tools. Three decades ago, a multivalued to Boolean variable mapping was proposed by P. Van Ham. Since then, all works related to multivalued logical models and using a Boolean representation rely on this particular mapping. We formally show in this paper that this mapping is actually the sole, up to cosmetic changes, that could preserve the regulatory structures of the underlying graphs as well as their dynamical behaviours.

Subject(s)

Gene Regulatory Networks/physiology , Models, Biological , Algorithms , Cell Nucleus/metabolism , Cytoplasm/metabolism , Proto-Oncogene Proteins c-mdm2/metabolism , Tumor Suppressor Protein p53/metabolism

16.

Parametric maximum parsimonious reconstruction on trees.

Didier, Gilles.

Bull Math Biol ; 73(7): 1477-502, 2011 Jul.

Article in English | MEDLINE | ID: mdl-20737226

ABSTRACT

We give a formal study of the relationships between the transition cost parameters and the generalized maximum parsimonious reconstructions of unknown (ancestral) binary character states {0,1} over a phylogenetic tree. As a main result, we show there are two thresholds λ¹n and λ°n , generally confounded, associated to each node n of the phylogenetic tree and such that there exists a maximum parsimonious reconstruction associating state 1 to n (resp. state 0 to n) if the ratio "10-cost"/"01-cost" is smaller than λ¹n (resp. greater than λ°n). We propose a dynamic programming algorithm computing these thresholds in a quadratic time with the size of tree.We briefly illustrate some possible applications of this work over a biological dataset. In particular, the thresholds provide a natural way to quantify the degree of support for states reconstructed as well as to determine what kind of evolutionary assumptions in terms of costs are necessary to a given reconstruction.

Subject(s)

Algorithms , Numerical Analysis, Computer-Assisted , Phylogeny

17.

MS4--Multi-Scale Selector of Sequence Signatures: an alignment-free method for classification of biological sequences.

Corel, Eduardo; Pitschi, Florian; Laprevotte, Ivan; Grasseau, Gilles; Didier, Gilles; Devauchelle, Claudine.

BMC Bioinformatics ; 11: 406, 2010 Jul 30.

Article in English | MEDLINE | ID: mdl-20673356

ABSTRACT

BACKGROUND: While multiple alignment is the first step of usual classification schemes for biological sequences, alignment-free methods are being increasingly used as alternatives when multiple alignments fail. Subword-based combinatorial methods are popular for their low algorithmic complexity (suffix trees ...) or exhaustivity (motif search), in general with fixed length word and/or number of mismatches. We developed previously a method to detect local similarities (the N-local decoding) based on the occurrences of repeated subwords of fixed length, which does not impose a fixed number of mismatches. The resulting similarities are, for some "good" values of N, sufficiently relevant to form the basis of a reliable alignment-free classification. The aim of this paper is to develop a method that uses the similarities detected by N-local decoding while not imposing a fixed value of N. We present a procedure that selects for every position in the sequences an adaptive value of N, and we implement it as the MS4 classification tool. RESULTS: Among the equivalence classes produced by the N-local decodings for all N, we select a (relatively) small number of "relevant" classes corresponding to variable length subwords that carry enough information to perform the classification. The parameter N, for which correct values are data-dependent and thus hard to guess, is here replaced by the average repetitivity kappa of the sequences. We show that our approach yields classifications of several sets of HIV/SIV sequences that agree with the accepted taxonomy, even on usually discarded repetitive regions (like the non-coding part of LTR). CONCLUSIONS: The method MS4 satisfactorily classifies a set of sequences that are notoriously hard to align. This suggests that our approach forms the basis of a reliable alignment-free classification tool. The only parameter kappa of MS4 seems to give reasonable results even for its default value, which can be a great advantage for sequence sets for which little information is available.

Subject(s)

Classification/methods , Computational Biology/methods , Software , Algorithms , Amino Acid Sequence , Base Sequence , Genes, nef , Genome, Viral , HIV/classification , HIV/genetics , HIV Long Terminal Repeat , Simian Immunodeficiency Virus/classification , Simian Immunodeficiency Virus/genetics

18.

TranscriptomeBrowser: a powerful and flexible toolbox to explore productively the transcriptional landscape of the Gene Expression Omnibus database.

Lopez, Fabrice; Textoris, Julien; Bergon, Aurélie; Didier, Gilles; Remy, Elisabeth; Granjeaud, Samuel; Imbert, Jean; Nguyen, Catherine; Puthier, Denis.

PLoS One ; 3(12): e4001, 2008.

Article in English | MEDLINE | ID: mdl-19104654

ABSTRACT

BACKGROUND: As public microarray repositories are constantly growing, we are facing the challenge of designing strategies to provide productive access to the available data. METHODOLOGY: We used a modified version of the Markov clustering algorithm to systematically extract clusters of co-regulated genes from hundreds of microarray datasets stored in the Gene Expression Omnibus database (n = 1,484). This approach led to the definition of 18,250 transcriptional signatures (TS) that were tested for functional enrichment using the DAVID knowledgebase. Over-representation of functional terms was found in a large proportion of these TS (84%). We developed a JAVA application, TBrowser that comes with an open plug-in architecture and whose interface implements a highly sophisticated search engine supporting several Boolean operators (http://tagc.univ-mrs.fr/tbrowser/). User can search and analyze TS containing a list of identifiers (gene symbols or AffyIDs) or associated with a set of functional terms. CONCLUSIONS/SIGNIFICANCE: As proof of principle, TBrowser was used to define breast cancer cell specific genes and to detect chromosomal abnormalities in tumors. Finally, taking advantage of our large collection of transcriptional signatures, we constructed a comprehensive map that summarizes gene-gene co-regulations observed through all the experiments performed on HGU133A Affymetrix platform. We provide evidences that this map can extend our knowledge of cellular signaling pathways.

Subject(s)

Databases, Genetic , Gene Expression Profiling , Gene Expression/physiology , Software , User-Computer Interface , Algorithms , Animals , Cluster Analysis , Efficiency , Gene Regulatory Networks/physiology , Humans , Meta-Analysis as Topic , Mice , Oligonucleotide Array Sequence Analysis , Rats , Signal Transduction/genetics

19.

Mapping sequences by parts.

Didier, Gilles; Guziolowski, Carito.

Algorithms Mol Biol ; 2: 11, 2007 Sep 19.

Article in English | MEDLINE | ID: mdl-17880695

ABSTRACT

BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. RESULTS: We introduce an algorithm computing an optimal N-map with time complexity O (|s| x |t| x N) using O (|s| x |t| x N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. PRACTICAL APPLICATION: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.

20.

Comparing sequences without using alignments: application to HIV/SIV subtyping.

Didier, Gilles; Debomy, Laurent; Pupin, Maude; Zhang, Ming; Grossmann, Alexander; Devauchelle, Claudine; Laprevotte, Ivan.

BMC Bioinformatics ; 8: 1, 2007 Jan 02.

Article in English | MEDLINE | ID: mdl-17199892

ABSTRACT

BACKGROUND: In general, the construction of trees is based on sequence alignments. This procedure, however, leads to loss of informationwhen parts of sequence alignments (for instance ambiguous regions) are deleted before tree building. To overcome this difficulty, one of us previously introduced a new and rapid algorithm that calculates dissimilarity matrices between sequences without preliminary alignment. RESULTS: In this paper, HIV (Human Immunodeficiency Virus) and SIV (Simian Immunodeficiency Virus) sequence data are used to evaluate this method. The program produces tree topologies that are identical to those obtained by a combination of standard methods detailed in the HIV Sequence Compendium. Manual alignment editing is not necessary at any stage. Furthermore, only one user-specified parameter is needed for constructing trees. CONCLUSION: The extensive tests on HIV/SIV subtyping showed that the virus classifications produced by our method are in good agreement with our best taxonomic knowledge, even in non-coding LTR (Long Terminal Repeat) regions that are not tractable by regular alignment methods due to frequent duplications/insertions/deletions. Our method, however, is not limited to the HIV/SIV subtyping. It provides an alternative tree construction without a time-consuming aligning procedure.

Subject(s)

HIV/classification , HIV/genetics , Sequence Alignment/methods , Serotyping/methods , Simian Immunodeficiency Virus/classification , Simian Immunodeficiency Virus/genetics , Animals , Base Sequence/genetics , Humans , Molecular Sequence Data

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL