Results 1 - 20 of 24
1.
IET Syst Biol ; 3(4): 266-78, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19640165

ABSTRACT

Identification of interaction patterns in complex networks via community structures has attracted considerable attention in recent research. Local community structures provide a better measure to understand and visualise the nature of interaction when global knowledge of the network is unavailable. Existing approaches to local community structures, however, cannot adapt to dynamic networks and depend heavily on the position of the source vertex. In this study the authors propose a novel approach to identify local communities based on iterative agglomeration and local optimisation. The proposed solution has two significant improvements: (i) in each iteration, agglomeration strengthens the local community measure by selecting the best possible set of vertices, and (ii) the proposed vertex and community rank criteria are suitable for dynamic networks where the interactions among vertices may change over time. To evaluate the proposed algorithm, extensive experiments and benchmarking on computer-generated networks as well as real-world social and biological networks have been conducted. The experimental results show that the proposed algorithm can identify local communities, irrespective of the source vertex position, with more than 92% accuracy in both synthetic and real-world networks.


Subject(s)
Algorithms , Models, Biological , Protein Interaction Mapping/methods , Proteome/metabolism , Signal Transduction/physiology , Computer Simulation , Models, Statistical
2.
IET Syst Biol ; 1(5): 286-91, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17907677

ABSTRACT

Here, the reliability of a recent approach that uses parameterised linear programming to detect community structures in networks has been investigated. Using a one-parameter family of objective functions, a number of "perturbation experiments" document that our approach works rather well. A real-life network and a family of benchmark networks are also analysed.


Subject(s)
Algorithms , Models, Biological , Population Dynamics , Proteome/metabolism , Signal Transduction/physiology , Social Support , Animals , Computer Simulation , Humans , Numerical Analysis, Computer-Assisted , Programming, Linear , Reproducibility of Results , Sensitivity and Specificity
3.
Mol Biol Evol ; 19(12): 2051-9, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12446797

ABSTRACT

A method is described that allows the assessment of treelikeness of phylogenetic distance data before tree estimation. This method is related to statistical geometry as introduced by Eigen, Winkler-Oswatitsch, and Dress (1988 [Proc. Natl. Acad. Sci. USA. 85:5913-5917]), and in essence, displays a measure for treelikeness of quartets in terms of a histogram that we call a delta plot. This allows identification of nontreelike data and analysis of noisy data sets arising from processes such as, for example, parallel evolution, recombination, or lateral gene transfer. In addition to an overall assessment of treelikeness, individual taxa can be ranked by reference to the treelikeness of the quartets to which they belong. Removal of taxa on the basis of this ranking results in an increase in accuracy of tree estimation. Recombinant data sets are simulated, and the method is shown to be capable of identifying single recombinant taxa on the basis of distance information alone, provided the parents of the recombinant sequence are sufficiently divergent and the mixture of tree histories is not strongly skewed toward a single tree. Delta plots and taxon rankings are applied to three biological data sets using distances derived from sequence alignment, gene order, and fragment length polymorphism.
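
A minimal Python sketch of the quartet measure behind the delta plot described in the abstract above. The abstract does not state the formula; the definition used here (the gap between the two largest of the three pairwise distance sums, divided by the gap between the largest and the smallest) is the one usually associated with delta plots and should be read as an assumption. The function names and the nested-list distance matrix are illustrative only.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta value of one quartet, given a symmetric distance matrix d.

    0 means the four taxa fit a tree exactly; values near 1 indicate
    strongly non-treelike distances.
    """
    sums = sorted([d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]])
    smallest, middle, largest = sums
    if largest == smallest:  # all three sums equal: treat as treelike
        return 0.0
    return (largest - middle) / (largest - smallest)


def delta_values(d):
    """Delta values of all quartets; a histogram of these is the delta plot."""
    return [quartet_delta(d, *q) for q in combinations(range(len(d)), 4)]


def taxon_mean_delta(d):
    """Mean delta over the quartets containing each taxon (for ranking taxa)."""
    n = len(d)
    totals, counts = [0.0] * n, [0] * n
    for q in combinations(range(n), 4):
        value = quartet_delta(d, *q)
        for taxon in q:
            totals[taxon] += value
            counts[taxon] += 1
    return [totals[t] / counts[t] if counts[t] else 0.0 for t in range(n)]
```

Taxa with a high mean delta are the candidates for removal under the ranking idea in the abstract; how many to remove is a choice this sketch does not make.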


Subject(s)
Models, Genetic , Phylogeny , Polymorphism, Restriction Fragment Length , Recombination, Genetic
4.
Mol Biol Evol ; 18(8): 1502-11, 2001 Aug.
Article in English | MEDLINE | ID: mdl-11470841

ABSTRACT

Phylogenetic analyses of 110 serpin protein sequences revealed clades consistent with independent phylogenetic analyses based on exon-intron structure and diagnostic amino acid sites. Trees were estimated by maximum likelihood, neighbor joining, and partial split decomposition using both the BLOSUM 62 and Jones-Taylor-Thornton substitution matrices. Neighbor-joining trees gave results closest to those based on independent analyses using genomic and chromosomal data. The maximum-likelihood trees derived using the quartet puzzling algorithm were very conservative, producing many small clades that separated groups of proteins that other results suggest were related. Independent analyses based on exon-intron structure suggested that a neighbor-joining tree was more accurate than maximum-likelihood trees obtained using the quartet puzzling algorithm.


Subject(s)
Amino Acids/genetics , Phylogeny , Serpins/genetics , Animals , Databases, Factual , Evolution, Molecular , Exons , Genes/genetics , Genetic Variation , Humans , Introns
5.
Mol Phylogenet Evol ; 19(2): 302-10, 2001 May.
Article in English | MEDLINE | ID: mdl-11341811

ABSTRACT

Observations from molecular marker studies on recently diverged species indicate that substitution patterns in DNA sequences can often be complex and poorly described by tree-like bifurcating evolutionary models. These observations might result from processes of species diversification and/or processes of sequence evolution that are not tree-like. In these cases, bifurcating tree representations provide poor visualization of phylogenetic signals in sequence data. In this paper, we use median networks to study DNA sequence substitution patterns in plant nuclear and chloroplast markers. We describe how to prune median networks to obtain so called pruned median networks. These simpler networks may help to provide a useful framework for investigating the phylogenetic complexity of recently diverged taxa with hybrid origins.


Subject(s)
DNA, Plant/genetics , Phylogeny , Base Sequence , Cell Nucleus/genetics , Chloroplasts/genetics , Evolution, Molecular , Models, Genetic , Molecular Sequence Data , Plants/genetics , Sequence Homology, Nucleic Acid
6.
Mol Biol Evol ; 18(4): 577-84, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11264410

ABSTRACT

A combination of three independent biological features, genomic organization, diagnostic amino acid sites, and rare indels, was used to elucidate the phylogeny of the vertebrate serpin (serine protease inhibitor) superfamily. A strong correlation between serpin gene families displaying (1) a conserved exon-intron pattern and (2) family-specific combinations of amino acid residues at specific sites suggests that present-day vertebrates encompass six serpin gene families which evolved from primordial genes by massive intron insertion before or during early vertebrate radiation. Introns placed at homologous positions in the gene sequences in combination with diagnostic sequence characters may also constitute a reliable kinship indicator for other protein superfamilies.


Subject(s)
Exons/genetics , Introns/genetics , Serine Proteinase Inhibitors/genetics , Serpins/genetics , Amino Acid Sequence/genetics , Animals , Humans , Models, Statistical , Multigene Family , Phylogeny , Serine Proteinase Inhibitors/classification , Serpins/classification
7.
Mol Biol Evol ; 17(1): 164-78, 2000 Jan.
Article in English | MEDLINE | ID: mdl-10666716

ABSTRACT

An information theoretic approach is used to examine the magnitude and origin of associations among amino acid sites in the basic helix-loop-helix (bHLH) family of transcription factors. Entropy and mutual information values are used to summarize the variability and covariability of amino acids comprising the bHLH domain for 242 sequences. When these quantitative measures are integrated with crystal structure data and summarized using helical wheels, they provide important insights into the evolution of three-dimensional structure in these proteins. We show that amino acid sites in the bHLH domain known to pack against each other have very low entropy values, indicating little residue diversity at these contact sites. Noncontact sites, on the other hand, exhibit significantly larger entropy values, as well as statistically significant levels of mutual information or association among sites. High levels of mutual information indicate significant amounts of intercorrelation among amino acid residues at these various sites. Using computer simulations based on a parametric bootstrap procedure, we are able to partition the observed covariation among various amino acid sites into that arising from phylogenetic (common ancestry) and stochastic causes and those resulting from structural and functional constraints. These results show that a significant amount of the observed covariation among amino acid sites is due to structural/functional constraints, over and above the covariation arising from phylogenetic constraints. These quantitative analyses provide a highly integrated evolutionary picture of the multidimensional dynamics of sequence diversity and protein structure.
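
The entropy and mutual information measures mentioned in the abstract above can be illustrated with a short Python sketch. It assumes plain Shannon entropy in bits over the residues observed in an alignment column, and mutual information computed as H(A) + H(B) - H(A, B); the abstract does not give the exact estimators or any corrections used, so these choices and the function names are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two alignment columns.

    Both columns must list residues for the same sequences in the same order.
    """
    joint = list(zip(col_a, col_b))
    return column_entropy(col_a) + column_entropy(col_b) - column_entropy(joint)
```

A fully conserved column has entropy 0, and two columns that vary independently have mutual information 0. As the abstract notes, covariation inherited from common ancestry would still need to be separated from structural constraints, which this sketch does not attempt.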


Subject(s)
Amino Acids/genetics , Evolution, Molecular , Helix-Loop-Helix Motifs/genetics , Models, Theoretical , Transcription Factors/genetics , Amino Acids/chemistry , Animals , Humans , Transcription Factors/chemistry
9.
J Mol Evol ; 48(5): 501-16, 1999 May.
Article in English | MEDLINE | ID: mdl-10198117

ABSTRACT

Quantitative analyses were carried out on a large number of proteins that contain the highly conserved basic helix-loop-helix domain. Measures derived from information theory were used to examine the extent of conservation at amino acid sites within the bHLH domain as well as the extent of mutual information among sites within the domain. Using the Boltzmann entropy measure, we described the extent of amino acid conservation throughout the bHLH domain. We used position association (pa) statistics that reflect the joint probability of occurrence of events to estimate the "mutual information content" among distinct amino acid sites. Further, we used pa statistics to estimate the extent of association in amino acid composition at each site in the domain and between amino acid composition and variables reflecting clade and group membership, loop length, and the presence of a leucine zipper. The pa values were also used to describe groups of amino acid sites called "cliques" that were highly associated with each other. Finally, a predictive motif was constructed that accurately identifies bHLH domain-containing proteins that belong to Groups A and B.


Subject(s)
DNA-Binding Proteins/genetics , Helix-Loop-Helix Motifs/genetics , Transcription Factors/genetics , Amino Acid Sequence , Animals , Basic Helix-Loop-Helix Transcription Factors , Binding Sites/genetics , Conserved Sequence , DNA/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Evolution, Molecular , Humans , Molecular Sequence Data , Phylogeny , Sequence Homology, Amino Acid , Transcription Factors/chemistry , Transcription Factors/metabolism
10.
Proc Biol Sci ; 266(1433): 2131-6, 1999 Oct 22.
Article in English | MEDLINE | ID: mdl-10902547

ABSTRACT

On the evolutionary trajectory that led to human language there must have been a transition from a fairly limited to an essentially unlimited communication system. The structure of modern human languages reveals at least two steps that are required for such a transition: in all languages (i) a small number of phonemes are used to generate a large number of words; and (ii) a large number of words are used to produce an unlimited number of sentences. The first (and simpler) step is the topic of the current paper. We study the evolution of communication in the presence of errors and show that this limits the number of objects (or concepts) that can be described by a simple communication system. The evolutionary optimum is achieved by using only a small number of signals to describe a few valuable concepts. Adding more signals does not increase the fitness of a language. This represents an error limit for the evolution of communication. We show that this error limit can be overcome by combining signals (phonemes) into words. The transition from an analogue to a digital system was a necessary step toward the evolution of human language.


Subject(s)
Biological Evolution , Language , Models, Theoretical , Humans
11.
Article in English | MEDLINE | ID: mdl-9783216

ABSTRACT

In this paper, we discuss a novel scoring scheme for sequence alignments. The score of an alignment is defined as the sum of so-called weights of aligned segment pairs. A simple modification of the weight function used by the original version of the DIALIGN alignment program turns out to have a crucial advantage: it can be applied to both global and local alignment problems without the need to specify a threshold parameter.


Subject(s)
Sequence Alignment/methods , Software , Amino Acid Sequence , Animals , Artificial Intelligence , Humans , Molecular Sequence Data , Proteins/chemistry , Proteins/genetics , Sequence Alignment/statistics & numerical data , Sequence Homology, Amino Acid
12.
Bioinformatics ; 14(3): 290-4, 1998.
Article in English | MEDLINE | ID: mdl-9614273

ABSTRACT

MOTIVATION: DIALIGN is a new method for pairwise as well as multiple alignment of nucleic acid and protein sequences. While standard alignment programs rely on comparing single residues and imposing gap penalties, DIALIGN constructs alignments by comparing whole segments of the sequences. No gap penalty is employed. This point of view is especially adequate if sequences are not globally related, but share only local similarities, as is the case in genomic DNA sequences and in many protein families. RESULTS: Using four different data sets, we show that DIALIGN is able correctly to align conserved motifs in protein sequences. Alignments produced by DIALIGN are compared systematically to the results of five other alignment programs. AVAILABILITY: DIALIGN is available to the scientific community free of charge for non-commercial use. Executables for various UNIX platforms including LINUX can be downloaded at http://www.gsf.de/biodv/dialign.html CONTACT: werner, morgenstern@gsf.de


Subject(s)
Computational Biology/methods , Sequence Alignment/methods , Algorithms , Amino Acid Sequence , Conserved Sequence , Molecular Sequence Data , Sequence Analysis/methods , Sequence Analysis, DNA/methods , Sequence Homology, Amino Acid , Software Validation
13.
Comput Appl Biosci ; 13(6): 625-6, 1997 Dec.
Article in English | MEDLINE | ID: mdl-9475994

ABSTRACT

MOTIVATION: DCA is a new computer program for multiple sequence alignment which utilizes a 'divide-and-conquer' type of heuristic approach. AVAILABILITY: The algorithm is freely available from http://bibiserv.TechFak.Uni-Bielefeld.DE/dca/.


Subject(s)
Computational Biology/methods , Sequence Alignment/methods , Software , Algorithms , Sequence Analysis/methods
14.
Proc Natl Acad Sci U S A ; 93(22): 12098-103, 1996 Oct 29.
Article in English | MEDLINE | ID: mdl-8901539

ABSTRACT

In this paper, a new way to think about, and to construct, pairwise as well as multiple alignments of DNA and protein sequences is proposed. Rather than forcing alignments to either align single residues or to introduce gaps by defining an alignment as a path running right from the source up to the sink in the associated dot-matrix diagram, we propose to consider alignments as consistent equivalence relations defined on the set of all positions occurring in all sequences under consideration. We also propose constructing alignments from whole segments exhibiting highly significant overall similarity rather than by aligning individual residues. Consequently, we present an alignment algorithm that (i) is based on segment-to-segment comparison instead of the commonly used residue-to-residue comparison and which (ii) avoids the well-known difficulties concerning the choice of appropriate gap penalties: gaps are not treated explicitly, but remain as those parts of the sequences that do not belong to any of the aligned segments. Finally, we discuss the application of our algorithm to two test examples and compare it with commonly used alignment methods. As a first example, we aligned a set of 11 DNA sequences coding for functional helix-loop-helix proteins. Though the sequences show only low overall similarity, our program correctly aligned all of the 11 functional sites, which was a unique result among the methods tested. As a by-product, the reading frames of the sequences were identified. Next, we aligned a set of ribonuclease H proteins and compared our results with alignments produced by other programs as reported by McClure et al. [McClure, M. A., Vasi, T. K. & Fitch, W. M. (1994) Mol. Biol. Evol. 11, 571-592]. Our program was one of the best scoring programs. However, in contrast to other methods, our protein alignments are independent of user-defined parameters.


Subject(s)
DNA/chemistry , Proteins/chemistry , Sequence Alignment/methods , Algorithms , Amino Acid Sequence , Base Sequence , Helix-Loop-Helix Motifs , Molecular Sequence Data , Software
15.
Gene ; 172(1): GC33-41, 1996 Jun 12.
Article in English | MEDLINE | ID: mdl-8654965

ABSTRACT

We have developed a fast heuristic algorithm for multiple sequence alignment which provides near-to-optimal results for sufficiently homologous sequences. The algorithm makes use of the standard dynamic programming procedure by applying it to all pairs of sequences. The resulting score matrices for pair-wise alignment give rise to secondary matrices containing the additional charges imposed by forcing the alignment path to run through a particular vertex. Such a constraint corresponds to slicing the sequences at the positions defining that vertex, and aligning the remaining pairs of prefix and suffix sequences separately. From these secondary matrices, one can compute, for any given family of sequences, suitable positions for cutting all of these sequences simultaneously, thus reducing the problem of aligning a family of n sequences of average length l in a Divide and Conquer fashion to aligning two families of n sequences of approximately half that length. In this paper, we explain the method for the case of 3 sequences in detail, and we demonstrate its potential and its limits by discussing its behaviour for several test families. A generalization for aligning more than 3 sequences is outlined, and some actual alignments constructed by our algorithm for various user-defined parameters are presented.
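
The secondary matrices described in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' program: it uses unit mismatch and gap costs (minimised, as in edit distance) rather than a substitution matrix, sums the three pairwise additional costs without weights, and fixes the cut of the first sequence at its midpoint. All function names are hypothetical.

```python
def forward_costs(a, b, mismatch=1, gap=1):
    """f[i][j] = minimal cost of aligning prefix a[:i] with prefix b[:j]."""
    m, n = len(a), len(b)
    f = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        f[i][0] = i * gap
    for j in range(1, n + 1):
        f[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            f[i][j] = min(f[i - 1][j - 1] + sub, f[i - 1][j] + gap, f[i][j - 1] + gap)
    return f


def additional_costs(a, b, mismatch=1, gap=1):
    """c[i][j] = extra cost of forcing the alignment path through vertex (i, j)."""
    m, n = len(a), len(b)
    fwd = forward_costs(a, b, mismatch, gap)
    # Aligning the reversed sequences gives the suffix costs.
    rev = forward_costs(a[::-1], b[::-1], mismatch, gap)
    optimum = fwd[m][n]
    return [[fwd[i][j] + rev[m - i][n - j] - optimum for j in range(n + 1)]
            for i in range(m + 1)]


def best_cut(s1, s2, s3):
    """Cut positions for three sequences minimising the summed additional cost."""
    c12 = additional_costs(s1, s2)
    c13 = additional_costs(s1, s3)
    c23 = additional_costs(s2, s3)
    c1 = len(s1) // 2  # assumption: cut the first sequence at its midpoint
    candidates = ((c12[c1][c2] + c13[c1][c3] + c23[c2][c3], c2, c3)
                  for c2 in range(len(s2) + 1) for c3 in range(len(s3) + 1))
    cost, c2, c3 = min(candidates)
    return (c1, c2, c3), cost
```

Applying `best_cut` recursively to the prefixes and to the suffixes until the pieces are short enough for an exact method would give the divide-and-conquer scheme; the stopping length is a parameter this sketch leaves out.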


Subject(s)
Sequence Alignment/methods , Algorithms , Amino Acid Sequence , Models, Genetic , Molecular Sequence Data
16.
Article in English | MEDLINE | ID: mdl-7584425

ABSTRACT

We present a report on work in progress on a divide and conquer approach to multiple alignment. The algorithm makes use of the costs calculated from applying the standard dynamic programming scheme to all pairs of sequences. The resulting cost matrices for pairwise alignment give rise to secondary matrices containing the additional costs imposed by fixing the path through the dynamic programming graph at a particular vertex. Such a constraint corresponds to a division of the problem obtained by slicing both sequences between two particular positions, and aligning the two sequences on the left and the two sequences on the right, charging for gaps introduced at the slicing point. To obtain an estimate for the additional cost imposed by forcing the multiple alignment through a particular vertex in the whole hypercube, we will take a (weighted) sum of secondary costs over all pairwise projections of the division of the problem, as defined by this vertex, that is, by slicing all sequences at the points suggested by the vertex. We then use that partition of every single sequence under consideration into two 'halves' which imposes a minimal (weighted) sum of pairwise additional costs, making sure that one of the sequences is divided somewhere close to its midpoint. Hence, each iteration can cut the problem size in half. As the enumeration of all possible partitions may restrict this approach to small-size problems, we eliminate futile partitions, and organize their enumeration in a way that starts with the most promising ones.(ABSTRACT TRUNCATED AT 250 WORDS)


Subject(s)
Algorithms , Amino Acid Sequence , Proteins/chemistry , Sequence Homology, Amino Acid , Sequence Homology , Azurin/analogs & derivatives , Azurin/chemistry , Metalloproteins/chemistry , Molecular Sequence Data , Plant Proteins/chemistry , Plastocyanin/chemistry , Pseudomonas , Software
17.
Proc Natl Acad Sci U S A ; 90(21): 10320-4, 1993 Nov 01.
Article in English | MEDLINE | ID: mdl-8234292

ABSTRACT

A clustering technique allowing a restricted amount of overlapping and based on an abstract theory of coherent decompositions of finite metrics is used to analyze the evolution of foot-and-mouth disease viruses. The emerging picture is compatible with the existence of viral populations with a quasispecies structure and illustrates various forms of evolution of this virus family. In addition, it allows the correlation of these forms with geographic occurrence.


Subject(s)
Aphthovirus/genetics , Biological Evolution , Viruses/genetics , Animals , Aphthovirus/classification , Cattle , Mathematics , Models, Genetic , Monte Carlo Method , Serotyping , Viruses/classification
18.
Mech Dev ; 44(1): 17-31, 1993 Nov.
Article in English | MEDLINE | ID: mdl-8155572

ABSTRACT

Morphological differentiation patterns--among them concentric rings and radial zonations--can be induced in the band-mutant of Neurospora crassa by appropriate experimental conditions, in particular by a mere shift of certain salt concentrations in the medium. The role of initial experimental conditions is examined and, furthermore, the influences of artificially induced phase differences are analyzed with respect to pattern formation. While the concentric ring pattern is due to some (endogenous) circadian rhythmicity within every hypha, nothing is known about the underlying mechanism of radial zonation development. Various hypotheses were tested with the help of a cellular automaton model which mimics growth, branching and differentiation of a fungal mycelium. In particular, sufficient conditions are provided which imply the formation of radial spore zonations. These conditions postulate a rather homogeneous microscopic hyphal branching pattern and induction of spore differentiation by means of an activator-inhibitor system. Furthermore, a working hypothesis for the formation of spore patterns in Neurospora crassa is suggested which is based on an extracellular control of fungal differentiation.


Subject(s)
Computer Simulation , Models, Biological , Neurospora crassa/growth & development , Cell Differentiation/physiology , Culture Media , Darkness , Hydrogen-Ion Concentration , Mathematics , Viscosity
19.
Mol Phylogenet Evol ; 1(3): 242-52, 1992 Sep.
Article in English | MEDLINE | ID: mdl-1342941

ABSTRACT

In order to analyze the structure inherent to a matrix of dissimilarities (such as evolutionary distances) we propose to use a new technique called split decomposition. This method accurately dissects the given dissimilarity measure as a sum of elementary "split" metrics plus a (small) residue. The split summands identify related groups which are susceptible to further interpretation when cast against the available biological information. Reanalysis of previously published ribosomal RNA data sets using split decomposition illustrates the potential of this approach.
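
To make the notion of a weighted "split" summand in the abstract above concrete, the following Python sketch computes the isolation index of a candidate split, the weight that split decomposition theory usually assigns to it. The abstract does not give this formula, so its use here is an assumption, and the function name and matrix layout are illustrative. A split with index greater than 0 would contribute to the sum; an index of 0 means the distances do not support the split.

```python
from itertools import combinations_with_replacement


def isolation_index(d, side_a, side_b):
    """Isolation index of the split side_a | side_b for distance matrix d.

    For every pair (a, a2) from one side and (b, b2) from the other
    (elements may coincide), compare the within-side sum with the two
    cross sums; the index is half the smallest margin found.
    """
    best = float("inf")
    for a, a2 in combinations_with_replacement(side_a, 2):
        for b, b2 in combinations_with_replacement(side_b, 2):
            within = d[a][a2] + d[b][b2]
            cross = max(d[a][b] + d[a2][b2], d[a][b2] + d[a2][b])
            best = min(best, max(within, cross) - within)
    return best / 2
```

For distances that fit a tree exactly, the index of a split induced by an edge equals the length of that edge, which is why the decomposition reduces to the tree in that case.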


Subject(s)
Data Interpretation, Statistical , Genetic Techniques , Phylogeny , Animals , Computer Graphics , Evaluation Studies as Topic , Humans , Models, Genetic , Sequence Homology, Amino Acid , Sequence Homology, Nucleic Acid
20.
Biol Cybern ; 62(6): 519-28, 1990.
Article in English | MEDLINE | ID: mdl-2357475

ABSTRACT

The aim of our investigation is to understand the mechanisms which control the movement of the human arm. The arm is here considered as a redundant system: the shoulder, elbow and wrist joints, which provide three degrees of freedom, combine to move the hand in a horizontal plane, i.e. a two dimensional space. Thus the system has one extra degree of freedom. Earlier investigations of the static situation led to the hypothesis that independent cost functions were attached to each of the three joints and that the configuration chosen for a given target position is that which provides the minimum total cost (Cruse 1986). The aim of the current investigation was to look for measurable values corresponding to the hypothetical cost functions. Experiments using pointers of different lengths attached to the hand showed that the strategy in choosing the joint angles is independent of the limb length. The muscle force necessary to reach a given angle is increased by a spring mounted across a joint. In this situation the angles of the loaded joint are changed for a given target point to give way to the force effect. This leads to the conclusion that the hypothetical cost functions are not independent of the physiological costs necessary to hold the joint at a given angle. The cost functions seem to depend on joint angle and on the force which is necessary to hold the joint in a given position. Cost functions are measured by psychophysical methods. The results show U-shaped curves which can be approximated by parabolas. The position of minimum cost (maximum comfort) for one joint showed no or weak dependency on the angles of the other joints. For each subject these "psychophysical" cost functions are compared with the hypothetical cost functions. The comparison showed reasonable agreement. This supports the assumption that the psychophysically measured "comfort functions" provide a measure for the hypothetical cost functions postulated to explain the targeting movements.
Targeting experiments using a four joint arm which has two extra degrees of freedom showed a much larger scatter compared to the three joint arm. Nevertheless, the results still conform to the hypothesis that also in this case the minimum cost principle is applied to solve the redundancy problem. As the cost function for the whole arm shows a large minimum valley, quite a large range of arm positions is possible of about equal total costs.(ABSTRACT TRUNCATED AT 400 WORDS)


Subject(s)
Arm/physiology , Movement/physiology , Muscles/physiology , Psychophysics , Humans