Results 1 - 20 of 28
1.
Bioinform Adv ; 3(1): vbad087, 2023.
Article in English | MEDLINE | ID: mdl-37456508

ABSTRACT

Motivation: Biology students often struggle with the fundamental concepts of evolutionary genetics, including genetic drift, mutation and selection. To address this problem, 1LocusSim was developed to simulate the interaction of different factors, such as population size, mutation, selection and dominance, to study their effect on allelic frequency during evolution. With 1LocusSim, students can compare theoretical results with simulation outputs and solve and analyze different problems of population genetics. The 1LocusSim website has a responsive design, which means that it has been specifically designed to be used on smartphones. Results: To demonstrate its use, I review the classical overdominance model of population genetics and highlight a characteristic that is often not explicitly stated. Specifically, it is emphasized that the equilibrium of the model does not depend on the individual values of the homozygous selection coefficients but only on their ratio. This is already clear from the classical formula but may be less obvious to students. It also implies that the equilibrium can be expressed solely in terms of the dominance coefficient h. To verify this theoretical prediction, I utilize the simulator and calculate the equilibrium for the well-known case of sickle cell anemia. By utilizing this tool, students can learn at their own pace and convenience, anywhere and anytime. Availability and implementation: 1LocusSim is freely available at https://1LocusSim-biosdev.pythonanywhere.com/. Website implemented under the Bottle micro web-framework for Python, with all major browsers supported. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
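
The point about the ratio of selection coefficients can be checked numerically. The following is a minimal sketch and not 1LocusSim code: it assumes the textbook parameterization with genotype fitnesses 1 - s1 (AA), 1 (Aa) and 1 - s2 (aa), for which the classical equilibrium frequency of allele A is s2 / (s1 + s2). The function names and the numeric values of s1 and s2 are illustrative assumptions.

```python
def next_freq(p, s1, s2):
    """One generation of viability selection on the frequency p of allele A.

    Assumed fitnesses (textbook overdominance model):
    AA = 1 - s1, Aa = 1, aa = 1 - s2.
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 - s1) + 2 * p * q + q * q * (1 - s2)
    return (p * p * (1 - s1) + p * q) / mean_fitness


def iterate(p0, s1, s2, generations=5000):
    """Apply the deterministic recursion repeatedly (no drift, no mutation)."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s1, s2)
    return p


def predicted_equilibrium(s1, s2):
    """Classical equilibrium; it depends on s1 and s2 only through s1 / s2."""
    return s2 / (s1 + s2)


# Two hypothetical parameter sets with the same ratio s1 : s2 = 1 : 4.
strong = iterate(0.01, s1=0.10, s2=0.40)
weak = iterate(0.01, s1=0.02, s2=0.08)
print(strong, weak, predicted_equilibrium(0.10, 0.40))
```

Both runs should converge to the same frequency even though the absolute selection coefficients differ fivefold, which is the property the abstract highlights.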

2.
Biology (Basel) ; 11(6)2022 Jun 19.
Article in English | MEDLINE | ID: mdl-35741456

ABSTRACT

Pollution and other anthropogenic effects have driven a decrease in Atlantic salmon (Salmo salar) in the Iberian Peninsula. The restocking effort carried out in the 1980s, with salmon from northern latitudes with the aim of mitigating the decline of native populations, failed, probably due to the deficiency in adaptation of foreign salmon from northern Europe to the warm waters of the Iberian Peninsula. This result would imply that the Iberian populations of Atlantic salmon have experienced local adaptation in their past evolutionary history, as has been described for other populations of this species and other salmonids. Local adaptation can occur by divergent selection between environments, favoring the fixation of alleles that increase the fitness of a population in the environment it inhabits relative to other alleles favored in another population. In this work, we compared the genomes of different populations from the Iberian Peninsula (Atlantic and Cantabric basins) and Scotland in order to provide tentative evidence of candidate SNPs responsible for the adaptive differences between populations, which may explain the failures of restocking carried out during the 1980s. For this purpose, the samples were genotyped with a 220,000 high-density SNP array (Affymetrix) specific to Atlantic salmon. Our results revealed potential evidence of local adaptation for North Spanish and Scottish populations. As expected, most differences concerned the comparison of the Iberian Peninsula with Scotland, although there were also differences between Atlantic and Cantabric populations. A high proportion of the genes identified are related to development and cellular metabolism, DNA transcription and anatomical structure. A particular SNP was identified within the NADP-dependent malic enzyme-2 (mMEP-2*), previously reported by independent studies as a candidate for local adaptation in salmon from the Iberian Peninsula. Interestingly, the corresponding SNP within the mMEP-2* region was consistent with a genomic pattern of divergent selection.

3.
Article in English | MEDLINE | ID: mdl-33055017

ABSTRACT

Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is more suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistasis interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Epistasis, Genetic/genetics , Genome-Wide Association Study/methods , Phenotype , Polymorphism, Single Nucleotide
4.
BMC Bioinformatics ; 21(1): 138, 2020 Apr 09.
Article in English | MEDLINE | ID: mdl-32272874

ABSTRACT

BACKGROUND: Epistasis is defined as the interaction between different genes when expressing a specific phenotype. The most common way to characterize an epistatic relationship is using a penetrance table, which contains the probability of expressing the phenotype under study given a particular allele combination. Available simulators can only create penetrance tables for well-known epistasis models involving a small number of genes and under a large number of limitations. RESULTS: Toxo is a MATLAB library designed to calculate penetrance tables of epistasis models of any interaction order which resemble real data more closely. The user specifies the desired heritability (or prevalence) and the program maximizes the table's prevalence (or heritability) according to the input epistatic model boundaries. CONCLUSIONS: Toxo extends the capabilities of existing simulators that define epistasis using penetrance tables. These tables can be directly used as input for software simulators such as GAMETES so that they are able to generate data samples with larger interactions and more realistic prevalences/heritabilities.
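
To make the relationship between a penetrance table, prevalence and heritability concrete, here is a minimal sketch. It is not Toxo code (Toxo is a MATLAB library) and does not reproduce its optimization. It assumes Hardy-Weinberg genotype frequencies, independent loci, and the broad-sense heritability definition commonly used with penetrance tables, h2 = sum_g f_g (pen_g - K)^2 / (K (1 - K)), where K is the prevalence. All names and the toy table are hypothetical.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for genotypes AA, Aa, aa (a = minor allele)."""
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


def prevalence_and_heritability(penetrance, mafs):
    """Return (K, h2) for a penetrance table.

    penetrance maps a tuple of genotype indices (0, 1 or 2 per locus)
    to P(phenotype | genotype combination); mafs lists one minor allele
    frequency per locus. Loci are assumed independent.
    """
    per_locus = [genotype_freqs(m) for m in mafs]
    combos = list(product(range(3), repeat=len(mafs)))
    freqs = {}
    for combo in combos:
        f = 1.0
        for locus, genotype in enumerate(combo):
            f *= per_locus[locus][genotype]
        freqs[combo] = f
    k = sum(freqs[c] * penetrance[c] for c in combos)
    variance = sum(freqs[c] * (penetrance[c] - k) ** 2 for c in combos)
    return k, variance / (k * (1 - k))


# Hypothetical two-locus threshold table: risk rises once two or more
# minor alleles are carried across both loci.
toy_table = {c: (0.4 if sum(c) >= 2 else 0.1)
             for c in product(range(3), repeat=2)}
toy_k, toy_h2 = prevalence_and_heritability(toy_table, mafs=[0.3, 0.3])

# A flat table has no genetic effect, so its heritability is zero.
flat_table = {c: 0.2 for c in product(range(3), repeat=2)}
flat_k, flat_h2 = prevalence_and_heritability(flat_table, mafs=[0.3, 0.3])
print(toy_k, toy_h2, flat_k, flat_h2)
```

Because the table is indexed by tuples, the same function works for any interaction order by passing more minor allele frequencies.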


Subject(s)
Epistasis, Genetic , User-Computer Interface , Genotype , Models, Genetic , Penetrance , Phenotype
5.
Data Brief ; 28: 104969, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31909101

ABSTRACT

This is a co-submission with Multi-model inference of non-random mating from an information theoretic approach [1]. These data correspond to the complete simulated data set together with the set of models defined for analysing the data. The simulated data set was obtained using the program MateSim [2]. The simulated cases correspond to one-sex competition and mate choice models. For each simulation run, the population frequencies (premating individuals) and the sample of 500 mating pairs were generated randomly for a hypothetical trait with two classes at each sex. Some datasets represent species with large population sizes (n = 10 000), in which the mating process was represented as sampling with replacement and the population frequencies were constant over the mating season. The minimum phenotype frequency (MPF) allowed was 0.1. Five different model cases were simulated, namely random mating, female competition with mate choice (with independent or compound parameters) and male competition with mate choice (with independent or compound parameters). Each case was simulated 1000 times. Other datasets represent monogamous species (with large or small population size) and the mating process was without replacement (from the point of view of the available phenotypes). These data sets were used to test the performance of the multi-model inference methodology proposed in [1]. The data may be useful for testing any new/old statistics for measuring sexual selection or assortative mating patterns.

6.
Microb Ecol ; 77(4): 1036-1047, 2019 May.
Article in English | MEDLINE | ID: mdl-30762095

ABSTRACT

Wolbachia is an intracellular endosymbiont that can produce a range of effects on host fitness, but the temporal dynamics of Wolbachia strains have rarely been experimentally evaluated. We compare interannual strain frequencies across a geographical region to understand the forces that shape Wolbachia strain frequency in natural populations of its host, Chorthippus parallelus (Orthoptera, Acrididae). General linear models show that strain frequency changes significantly across geographical and temporal scales. Computer simulations allow us to reject the compatibility of the observed patterns with either genetic drift or sampling errors. We use consecutive years to estimate total Wolbachia strain fitness. Our estimation of Wolbachia fitness is significant in most cases, within locality and between consecutive years, following a negatively frequency-dependent trend. Wolbachia spp. B and F strains show a temporal pattern of variation that is compatible with a negative frequency-dependent natural selection mechanism. Our results suggest that such a mechanism should be at least considered in future experimental and theoretical research strategies that attempt to understand Wolbachia biodiversity.


Subject(s)
Grasshoppers/microbiology , Polymorphism, Genetic , Symbiosis , Wolbachia/physiology , Animals , Biological Coevolution , Computer Simulation , Geography , Linear Models , Seasons , Wolbachia/genetics
7.
Bioinformatics ; 34(6): 1043-1045, 2018 03 15.
Article in English | MEDLINE | ID: mdl-29186285

ABSTRACT

Motivation: There are many multiple testing correction methods. Some of them are robust to various dependencies in the data while others are not. Some of the implementations have problems managing the high-dimensional lists of P-values currently demanded by microarray and other omic technologies. Results: The program Myriads, formerly SGoF+, provides some of the most important P-value-based correction methods together with a test of dependency and a P-value simulator. Myriads easily manages hundreds of thousands of P-values. Availability and implementation: http://myriads.webs.uvigo.es. Contact: myriads@uvigo.es. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Computational Biology/methods
8.
Ecol Evol ; 7(9): 2883-2893, 2017 05.
Article in English | MEDLINE | ID: mdl-28479989

ABSTRACT

Mating preference can be a driver of sexual selection and assortative mating and is, therefore, a key element in evolutionary dynamics. Positive mating preference by similarity is the tendency for the choosy individual to select a mate which possesses a similar variant of a trait. Such preference can be modelled using Gaussian-like mathematical functions that describe the strength of preference, but such functions cannot be applied to empirical data collected from the field. As a result, traditionally, mating preference is indirectly estimated by the degree of assortative mating (using Pearson's correlation coefficient, r) in wild captured mating pairs. Unfortunately, r and similar coefficients are often biased due to the fact that different variants of a given trait are nonrandomly distributed in the wild, and pooling of mating pairs from such heterogeneous samples may lead to "false-positive" results, termed "the scale-of-choice effect" (SCE). Here we provide two new estimators of mating preference (Crough and Cscaled) derived from Gaussian-like functions which can be applied to empirical data. Computer simulations demonstrated that the r coefficient showed robust estimation properties for mating preference but was severely affected by SCE; Crough showed reasonable estimation properties and was little affected by SCE; and Cscaled showed the best properties at infinite sample sizes and was not affected by SCE but failed at biological sample sizes. We recommend using Crough combined with the r coefficient to infer mating preference in future empirical studies.

9.
PLoS One ; 12(4): e0175944, 2017.
Article in English | MEDLINE | ID: mdl-28423003

ABSTRACT

The detection of genomic regions involved in local adaptation is an important topic in current population genetics. There are several detection strategies available depending on the kind of genetic and demographic information at hand. A common drawback is the high risk of false positives. In this study we introduce two complementary methods for the detection of divergent selection from populations connected by migration. Both methods have been developed with the aim of being robust to false positives. The first method combines haplotype information with inter-population differentiation (FST). Evidence of divergent selection is concluded only when both the haplotype pattern and the FST value support it. The second method is developed for independently segregating markers i.e. there is no haplotype information. In this case, the power to detect selection is attained by developing a new outlier test based on detecting a bimodal distribution. The test computes the FST outliers and then assumes that those of interest would have a different mode. We demonstrate the utility of the two methods through simulations and the analysis of real data. The simulation results showed power ranging from 60-95% in several of the scenarios whilst the false positive rate was controlled below the nominal level. The analysis of real samples consisted of phased data from the HapMap project and unphased data from intertidal marine snail ecotypes. The results illustrate that the proposed methods could be useful for detecting locally adapted polymorphisms. The software HacDivSel implements the methods explained in this manuscript.


Subject(s)
Genetics, Population , Haplotypes , Models, Genetic , Selection, Genetic , Snails/genetics , Software , Animal Migration , Animals , Computer Simulation , Ecosystem , False Positive Reactions , Genetic Markers , HapMap Project , Humans , Tidal Waves
10.
Evolution ; 69(7): 1845-57, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26085130

ABSTRACT

The mode in which sexual organisms choose mates is a key evolutionary process, as it can have a profound impact on fitness and speciation. One way to study mate choice in the wild is by measuring trait correlation between mates. Positive assortative mating is inferred when individuals of a mating pair display traits that are more similar than those expected under random mating while negative assortative mating is the opposite. A recent review of 1134 trait correlations found that positive estimates of assortative mating were more frequent and larger in magnitude than negative estimates. Here, we describe the scale-of-choice effect (SCE), which occurs when mate choice exists at a smaller scale than that of the investigator's sampling, while simultaneously the trait is heterogeneously distributed at the true scale-of-choice. We demonstrate the SCE by Monte Carlo simulations and estimate it in two organisms showing positive (Littorina saxatilis) and negative (L. fabalis) assortative mating. Our results show that both positive and negative estimates are biased by the SCE by different magnitudes, typically toward positive values. Therefore, the low frequency of negative assortative mating observed in the literature may be due to the SCE's impact on correlation estimates, which demands new experimental evaluation.
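
The scale-of-choice effect described above can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation: it assumes two hypothetical patches whose trait means differ, strictly random mating within each patch, and normally distributed traits. The patch means, standard deviation, sample sizes and function names are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pairs(patch_means, pairs_per_patch, sd, rng):
    """Random mating inside each patch: both mates are drawn independently
    from that patch's trait distribution, then all pairs are pooled."""
    females, males = [], []
    for mean in patch_means:
        for _ in range(pairs_per_patch):
            females.append(rng.gauss(mean, sd))
            males.append(rng.gauss(mean, sd))
    return females, males


rng = random.Random(42)

# Sampling at the true scale of choice: a single patch, random mating.
f_one, m_one = simulate_pairs([0.0], 2000, sd=1.0, rng=rng)
r_within = pearson(f_one, m_one)

# Sampling at a larger scale: two patches with different trait means pooled.
f_pool, m_pool = simulate_pairs([0.0, 4.0], 2000, sd=1.0, rng=rng)
r_pooled = pearson(f_pool, m_pool)

print(r_within, r_pooled)
```

Although no individual chooses its mate, the pooled sample should show a strongly positive correlation, because the between-patch difference in trait means is shared by both members of each pair.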


Subject(s)
Choice Behavior , Mating Preference, Animal , Snails/physiology , Animals , Models, Genetic , Monte Carlo Method , Spain
11.
Mol Phylogenet Evol ; 76: 102-9, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24631855

ABSTRACT

Deep coalescence and the nongenealogical pattern of descent caused by recombination have emerged as a common problem for phylogenetic inference at the species level. Here we use computer simulations to assess whether AFLP-based phylogenies are robust to the uncertainties introduced by these factors. Our results indicate that phylogenetic signal can prevail even in the face of extensive deep coalescence, allowing recovery of the correct species tree topology. The impact of recombination on tree accuracy was related to total tree depth and species effective population size. The correct tree topology could be recovered under many simulation settings due to a trade-off between the conflicting signals resulting from intra-locus recombination and the benefits of the joint consideration of unlinked loci that better matched overall the true species tree. Errors in tree topology were not only determined by deep coalescence, but also by the timing of divergence and the tree-building errors arising from an insufficient number of characters. DNA sequences generally outperformed AFLPs under every simulated scenario, but this difference in performance was nearly negligible when a sufficient number of AFLP characters were sampled. Our simulations suggest that the impact of deep coalescence and intra-locus recombination on the reliability of AFLP trees could be minimal for effective population sizes equal to or lower than 10,000 (typical of many vertebrates and tree plants) given tree depths above 0.02 substitutions per site.


Subject(s)
Amplified Fragment Length Polymorphism Analysis/methods , Phylogeny , Recombination, Genetic , Animals , Base Sequence , Computer Simulation , Models, Genetic , Reproducibility of Results , Sequence Analysis, DNA
12.
Biochem Mol Biol Educ ; 40(4): 277-83, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22807434

ABSTRACT

Mutate is a program developed for teaching purposes to provide a virtual laboratory class for undergraduate students of Genetics in Biology. The program emulates the so-called fluctuation test, whose aim is to distinguish between spontaneous and adaptive mutation hypotheses in bacteria. The plan is to train students in certain key multidisciplinary aspects of current genetics such as sequence databases, DNA mutations, and hypothesis testing, while introducing the fluctuation test. This seminal experiment was originally performed studying Escherichia coli resistance to infection by bacteriophage T1. The fluctuation test initiated modern bacterial genetics, which 25 years later ushered in the era of recombinant DNA. Nowadays we know that some deletions in fhuA, the gene responsible for the E. coli membrane receptor of T1, could cause E. coli resistance to this phage. For the sake of simplicity, we will introduce the assumption that a single mutation generates the resistance to T1. During the practical, the students use the program to download some fhuA gene sequences, manually introduce some stop codon mutations, and design a fluctuation test to obtain data for distinguishing between preadaptive (spontaneous) and induced (adaptive) mutation hypotheses. The program can be launched from a browser or, if preferred, its executable file can be downloaded from http://webs.uvigo.es/acraaj/MutateWeb/Mutate.html. It requires the Java 5.0 (or higher) Runtime Environment (freely available at http://www.java.com).
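
The logic of the fluctuation test can be sketched with a short simulation. This is not the Mutate program (which is a Java application): it is a simplified model that assumes synchronous doubling from a single cell, a single resistance mutation, and hypothetical parameter values. Under the spontaneous hypothesis, mutations arise during growth and early mutants leave large clones; under the induced hypothesis, each cell becomes resistant independently on exposure. The variance-to-mean ratio of resistant counts across cultures separates the two.

```python
import random


def spontaneous_culture(generations, mu, rng):
    """Mutations arise at cell division during growth and are inherited."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        daughters = 2 * sensitive
        new_mutants = sum(1 for cell in range(daughters) if rng.random() < mu)
        sensitive = daughters - new_mutants
        resistant = 2 * resistant + new_mutants
    return resistant


def induced_culture(generations, p_resist, rng):
    """The culture grows first; each cell then gains resistance independently."""
    cells = 2 ** generations
    return sum(1 for cell in range(cells) if rng.random() < p_resist)


def variance_to_mean(counts):
    """Variance-to-mean ratio; close to 1 for Poisson-like counts."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean


rng = random.Random(7)
GENERATIONS, MU, CULTURES = 12, 1e-3, 100   # illustrative values only

spontaneous = [spontaneous_culture(GENERATIONS, MU, rng)
               for _ in range(CULTURES)]
# Per-cell probability chosen so both hypotheses give a similar mean count.
induced = [induced_culture(GENERATIONS, GENERATIONS * MU, rng)
           for _ in range(CULTURES)]

ratio_spontaneous = variance_to_mean(spontaneous)
ratio_induced = variance_to_mean(induced)
print(ratio_spontaneous, ratio_induced)
```

A ratio far above 1 across parallel cultures is the pattern expected when mutations predate exposure to the phage.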


Subject(s)
Computational Biology/methods , Genetics/education , Mutation , Software , Teaching Materials , Bacterial Outer Membrane Proteins/genetics , Databases, Genetic , Escherichia coli/genetics , Escherichia coli/virology , Escherichia coli Proteins/genetics , Genes, Bacterial , Internet , T-Phages/metabolism
13.
PLoS One ; 6(9): e24700, 2011.
Article in English | MEDLINE | ID: mdl-21931819

ABSTRACT

We developed a new multiple hypothesis testing adjustment called SGoF+ implemented as a sequential goodness of fit metatest which is a modification of a previous algorithm, SGoF, taking advantage of the information of the distribution of p-values in order to fix the rejection region. The new method uses a discriminant rule based on the maximum distance between the uniform distribution of p-values and the observed one, to set the null for a binomial test. This new approach shows a better power/pFDR ratio than SGoF. In fact SGoF+ automatically sets the threshold leading to the maximum power and the minimum false non-discovery rate inside the SGoF' family of algorithms. Additionally, we suggest combining the information provided by SGoF+ with the estimate of the FDR that has been committed when rejecting a given set of nulls. We study different positive false discovery rate, pFDR, estimation methods to combine q-value estimates jointly with the information provided by the SGoF+ method. Simulations suggest that the combination of SGoF+ metatest with the q-value information is an interesting strategy to deal with multiple testing issues. These techniques are provided in the latest version of the SGoF+ software freely available at http://webs.uvigo.es/acraaj/SGoF.htm.


Subject(s)
Algorithms , Software , Computational Biology
14.
Mol Cell Proteomics ; 10(3): M110.004374, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21364085

ABSTRACT

In quantitative proteomics work, the differences in expression of many separate proteins are routinely examined to test for significant differences between treatments. This leads to the multiple hypothesis testing problem: when many separate tests are performed many will be significant by chance and be false positive results. Statistical methods such as the false discovery rate method that deal with this problem have been disseminated for more than one decade. However a survey of proteomics journals shows that such tests are not widely implemented in one commonly used technique, quantitative proteomics using two-dimensional electrophoresis. We outline a selection of multiple hypothesis testing methods, including some that are well known and some lesser known, and present a simple strategy for their use by the experimental scientist in quantitative proteomics work generally. The strategy focuses on the desirability of simultaneous use of several different methods, the choice and emphasis dependent on research priorities and the results in hand. This approach is demonstrated using case scenarios with experimental and simulated model data.
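
As an illustration of the false discovery rate approach mentioned above, the following minimal sketch implements the standard Benjamini-Hochberg step-up procedure alongside a Bonferroni correction. The abstract does not specify which methods the authors recommend, so this is only one well-known example; the P-values are made up for illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a test when its P-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha.

    Sort the P-values, find the largest rank k with p_(k) <= alpha * k / m,
    and reject the k smallest P-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= alpha * rank / m:
            cutoff = rank
    rejected = [False] * m
    for index in order[:cutoff]:
        rejected[index] = True
    return rejected


# Hypothetical P-values from ten protein spots.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
bh = benjamini_hochberg(pvals)
bonf = bonferroni(pvals)
print(sum(bh), sum(bonf))
```

Running several corrections side by side, as here, matches the strategy the abstract describes of using different methods simultaneously and weighing them by research priorities.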


Subject(s)
Models, Biological , Proteomics/methods , Animals , Computer Simulation , Gastropoda/metabolism , Periodicals as Topic , Proteins/metabolism , Software
15.
Mol Cell Proteomics ; 2010 Dec 07.
Article in English | MEDLINE | ID: mdl-21139049

ABSTRACT

In quantitative proteomics work, the differences in expression of many separate proteins are routinely examined to test for significant differences between treatments. This leads to the multiple hypothesis testing problem: when many separate tests are performed many will be significant by chance and be false positive results. Statistical methods such as the false discovery rate (FDR) method that deal with this problem have been disseminated for more than one decade. However a survey of proteomics journals shows that such tests are not widely implemented in one commonly used technique, quantitative proteomics using two-dimensional electrophoresis (2-DE). We outline a selection of multiple hypothesis testing methods, including some that are well known and some lesser known, and present a simple strategy for their use by the experimental scientist in quantitative proteomics work generally. The strategy focuses on the desirability of simultaneous use of several different methods, the choice and emphasis dependent on research priorities and the results in hand. This approach is demonstrated using case scenarios with experimental and simulated model data.

16.
Curr Genomics ; 11(1): 58-61, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20808525

ABSTRACT

The importance of simulation software in current and future evolutionary and genomic studies is confirmed by the recent publication of several new simulation tools. The forward-in-time simulation strategy has, therefore, re-emerged as a complement to coalescent simulation. Additionally, more efficient coalescent algorithms, as well as new ideas about the combined use of backward and forward strategies, have recently appeared. In the present work, a previous review is updated to include some new forward simulation tools. When simulating at the genome scale, a conflict emerges between efficiency (i.e. execution speed and memory usage) and flexibility (i.e. complex modeling capabilities). This is the pivot around which simulation of evolutionary processes should improve. In addition, some effort should be made to consider the process of developing simulation tools from the point of view of software engineering theory. Finally, some new ideas and technologies, such as general-purpose graphics processing units, are discussed.

17.
PLoS One ; 5(12): e15930, 2010 Dec 29.
Article in English | MEDLINE | ID: mdl-21209966

ABSTRACT

Recently, an exact binomial test called SGoF (Sequential Goodness-of-Fit) has been introduced as a new method for handling high dimensional testing problems. SGoF looks for statistical significance when comparing the number of null hypotheses individually rejected at level γ = 0.05 with the expected number under the intersection null, and then proceeds to declare a number of effects accordingly. SGoF detects an increasing proportion of true effects with the number of tests, unlike other methods for which the opposite is true. It is worth mentioning that the choice γ = 0.05 is not essential to the SGoF procedure, and more power may be reached at other values of γ depending on the situation. In this paper we enhance the possibilities of SGoF by letting γ vary over the whole interval (0,1). In this way, we introduce the 'SGoFicance Trace' (from SGoF's significance trace), a graphical complement to SGoF which can help to make decisions in multiple-testing problems. A script has been written for the computation in R of the SGoFicance Trace. This script is available from the web site http://webs.uvigo.es/acraaj/SGoFicance.htm.


Subject(s)
Computational Biology/methods , Proteomics/methods , Algorithms , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Computer Simulation , False Positive Reactions , Humans , Internet , Models, Statistical , Monte Carlo Method , Proteins/chemistry , Proteome , RNA/chemistry , Software
18.
BMC Bioinformatics ; 10: 209, 2009 Jul 08.
Article in English | MEDLINE | ID: mdl-19586526

ABSTRACT

BACKGROUND: The detection of true significant cases under multiple testing is becoming a fundamental issue when analyzing high-dimensional biological data. Unfortunately, known multitest adjustments lose statistical power as the number of tests increases. We propose a new multitest adjustment, based on a sequential goodness of fit metatest (SGoF), which increases its statistical power with the number of tests. The method is compared with Bonferroni and FDR-based alternatives by simulating a multitest context via two different kinds of tests: 1) one-sample t-test, and 2) homogeneity G-test. RESULTS: It is shown that SGoF behaves especially well with small sample sizes when 1) the alternative hypothesis is weakly to moderately deviated from the null model, 2) there are widespread effects through the family of tests, and 3) the number of tests is large. CONCLUSION: Therefore, SGoF should become an important tool for multitest adjustment when working with high-dimensional biological data.
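
The idea of a goodness-of-fit metatest can be sketched as follows. This is a simplified reading of the abstract and not the published SGoF implementation: it counts how many P-values fall at or below a threshold gamma, compares that count with the number expected under the complete null using an exact one-sided binomial test, and then declares effects one at a time until the remaining excess is no longer significant. Keeping the number of tests fixed during the loop, and the names gamma and alpha, are assumptions made here for illustration.

```python
import math


def binomial_upper_tail(r, n, prob):
    """Exact P(X >= r) for X ~ Binomial(n, prob), computed in log space."""
    total = 0.0
    for k in range(r, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(prob) + (n - k) * math.log(1 - prob))
        total += math.exp(log_pmf)
    return total


def sgof_like(pvalues, gamma=0.05, alpha=0.05):
    """Number of effects declared by a simplified sequential metatest."""
    n = len(pvalues)
    observed = sum(1 for p in pvalues if p <= gamma)
    effects = 0
    # Remove one observed rejection at a time while the excess over the
    # binomial expectation n * gamma remains significant at level alpha.
    while observed > 0 and binomial_upper_tail(observed, n, gamma) < alpha:
        effects += 1
        observed -= 1
    return effects


# Hypothetical family of 100 tests: 20 small P-values, 80 clearly null.
mixed = [0.001] * 20 + [0.5] * 80
all_null = [0.5] * 100
effects_mixed = sgof_like(mixed)
effects_null = sgof_like(all_null)
print(effects_mixed, effects_null)
```

In practice the effects declared would be assigned to the tests with the smallest P-values. Because the binomial test gains power as the number of tests grows, a procedure of this kind detects more effects in larger families, in contrast to per-test thresholds such as Bonferroni that become stricter.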


Subject(s)
Computer Simulation , Models, Statistical , Sample Size
19.
BMC Bioinformatics ; 9: 223, 2008 Apr 30.
Article in English | MEDLINE | ID: mdl-18447924

ABSTRACT

BACKGROUND: There are several situations in population biology research where simulating DNA sequences is useful. Simulation of biological populations under different evolutionary genetic models can be undertaken using backward or forward strategies. Backward simulations, also called coalescent-based simulations, are computationally efficient. The reason is that they are based on the history of lineages with surviving offspring in the current population. In contrast, forward simulations are less efficient because the entire population is simulated from past to present. However, the coalescent framework imposes some limitations that forward simulation does not. Hence, there is an increasing interest in forward population genetic simulation and efficient new tools have been developed recently. Software tools that allow efficient simulation of large DNA fragments under complex evolutionary models will be very helpful when trying to better understand the trace left on the DNA by the different interacting evolutionary forces. Here I will introduce GenomePop, a forward simulation program that fulfills the above requirements. The use of the program is demonstrated by studying the impact of intracodon recombination on global and site-specific dN/dS estimation. RESULTS: I have developed algorithms and written software to efficiently simulate, forward in time, different Markovian nucleotide or codon models of DNA mutation. Such models can be combined with recombination, at inter- and intra-codon levels, fitness-based selection and complex demographic scenarios. CONCLUSION: GenomePop has many interesting characteristics for simulating SNPs or DNA sequences under complex evolutionary and demographic models. These features make it unique with respect to other simulation tools. Namely, it offers the possibility of forward simulation under General Time Reversible (GTR) mutation or GTRxMG94 codon models with intra-codon recombination, arbitrary, user-defined, migration patterns, diploid or haploid models, constant or variable population sizes, etc. It also allows simulation of fitness-based selection under different distributions of mutational effects. Under the 2-allele model it allows the simulation of recombination hot-spots, the definition of different frequencies in different populations, etc. GenomePop can also manage large DNA fragments. In addition, it has a scaling option to save computation time when simulating large sequences and population sizes under complex demographic and evolutionary situations. These and many other features are detailed in its web page [1].
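
To show what "forward in time" means in the simplest case, here is a minimal sketch of a haploid Wright-Fisher simulation at one biallelic locus. It is not GenomePop code and covers none of its features (no sequences, recombination, migration or codon models); the function name, the selection parameterization and the parameter values are illustrative assumptions.

```python
import random


def wright_fisher(n_individuals, p0, generations, s=0.0, rng=None):
    """Simulate allele frequency forward in time, one generation per step.

    Each generation the whole population is rebuilt: selection shifts the
    expected frequency (relative fitness 1 + s for the focal allele), then
    n_individuals offspring are sampled at random (genetic drift).
    """
    rng = rng or random.Random(1)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        expected = p * (1 + s) / (p * (1 + s) + (1 - p))
        copies = sum(1 for offspring in range(n_individuals)
                     if rng.random() < expected)
        p = copies / n_individuals
        trajectory.append(p)
    return trajectory


neutral = wright_fisher(200, p0=0.5, generations=100)
selected = wright_fisher(200, p0=0.1, generations=100, s=0.1)
fixed = wright_fisher(200, p0=1.0, generations=20)
print(neutral[-1], selected[-1], fixed[-1])
```

The cost is proportional to population size times generations because every individual is simulated in every generation, which is the inefficiency relative to coalescent methods that the abstract points out.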


Subject(s)
Databases, Nucleic Acid , Genetics, Population/methods , Genome, Human , Models, Genetic , Software , Algorithms , Base Sequence , Biological Evolution , Codon , Computer Simulation , Emigration and Immigration , Game Theory , Gene Frequency , Genotype , Humans , Markov Chains , Recombination, Genetic , Selection, Genetic
20.
Infect Genet Evol ; 8(2): 110-20, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18249158

ABSTRACT

We have studied the relationship between disease progression and HIV-1 evolution in 24 infants classified as rapid or non-rapid progressors, during nearly the entire disease progression cycle from infection to AIDS. Specifically, we examined the temporal relationship between clinical status and changes in genetic diversity, divergence, selection and recombination at the C2V3C3 region of the env gene during a period of 3 years. Statistical analyses were performed using linear mixed models that are particularly well-suited for longitudinal studies in which repeated measures are taken from the same patients. We did not observe significant differences in genetic diversity or overall substitution rates between clinical categories. However, the nonsynonymous substitution rate per nonsynonymous site (dN) evolved differently between groups. Changes in dN explained the evolutionary slowdown of the dN/dS ratio in the rapid progressors, while in non-rapid progressors the dN/dS ratio continuously increased through time. The number of positively selected sites had limited power for predicting disease progression. Recombination rate estimates were different among groups, although not significantly in the linear mixed models analysis. They showed some power predicting clinical categories and, interestingly, they were significantly correlated with the frequency of positively selected sites. Overall, the results obtained confirm that viral adaptation in the C2V3C3 region of the env gene is related to disease progression, although the statistical characterization of such pattern seems rather difficult.


Subject(s)
Acquired Immunodeficiency Syndrome/pathology , Acquired Immunodeficiency Syndrome/virology , Evolution, Molecular , Genes, env , HIV-1/genetics , Adaptation, Biological/genetics , Disease Progression , Genetic Variation , Humans , Infant , Models, Genetic , Molecular Sequence Data , Point Mutation , RNA, Viral/genetics , Selection, Genetic