1.
Lancet Glob Health ; 12(6): e1027-e1037, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38762283

ABSTRACT

BACKGROUND: Medical consumable stock-outs negatively affect health outcomes not only by impeding or delaying the effective delivery of services but also by discouraging patients from seeking care. Consequently, supply chain strengthening is being adopted as a key component of national health strategies. However, evidence on the factors associated with increased consumable availability is limited. METHODS: In this study, we used the 2018-19 Harmonised Health Facility Assessment data from Malawi to identify the factors associated with the availability of consumables in level 1 facilities, ie, rural hospitals or health centres with a small number of beds and a sparsely equipped operating room for minor procedures. We estimate a multilevel logistic regression model with a binary outcome variable representing consumable availability (of 130 consumables across 940 facilities) and explanatory variables chosen based on current evidence. Further subgroup analyses are carried out to assess the presence of effect modification by level of care, facility ownership, and a categorisation of consumables by public health or disease programme, Malawi's Essential Medicine List classification, whether the consumable is a drug or not, and level of average national availability. 
FINDINGS: Our results suggest that the following characteristics had a positive association with consumable availability: level 1b facilities or community hospitals had 64% (odds ratio [OR] 1·64, 95% CI 1·37-1·97) higher odds of consumable availability than level 1a facilities or health centres; Christian Health Association of Malawi and private-for-profit ownership had 63% (1·63, 1·40-1·89) and 49% (1·49, 1·24-1·80) higher odds respectively than government-owned facilities; the availability of a computer had 46% (1·46, 1·32-1·62) higher odds than in its absence; pharmacists managing drug orders had 85% (1·85, 1·40-2·44) higher odds than a drug store clerk; proximity to the corresponding regional administrative office (facilities greater than 75 km away had 21% lower odds [0·79, 0·63-0·98] than facilities within 10 km of the district health office); and having three drug order fulfilments in the 3 months before the survey had 14% (1·14, 1·02-1·27) higher odds than one fulfilment in 3 months. Further, consumables categorised as vital in Malawi's Essential Medicine List performed considerably better with 235% (OR 3·35, 95% CI 1·60-7·05) higher odds than other essential or non-essential consumables, and drugs performed worse with 79% (0·21, 0·08-0·51) lower odds than other medical consumables in terms of availability across facilities. INTERPRETATION: Our results provide evidence on the areas of intervention with potential to improve consumable availability. Further exploration of the health and resource consequences of the strategies discussed will be useful in guiding investments into supply chain strengthening. FUNDING: UK Research and Innovation as part of the Global Challenges Research Fund (Thanzi La Onse; reference MR/P028004/1), the Wellcome Trust (Thanzi La Mawa; reference 223120/Z/21/Z), the UK Medical Research Council, the UK Department for International Development, and the EU (reference MR/R015600/1).
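The percentages quoted in these findings follow directly from the odds ratios: an OR above 1 corresponds to (OR - 1) × 100% higher odds, and an OR below 1 to (1 - OR) × 100% lower odds. A minimal sketch of that arithmetic, using three odds ratios from the abstract (the function name is illustrative and not taken from the study):

```python
def odds_change_percent(odds_ratio: float) -> float:
    """Percentage change in odds implied by an odds ratio.

    Positive values mean higher odds, negative values mean lower odds.
    """
    return (odds_ratio - 1.0) * 100.0


# Odds ratios reported in the abstract above
for label, odds_ratio in [
    ("level 1b vs level 1a facilities", 1.64),
    ("vital vs other consumables", 3.35),
    ("drugs vs other medical consumables", 0.21),
]:
    print(f"{label}: {odds_change_percent(odds_ratio):+.0f}% odds")
```

These are changes in odds, not in the probability of a consumable being available; the two coincide only approximately when availability is low.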


Subject(s)
Health Facilities , Malawi , Humans , Health Facilities/statistics & numerical data , Health Facilities/supply & distribution , Health Services Accessibility/statistics & numerical data , Equipment and Supplies/supply & distribution , Censuses
2.
Stud Fam Plann ; 54(4): 585-607, 2023 12.
Article in English | MEDLINE | ID: mdl-38129327

ABSTRACT

Malawi has high unmet need for contraception with a costed national plan to increase contraception use. Estimating how such investments might impact future population size in Malawi can help policymakers understand effects and value of policies to increase contraception uptake. We developed a new model of contraception and pregnancy using individual-level data capturing complexities of contraception initiation, switching, discontinuation, and failure by contraception method, accounting for differences by individual characteristics. We modeled contraception scale-up via a population campaign to increase initiation of contraception (Pop) and a postpartum family planning intervention (PPFP). We calibrated the model without new interventions to the UN World Population Prospects 2019 medium variant projection of births for Malawi. Without interventions Malawi's population passes 60 million in 2084; with Pop and PPFP interventions, it peaks below 35 million by 2100. We compare contraception coverage and costs, by method, with and without interventions, from 2023 to 2050. We estimate investments in contraception scale-up correspond to only 0.9 percent of total health expenditure per capita though could result in dramatic reductions of current pressures of very rapid population growth on health services, schools, land, and society, helping Malawi achieve national and global health and development goals.


Subject(s)
Contraception , Family Planning Services , Pregnancy , Female , Humans , Malawi , Health Services , Postpartum Period , Contraception Behavior
3.
Virus Evol ; 8(2): veac093, 2022.
Article in English | MEDLINE | ID: mdl-36478783

ABSTRACT

Longitudinal deep sequencing of viruses can provide detailed information about intra-host evolutionary dynamics including how viruses interact with and transmit between hosts. Many analyses require haplotype reconstruction, identifying which variants are co-located on the same genomic element. Most current methods to perform this reconstruction are based on a high density of variants and cannot perform this reconstruction for slowly evolving viruses. We present a new approach, HaROLD (HAplotype Reconstruction Of Longitudinal Deep sequencing data), which performs this reconstruction based on identifying co-varying variant frequencies using a probabilistic framework. We illustrate HaROLD on both RNA and DNA viruses with synthetic Illumina paired read data created from mixed human cytomegalovirus (HCMV) and norovirus genomes, and clinical datasets of HCMV and norovirus samples, demonstrating high accuracy, especially when longitudinal samples are available.

4.
Elife ; 11, 2022 Sep 13.
Article in English | MEDLINE | ID: mdl-36098502

ABSTRACT

Background: Viral sequencing of SARS-CoV-2 has been used for outbreak investigation, but there is limited evidence supporting routine use for infection prevention and control (IPC) within hospital settings. Methods: We conducted a prospective non-randomised trial of sequencing at 14 acute UK hospital trusts. Sites each had a 4-week baseline data collection period, followed by intervention periods comprising 8 weeks of 'rapid' (<48 hr) and 4 weeks of 'longer-turnaround' (5-10 days) sequencing using a sequence reporting tool (SRT). Data were collected on all hospital-onset COVID-19 infections (HOCIs; detected ≥48 hr from admission). The impact of the sequencing intervention on IPC knowledge and actions, and on the incidence of probable/definite hospital-acquired infections (HAIs), was evaluated. Results: A total of 2170 HOCI cases were recorded from October 2020 to April 2021, corresponding to a period of extreme strain on the health service, with sequence reports returned for 650/1320 (49.2%) during intervention phases. We did not detect a statistically significant change in weekly incidence of HAIs in longer-turnaround (incidence rate ratio 1.60, 95% CI 0.85-3.01; p=0.14) or rapid (0.85, 0.48-1.50; p=0.54) intervention phases compared to baseline phase. However, IPC practice was changed in 7.8 and 7.4% of all HOCI cases in rapid and longer-turnaround phases, respectively, and 17.2 and 11.6% of cases where the report was returned. In a 'per-protocol' sensitivity analysis, there was an impact on IPC actions in 20.7% of HOCI cases when the SRT report was returned within 5 days. Capacity to respond effectively to insights from sequencing was breached in most sites by the volume of cases and limited resources. Conclusions: While we did not demonstrate a direct impact of sequencing on the incidence of nosocomial transmission, our results suggest that sequencing can inform IPC response to HOCIs, particularly when returned within 5 days. 
Funding: COG-UK is supported by funding from the Medical Research Council (MRC) part of UK Research & Innovation (UKRI), the National Institute of Health Research (NIHR) (grant code: MC_PC_19027), and Genome Research Limited, operating as the Wellcome Sanger Institute. Clinical trial number: NCT04405934.


Subject(s)
COVID-19 , Cross Infection , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/prevention & control , Prospective Studies , Infection Control/methods , Cross Infection/epidemiology , Cross Infection/prevention & control , Hospitals
5.
Inj Epidemiol ; 9(1): 21, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35821170

ABSTRACT

BACKGROUND: Road traffic injuries are a significant cause of death and disability globally. However, in some countries the exact health burden caused by road traffic injuries is unknown. In Malawi, there is no central reporting mechanism for road traffic injuries and so the exact extent of the health burden caused by road traffic injuries is hard to determine. A limited number of models predict the incidence of mortality due to road traffic injury in Malawi. These estimates vary greatly, owing to differences in assumptions, and so the health burden caused on the population by road traffic injuries remains unclear. METHODS: We use an individual-based model and combine an epidemiological model of road traffic injuries with a health seeking behaviour and health system model. We provide a detailed representation of road traffic injuries in Malawi, from the onset of the injury through to the final health outcome. We also investigate the effects of an assumption made by other models that multiple injuries do not contribute to health burden caused by road accidents. RESULTS: Our model estimates an overall average incidence of mortality between 23.5 and 29.8 per 100,000 person years due to road traffic injuries and an average of 180,000 to 225,000 disability-adjusted life years (DALYs) per year between 2010 and 2020 in an estimated average population size of 1,364,000 over the 10-year period. Our estimated incidence of mortality falls within the range of other estimates currently available for Malawi, whereas our estimated number of DALYs is greater than the only other estimate available for Malawi, the GBD estimate predicting an average of 126,200 DALYs per year over the same time period. Our estimates, which account for multiple injuries, predict a 22-58% increase in overall health burden compared to the model run as a single injury model.
CONCLUSIONS: Road traffic injuries are difficult to model with conventional modelling methods, owing to the numerous types of injuries that occur. Using an individual-based model framework, we can provide a detailed representation of road traffic injuries. Our results indicate a higher health burden caused by road traffic injuries than previously estimated.

6.
Nature ; 602(7896): 263-267, 2022 02.
Article in English | MEDLINE | ID: mdl-34937052

ABSTRACT

High-throughput sequencing projects generate genome-scale sequence data for species-level phylogenies [1-3]. However, state-of-the-art Bayesian methods for inferring timetrees are computationally limited to small datasets and cannot exploit the growing number of available genomes [4]. In the case of mammals, molecular-clock analyses of limited datasets have produced conflicting estimates of clade ages with large uncertainties [5,6], and thus the timescale of placental mammal evolution remains contentious [7-10]. Here we develop a Bayesian molecular-clock dating approach to estimate a timetree of 4,705 mammal species integrating information from 72 mammal genomes. We show that increasingly larger phylogenomic datasets produce diversification time estimates with progressively smaller uncertainties, facilitating precise tests of macroevolutionary hypotheses. For example, we confidently reject an explosive model of placental mammal origination in the Palaeogene [8] and show that crown Placentalia originated in the Late Cretaceous with unambiguous ordinal diversification in the Palaeocene/Eocene. Our Bayesian methodology facilitates analysis of complete genomes and thousands of species within an integrated framework, making it possible to address hitherto intractable research questions on species diversifications. This approach can be used to address other contentious cases of animal and plant diversifications that require analysis of species-level phylogenomic datasets.


Subject(s)
Evolution, Molecular , Mammals , Phylogeny , Animals , Bayes Theorem , Eutheria/classification , Eutheria/genetics , Female , Mammals/classification , Mammals/genetics , Placenta , Pregnancy , Species Specificity
7.
Mol Biol Evol ; 39(1), 2022 01 07.
Article in English | MEDLINE | ID: mdl-34694387

ABSTRACT

We use first principles of population genetics to model the evolution of proteins under persistent positive selection (PPS). PPS may occur when organisms are subjected to persistent environmental change, during adaptive radiations, or in host-pathogen interactions. Our mutation-selection model indicates protein evolution under PPS is an irreversible Markov process, and thus proteins under PPS show a strongly asymmetrical distribution of selection coefficients among amino acid substitutions. Our model shows the criterion ω>1 (where ω is the ratio of nonsynonymous over synonymous codon substitution rates) to detect positive selection is conservative and indeed arbitrary, because in real proteins many mutations are highly deleterious and are removed by selection even at positively selected sites. We use a penalized-likelihood implementation of the PPS model to successfully detect PPS in plant RuBisCO and influenza HA proteins. By directly estimating selection coefficients at protein sites, our inference procedure bypasses the need for using ω as a surrogate measure of selection and improves our ability to detect molecular adaptation in proteins.


Subject(s)
Models, Genetic , Selection, Genetic , Amino Acid Substitution , Codon , Evolution, Molecular , Mutation
8.
Elife ; 10, 2021 06 29.
Article in English | MEDLINE | ID: mdl-34184637

ABSTRACT

Background: Rapid identification and investigation of healthcare-associated infections (HCAIs) is important for suppression of SARS-CoV-2, but the infection source for hospital onset COVID-19 infections (HOCIs) cannot always be readily identified based only on epidemiological data. Viral sequencing data provides additional information regarding potential transmission clusters, but the low mutation rate of SARS-CoV-2 can make interpretation using standard phylogenetic methods difficult. Methods: We developed a novel statistical method and sequence reporting tool (SRT) that combines epidemiological and sequence data in order to provide a rapid assessment of the probability of HCAI among HOCI cases (defined as first positive test >48 hr following admission) and to identify infections that could plausibly constitute outbreak events. The method is designed for prospective use, but was validated using retrospective datasets from hospitals in Glasgow and Sheffield collected February-May 2020. Results: We analysed data from 326 HOCIs. Among HOCIs with time from admission ≥8 days, the SRT algorithm identified close sequence matches from the same ward for 160/244 (65.6%) and in the remainder 68/84 (81.0%) had at least one similar sequence elsewhere in the hospital, resulting in high estimated probabilities of within-ward and within-hospital transmission. For HOCIs with time from admission 3-7 days, the SRT probability of healthcare acquisition was >0.5 in 33/82 (40.2%). Conclusions: The methodology developed can provide rapid feedback on HOCIs that could be useful for infection prevention and control teams, and warrants further prospective evaluation. The integration of epidemiological and sequence data is important given the low mutation rate of SARS-CoV-2 and its variable incubation period. Funding: COG-UK HOCI funded by COG-UK consortium, supported by funding from UK Research and Innovation, National Institute of Health Research and Wellcome Sanger Institute.


Subject(s)
COVID-19/diagnosis , COVID-19/epidemiology , Cross Infection/diagnosis , Cross Infection/epidemiology , Disease Outbreaks/statistics & numerical data , Population Surveillance/methods , SARS-CoV-2/genetics , Genome, Viral , Hospitals/statistics & numerical data , Humans , Probability , Retrospective Studies , United Kingdom/epidemiology , Whole Genome Sequencing
9.
BMC Bioinformatics ; 22(1): 285, 2021 May 28.
Article in English | MEDLINE | ID: mdl-34049487

ABSTRACT

BACKGROUND: Many important applications in bioinformatics, including sequence alignment and protein family profiling, employ sequence weighting schemes to mitigate the effects of non-independence of homologous sequences and under- or over-representation of certain taxa in a dataset. These schemes aim to assign high weights to sequences that are 'novel' compared to the others in the same dataset, and low weights to sequences that are over-represented. RESULTS: We formalise this principle by rigorously defining the evolutionary 'novelty' of a sequence within an alignment. This results in new sequence weights that we call 'phylogenetic novelty scores'. These scores have various desirable properties, and we showcase their use by considering, as an example application, the inference of character frequencies at an alignment column (important, for example, in protein family profiling). We give computationally efficient algorithms for calculating our scores and, using simulations, show that they are versatile and can improve the accuracy of character frequency estimation compared to existing sequence weighting schemes. CONCLUSIONS: Our phylogenetic novelty scores can be useful when an evolutionarily meaningful system for adjusting for uneven taxon sampling is desired. They have numerous possible applications, including estimation of evolutionary conservation scores and sequence logos, identification of targets in conservation biology, and improving and measuring sequence alignment accuracy.
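To make the idea of a sequence weighting scheme concrete, the sketch below implements one of the existing schemes of the kind the abstract refers to, the position-based weights of Henikoff and Henikoff. It is not the phylogenetic novelty score introduced in the paper, whose definition is not given in the abstract. The toy alignment is hypothetical, and gap characters would simply be treated as another residue type here.

```python
from collections import Counter


def position_based_weights(alignment: list[str]) -> list[float]:
    """Henikoff-style position-based sequence weights, normalised to sum to 1.

    In each column, a sequence receives 1 / (k * n), where k is the number of
    distinct residue types in the column and n is the number of sequences
    sharing this sequence's residue. Contributions are summed over columns.
    """
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    weights = [0.0] * n_seqs
    for col in range(n_cols):
        column = [seq[col] for seq in alignment]
        counts = Counter(column)
        n_types = len(counts)
        for i, residue in enumerate(column):
            weights[i] += 1.0 / (n_types * counts[residue])
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy alignment: three similar sequences and one outlier
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
for seq, w in zip(toy, position_based_weights(toy)):
    print(seq, round(w, 4))
```

The over-represented sequences share weight, while the outlier in the last row receives the largest weight, which is the behaviour the abstract describes as up-weighting 'novel' sequences.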


Subject(s)
Algorithms , Computational Biology , Phylogeny , Sequence Alignment
10.
Wellcome Open Res ; 6: 261, 2021.
Article in English | MEDLINE | ID: mdl-35299708

ABSTRACT

Hundreds of different mathematical models have been proposed for describing electrophysiology of various cell types. These models are quite complex (nonlinear systems of typically tens of ODEs and sometimes hundreds of parameters) and software packages such as the Cancer, Heart and Soft Tissue Environment (Chaste) C++ library have been designed to run simulations with these models in isolation or coupled to form a tissue simulation. The complexity of many of these models makes sharing and translating them to new simulation environments difficult. CellML is an XML format that offers a widely-adopted solution to this problem. This paper specifically describes the capabilities of two new Python tools: the cellmlmanip library for reading and manipulating CellML models; and chaste_codegen, a CellML to C++ converter. These tools provide a Python 3 replacement for a previous Python 2 tool (called PyCML) and they also provide additional new features that this paper describes. Most notably, they can generate analytic Jacobians without the use of proprietary software, and also find singularities occurring in equations and automatically generate and apply linear approximations to prevent numerical problems at these points.
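The singularity handling mentioned above can be illustrated with a generic example. Many electrophysiology rate expressions have a removable 0/0 singularity at a particular voltage, and a common remedy is to replace the expression by a straight line between two nearby points. The sketch below shows that idea in plain Python; it does not use the cellmlmanip or chaste_codegen APIs, and the function names and parameter values are hypothetical.

```python
import math

V0 = -47.0   # hypothetical voltage (mV) at which the expression is singular
A = 0.1      # hypothetical scale factor
K = 10.0     # hypothetical slope factor (mV)
EPS = 1e-4   # half-width of the window replaced by a linear approximation


def rate_raw(v: float) -> float:
    """Rate expression with a removable singularity (0/0) at v == V0."""
    return A * (v - V0) / (1.0 - math.exp(-(v - V0) / K))


def rate_safe(v: float) -> float:
    """Raw expression away from V0, linear interpolation inside the window."""
    if abs(v - V0) >= EPS:
        return rate_raw(v)
    lo, hi = V0 - EPS, V0 + EPS
    f_lo, f_hi = rate_raw(lo), rate_raw(hi)
    return f_lo + (v - lo) * (f_hi - f_lo) / (hi - lo)


print(rate_safe(V0))        # close to the analytic limit A * K
print(rate_safe(V0 + 5.0))  # unchanged away from the singularity
```

Evaluating `rate_raw(V0)` directly raises `ZeroDivisionError`, whereas `rate_safe` returns a value close to the analytic limit `A * K`.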

11.
Proc Biol Sci ; 286(1898): 20182418, 2019 03 13.
Article in English | MEDLINE | ID: mdl-30836875

ABSTRACT

Resolving the timing and pattern of early placental mammal evolution has been confounded by conflict among divergence date estimates from interpretation of the fossil record and from molecular-clock dating studies. Despite both fossil occurrences and molecular sequences favouring a Cretaceous origin for Placentalia, no unambiguous Cretaceous placental mammal has been discovered. Investigating the differing patterns of evolution in morphological and molecular data reveals a possible explanation for this conflict. Here, we quantified the relationship between morphological and molecular rates of evolution. We show that, independent of divergence dates, morphological rates of evolution were slow relative to molecular evolution during the initial divergence of Placentalia, but substantially increased during the origination of the extant orders. The rapid radiation of placentals into a highly morphologically disparate Cenozoic fauna is thus not associated with the origin of Placentalia, but post-dates superordinal origins. These findings predict that early members of major placental groups may not be easily distinguishable from one another or from stem eutherians on the basis of skeleto-dental morphology. This result supports a Late Cretaceous origin of crown placentals with an ordinal-level adaptive radiation in the early Paleocene, with the high relative rate permitting rapid anatomical change without requiring unreasonably fast molecular evolutionary rates. The lack of definitive Cretaceous placental mammals may be a result of morphological similarity among stem and early crown eutherians, providing an avenue for reconciling the fossil record with molecular divergence estimates for Placentalia.


Subject(s)
Biological Evolution , Eutheria/anatomy & histology , Phylogeny , Animals , Eutheria/classification , Evolution, Molecular
12.
Proc Natl Acad Sci U S A ; 116(12): 5693-5698, 2019 03 19.
Article in English | MEDLINE | ID: mdl-30819890

ABSTRACT

Recent sequencing efforts have led to estimates of human cytomegalovirus (HCMV) genome-wide intrahost diversity that rival those of persistent RNA viruses [Renzette N, Bhattacharjee B, Jensen JD, Gibson L, Kowalik TF (2011) PLoS Pathog 7:e1001344]. Here, we deep sequence HCMV genomes recovered from single and longitudinally collected blood samples from immunocompromised children to show that the observations of high within-host HCMV nucleotide diversity are explained by the frequent occurrence of mixed infections caused by genetically distant strains. To confirm this finding, we reconstructed within-host viral haplotypes from short-read sequence data. We verify that within-host HCMV nucleotide diversity in unmixed infections is no greater than that of other DNA viruses analyzed by the same sequencing and bioinformatic methods and considerably less than that of human immunodeficiency and hepatitis C viruses. By resolving individual viral haplotypes within patients, we reconstruct the timing, likely origins, and natural history of superinfecting strains. We uncover evidence for within-host recombination between genetically distinct HCMV strains, observing the loss of the parental virus containing the nonrecombinant fragment. The data suggest selection for strains containing the recombinant fragment, generating testable hypotheses about HCMV evolution and pathogenesis. These results highlight that high HCMV diversity present in some samples is caused by coinfection with multiple distinct strains and provide reassurance that within the host diversity for single-strain HCMV infections is no greater than for other herpesviruses.


Subject(s)
Cytomegalovirus/genetics , Recombination, Genetic/genetics , Superinfection/genetics , Base Sequence/genetics , Child , Child, Preschool , Cytomegalovirus Infections/virology , DNA, Viral/genetics , Female , Genetic Variation/genetics , Genome, Human/genetics , Genome, Viral , Haplotypes/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Immunocompromised Host/genetics , Infant , Infant, Newborn , Male , Sequence Analysis, DNA/methods
13.
Mol Biol Evol ; 35(7): 1783-1797, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29618097

ABSTRACT

Accurate reconstruction of ancestral states is a critical evolutionary analysis when studying ancient proteins and comparing biochemical properties between parental or extinct species and their extant relatives. It relies on multiple sequence alignment (MSA) which may introduce biases, and it remains unknown how MSA methodological approaches impact ancestral sequence reconstruction (ASR). Here, we investigate how MSA methodology modulates ASR using a simulation study of various evolutionary scenarios. We evaluate the accuracy of ancestral protein sequence reconstruction for simulated data and compare reconstruction outcomes using different alignment methods. Our results reveal biases introduced not only by aligner algorithms and assumptions, but also tree topology and the rate of insertions and deletions. Under many conditions we find no substantial differences between the MSAs. However, increasing the difficulty for the aligners can significantly impact ASR. The MAFFT consistency aligners and PRANK variants exhibit the best performance, whereas FSA displays limited performance. We also discover a bias towards reconstructed sequences longer than the true ancestors, deriving from a preference for inferring insertions, in almost all MSA methodological approaches. In addition, we find measures of MSA quality generally correlate highly with reconstruction accuracy. Thus, we show MSA methodological differences can affect the quality of reconstructions and propose MSA methods should be selected with care to accurately determine ancestral states with confidence.


Subject(s)
Genetic Techniques , Sequence Alignment
14.
Nucleic Acids Res ; 44(8): e77, 2016 05 05.
Article in English | MEDLINE | ID: mdl-26819408

ABSTRACT

Sequence Logos and their variants are the most commonly used method for visualization of multiple sequence alignments (MSAs) and sequence motifs. They provide consensus-based summaries of the sequences in the alignment. Consequently, individual sequences cannot be identified in the visualization and covariant sites are not easily discernible. We recently proposed Sequence Bundles, a motif visualization technique that maintains a one-to-one relationship between sequences and their graphical representation and visualizes covariant sites. We here present Alvis, an open-source platform for the joint explorative analysis of MSAs and phylogenetic trees, employing Sequence Bundles as its main visualization method. Alvis combines the power of the visualization method with an interactive toolkit allowing detection of covariant sites, annotation of trees with synapomorphies and homoplasies, and motif detection. It also offers numerical analysis functionality, such as dimension reduction and classification. Alvis is user-friendly, highly customizable and can export results in publication-quality figures. It is available as a full-featured standalone version (http://www.bitbucket.org/rfs/alvis) and its Sequence Bundles visualization module is further available as a web application (http://science-practice.com/projects/sequence-bundles).
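The limitation of sequence logos described above follows from how a logo is computed: each column is reduced to residue frequencies and an information content, so which sequence carried which residue is discarded. A minimal sketch of the per-column calculation, assuming a DNA alphabet of size 4 and omitting the small-sample correction (the function name and toy alignments are illustrative, not part of Alvis):

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 4) -> float:
    """Information content of one alignment column in bits.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residue frequencies, with no small-sample correction.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy


# Two hypothetical alignments with identical column frequencies but
# different pairings of residues across the two columns.
covarying = ["AT", "AT", "CG", "CG"]
shuffled = ["AT", "AG", "CT", "CG"]

for name, aln in [("covarying", covarying), ("shuffled", shuffled)]:
    columns = ["".join(seq[i] for seq in aln) for i in range(len(aln[0]))]
    print(name, [column_information(col) for col in columns])
```

Both alignments give the same per-column values, so a logo cannot distinguish the perfectly covarying pair of sites from the shuffled one.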


Subject(s)
Base Sequence/genetics , Computational Biology/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods
15.
Nature ; 513(7518): 422-425, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25043003

ABSTRACT

The somatic mutations present in the genome of a cell accumulate over the lifetime of a multicellular organism. These mutations can provide insights into the developmental lineage tree, the number of divisions that each cell has undergone and the mutational processes that have been operative. Here we describe whole genomes of clonal lines derived from multiple tissues of healthy mice. Using somatic base substitutions, we reconstructed the early cell divisions of each animal, demonstrating the contributions of embryonic cells to adult tissues. Differences were observed between tissues in the numbers and types of mutations accumulated by each cell, which likely reflect differences in the number of cell divisions they have undergone and varying contributions of different mutational processes. If somatic mutation rates are similar to those in mice, the results indicate that precise insights into development and mutagenesis of normal human cells will be possible.


Subject(s)
Cell Lineage/genetics , Clone Cells/cytology , Clone Cells/metabolism , Genome/genetics , Mutagenesis/genetics , Mutation/genetics , Animals , Biological Clocks/genetics , Cell Division , Cells, Cultured , Embryo, Mammalian/cytology , Humans , Male , Mice , Mice, Inbred C57BL , Mutation Rate , Organoids/cytology , Organoids/metabolism , Phylogeny , Sequence Analysis, DNA , Tail/cytology
16.
Genetics ; 197(1): 257-71, 2014 May.
Article in English | MEDLINE | ID: mdl-24532780

ABSTRACT

We develop a maximum penalized-likelihood (MPL) method to estimate the fitnesses of amino acids and the distribution of selection coefficients (S = 2Ns) in protein-coding genes from phylogenetic data. This improves on a previous maximum-likelihood method. Various penalty functions are used to penalize extreme estimates of the fitnesses, thus correcting overfitting by the previous method. Using a combination of computer simulation and real data analysis, we evaluate the effect of the various penalties on the estimation of the fitnesses and the distribution of S. We show the new method regularizes the estimates of the fitnesses for small, relatively uninformative data sets, but it can still recover the large proportion of deleterious mutations when present in simulated data. Computer simulations indicate that as the number of taxa in the phylogeny or the level of sequence divergence increases, the distribution of S can be more accurately estimated. Furthermore, the strength of the penalty can be varied to study how informative a particular data set is about the distribution of S. We analyze three protein-coding genes (the chloroplast rubisco protein, mammal mitochondrial proteins, and an influenza virus polymerase) and show the new method recovers a large proportion of deleterious mutations in these data, even under strong penalties, confirming the distribution of S is bimodal in these real data. We recommend the use of the new MPL approach for the estimation of the distribution of S in species phylogenies of protein-coding genes.
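The scaled selection coefficient S = 2Ns matters because of how it enters substitution rates. In the mutation-selection literature, a standard weak-selection approximation gives the substitution rate of a mutation relative to a neutral one as S / (1 - e^(-S)). The abstract does not state this formula, so the sketch below is an assumption about the general modelling framework rather than a reproduction of the paper's method; the function name is illustrative.

```python
import math


def relative_fixation_rate(S: float) -> float:
    """Substitution rate of a mutation with scaled selection coefficient S,
    relative to a neutral mutation, under the weak-selection approximation
    S / (1 - exp(-S)).
    """
    if abs(S) < 1e-8:
        return 1.0   # limit as S -> 0 (neutral mutation)
    if S < -700.0:
        return 0.0   # avoids floating-point overflow; the true value is negligible
    return S / -math.expm1(-S)


for S in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"S = {S:+.0f}: relative rate = {relative_fixation_rate(S):.4g}")
```

Under this approximation strongly deleterious mutations almost never fix, which is why the distribution of S among new mutations can differ sharply from the distribution among observed substitutions.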


Subject(s)
Computer Simulation , Evolution, Molecular , Phylogeny , Animals , Base Sequence , Genetic Fitness , Humans , Likelihood Functions , Mutation , Selection, Genetic
17.
Nat Genet ; 45(5): 542-545, 2013 May.
Article in English | MEDLINE | ID: mdl-23563608

ABSTRACT

The blood group Vel was discovered 60 years ago, but the underlying gene is unknown. Individuals negative for the Vel antigen are rare and are required for the safe transfusion of patients with antibodies to Vel. To identify the responsible gene, we sequenced the exomes of five individuals negative for the Vel antigen and found that four were homozygous and one was heterozygous for a low-frequency 17-nucleotide frameshift deletion in the gene encoding the 78-amino-acid transmembrane protein SMIM1. A follow-up study showing that 59 of 64 Vel-negative individuals were homozygous for the same deletion and expression of the Vel antigen on SMIM1-transfected cells confirm SMIM1 as the gene underlying the Vel blood group. An expression quantitative trait locus (eQTL), the common SNP rs1175550, contributes to variable expression of the Vel antigen (P = 0.003) and influences the mean hemoglobin concentration of red blood cells (RBCs; P = 8.6 × 10^-15). In vivo, zebrafish with smim1 knockdown showed a mild reduction in the number of RBCs, identifying SMIM1 as a new regulator of RBC formation. Our findings are of immediate relevance, as the homozygous presence of the deletion allows the unequivocal identification of Vel-negative blood donors.


Subject(s)
Blood Group Antigens/genetics , Erythrocyte Membrane/metabolism , Erythrocytes/immunology , Gene Deletion , Homozygote , Membrane Proteins/genetics , Quantitative Trait Loci , Alleles , Animals , Biomarkers/metabolism , Blood Group Antigens/immunology , Blood Group Antigens/metabolism , Electrophoretic Mobility Shift Assay , Erythrocytes/metabolism , Erythrocytes/pathology , Exome/genetics , Female , Gene Expression Profiling , Gene Regulatory Networks , Humans , Isoantibodies/immunology , Membrane Proteins/immunology , Membrane Proteins/metabolism , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Pregnancy , Zebrafish/genetics
18.
Genetics ; 190(3): 1101-15, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22209901

ABSTRACT

Estimation of the distribution of selection coefficients of mutations is a long-standing issue in molecular evolution. In addition to population-based methods, the distribution can be estimated from DNA sequence data by phylogenetic-based models. Previous models have generally found unimodal distributions where the probability mass is concentrated between mildly deleterious and nearly neutral mutations. Here we use a sitewise mutation-selection phylogenetic model to estimate the distribution of selection coefficients among novel and fixed mutations (substitutions) in a data set of 244 mammalian mitochondrial genomes and a set of 401 PB2 proteins from influenza. We find a bimodal distribution of selection coefficients for novel mutations in both the mitochondrial data set and for the influenza protein evolving in its natural reservoir, birds. Most of the mutations are strongly deleterious with the rest of the probability mass concentrated around mildly deleterious to neutral mutations. The distribution of the coefficients among substitutions is unimodal and symmetrical around nearly neutral substitutions for both data sets at adaptive equilibrium. About 0.5% of the nonsynonymous mutations and 14% of the nonsynonymous substitutions in the mitochondrial proteins are advantageous, with 0.5% and 24% observed for the influenza protein. Following a host shift of influenza from birds to humans, however, we find among novel mutations in PB2 a trimodal distribution with a small mode of advantageous mutations.


Subject(s)
Models, Genetic , Mutation , Phylogeny , Selection, Genetic , Algorithms , Animals , Computer Simulation , Evolution, Molecular , Genetic Drift , Humans , Reproducibility of Results
19.
PLoS One ; 6(5): e19953, 2011.
Article in English | MEDLINE | ID: mdl-21647447

ABSTRACT

BACKGROUND: XMRV is the most recently described retrovirus to be found in Man, firstly in patients with prostate cancer (PC) and secondly in 67% of patients with chronic fatigue syndrome (CFS) and 3.7% of controls. Both disease associations remain contentious. Indeed, a recent publication has concluded that "XMRV is unlikely to be a human pathogen". Subsequently, related but different polytropic MLV (pMLV) sequences were also reported from the blood of 86.5% of patients with CFS and 6.8% of controls. Consequently, we decided to investigate blood donors for evidence of XMRV/pMLV. METHODOLOGY/PRINCIPAL FINDINGS: Testing of cDNA prepared from the whole blood of 80 random blood donors generated gag PCR signals from two samples (7C and 9C). These had previously tested negative for XMRV by two other PCR-based techniques. To test whether the PCR mix was the source of these sequences, 88 replicates of water were amplified using Invitrogen Platinum Taq (IPT) and Applied Biosystems Taq Gold LD (ABTG). Four gag sequences (2D, 3F, 7H, 12C) were generated with the IPT, a further sequence (12D) by ABTG re-amplification of an IPT first round product. Sequence comparisons revealed remarkable similarities between these sequences, endogenous MLVs and the pMLV sequences reported in patients with CFS. CONCLUSIONS/SIGNIFICANCE: Methodologies for the detection of viruses highly homologous to endogenous murine viruses require special caution as the very reagents used in the detection process can be a source of contamination and at a level where it is not immediately apparent. It is suggested that such contamination is likely to explain the apparent presence of pMLV in CFS.


Subject(s)
DNA Contamination , Polymerase Chain Reaction/methods , Animals , Humans , Indicators and Reagents , Mice , RNA, Viral/blood , RNA, Viral/genetics , Xenotropic murine leukemia virus-related virus/genetics
20.
Mol Biol Evol ; 28(6): 1755-67, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21109586

ABSTRACT

Four influenza pandemics have struck the human population during the last 100 years causing substantial morbidity and mortality. The pandemics were caused by the introduction of a new virus into the human population from an avian or swine host or through the mixing of virus segments from an animal host with a human virus to create a new reassortant subtype virus. Understanding which changes have contributed to the adaptation of the virus to the human host is essential in assessing the pandemic potential of current and future animal viruses. Here, we develop a measure of the level of adaptation of a given virus strain to a particular host. We show that adaptation to the human host has been gradual with a timescale of decades and that none of the virus proteins have yet achieved full adaptation to the selective constraints. When the measure is applied to historical data, our results indicate that the 1918 influenza virus had undergone a period of preadaptation prior to the 1918 pandemic. Yet, ancestral reconstruction of the avian virus that founded the classical swine and 1918 human influenza lineages shows no evidence that this virus was exceptionally preadapted to humans. These results indicate that adaptation to humans occurred following the initial host shift from birds to mammals, including a significant amount prior to 1918. The 2009 pandemic virus seems to have undergone preadaptation to human-like selective constraints during its period of circulation in swine. Ancestral reconstruction along the human virus tree indicates that mutations that have increased the adaptation of the virus have occurred preferentially along the trunk of the tree. The method should be helpful in assessing the potential of current viruses to found future epidemics or pandemics.


Subject(s)
Adaptation, Biological , Host-Pathogen Interactions , Orthomyxoviridae Infections/immunology , Orthomyxoviridae/immunology , Algorithms , Animals , Birds , Databases, Genetic , Dogs , Genetic Fitness , Host-Pathogen Interactions/immunology , Humans , Models, Biological , Orthomyxoviridae Infections/epidemiology , Pandemics , Phylogeny , Viral Matrix Proteins/chemistry , Viral Matrix Proteins/genetics