Results 1 - 20 of 37
1.
Oncogene ; 26(13): 1959-70, 2007 Mar 22.
Article in English | MEDLINE | ID: mdl-17001317

ABSTRACT

We analysed 148 primary breast cancers using BAC-arrays containing 287 clones representing cancer-related gene/loci to obtain genomic molecular portraits. Gains were detected in 136 tumors (91.9%) and losses in 123 tumors (83.1%). Eight tumors (5.4%) did not have any genomic aberrations in the 281 clones analysed. Common (more than 15% of the samples) gains were observed at 8q11-qtel, 1q21-qtel, 17q11-q12 and 11q13, whereas common losses were observed at 16q12-qtel, 11ptel-p15.5, 1p36-ptel, 17p11.2-p12 and 8ptel-p22. Patients with tumors registering either less than 5% (median value) or less than 11% (third quartile) total copy number changes had a better overall survival (log-rank test: P=0.0417 and P=0.0375, respectively). Unsupervised hierarchical clustering based on copy number changes identified four clusters. Women with tumors from the cluster with amplification of three regions containing known breast oncogenes (11q13, 17q12 and 20q13) had a worse prognosis. The good prognosis group (Nottingham Prognostic Index (NPI)

Subject(s)
Breast Neoplasms/genetics , Genome , Nucleic Acid Hybridization , Chromosome Mapping , Cohort Studies , Humans , Survival Analysis
2.
Bioinformatics ; 22(9): 1144-6, 2006 May 01.
Article in English | MEDLINE | ID: mdl-16533818

ABSTRACT

SUMMARY: We have developed a new method (BioHMM) for segmenting array comparative genomic hybridization data into states with the same underlying copy number. By utilizing a heterogeneous hidden Markov model, BioHMM incorporates relevant biological factors (e.g. the distance between adjacent clones) in the segmentation process.


Subject(s)
Algorithms , Chromosome Mapping/methods , Gene Dosage/genetics , In Situ Hybridization/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Software , Artificial Intelligence , Base Sequence , Markov Chains , Models, Genetic , Models, Statistical , Molecular Sequence Data , Pattern Recognition, Automated/methods
3.
J Bacteriol ; 183(20): 5997-6008, 2001 Oct.
Article in English | MEDLINE | ID: mdl-11567000

ABSTRACT

Sequences of the major outer membrane protein (MOMP) gene (ompA) and the outer membrane complex B protein gene (omcB) from Chlamydia trachomatis, Chlamydia pneumoniae, and Chlamydia psittaci were analyzed for evidence of intragenic recombination and for linkage equilibrium. The Sawyer runs test, compatibility matrices, and index of association analyses provided substantial evidence that there has been a history of intragenic recombination at ompA including one instance of interspecies recombination between the C. trachomatis mouse pneumonitis strain and the C. pneumoniae horse N16 strain. Although none of these methods detected intragenic recombination within omcB, differences in divergence reported in earlier studies suggested that there has been intergenic recombination involving omcB, and the analyses presented in this study are consistent with this. For C. trachomatis, index-of-association analyses suggested a higher degree of recombination for C class than for B class strains and a higher degree of recombination in the downstream half of ompA. In concordance with these findings, many significant breakpoints were found in variable segments 3 and 4 of MOMP for the recombinant strains D/B120, G/UW-57, E/Bour, and LGV-98 identified in this study. We provide examples of how genetic diversity generated by repeated recombination in these regions may be associated with evasion of immune surveillance, serovar-specific differences in tissue tropism, and persistence.


Subject(s)
Bacterial Outer Membrane Proteins/genetics , Chlamydia Infections/microbiology , Chlamydia/genetics , Chlamydia/pathogenicity , Genes, Bacterial , Recombination, Genetic , Animals , Chlamydia/classification , Chlamydia/immunology , Chlamydia Infections/immunology , Chlamydia trachomatis/classification , Chlamydia trachomatis/genetics , Chlamydia trachomatis/immunology , Chlamydia trachomatis/pathogenicity , Chlamydophila pneumoniae/classification , Chlamydophila pneumoniae/genetics , Chlamydophila pneumoniae/immunology , Chlamydophila pneumoniae/pathogenicity , Chlamydophila psittaci/classification , Chlamydophila psittaci/genetics , Chlamydophila psittaci/immunology , Chlamydophila psittaci/pathogenicity , Chromosome Breakage , Genetic Linkage , Humans , Molecular Sequence Data , Phylogeny , Serotyping , Species Specificity
4.
Proc Natl Acad Sci U S A ; 98(19): 10839-44, 2001 Sep 11.
Article in English | MEDLINE | ID: mdl-11517339

ABSTRACT

The stem cells that maintain human colon crypts are poorly characterized. To better determine stem cell numbers and how they divide, epigenetic patterns were used as cell fate markers. Methylation exhibits somatic inheritance and random changes that potentially record lifelong stem cell division histories as binary strings or tags in adjacent CpG sites. Methylation tag contents of individual crypts were sampled with bisulfite sequencing at three presumably neutral loci. Methylation increased with aging but varied between crypts and was mosaic within single crypts. Some crypts appeared to be quasi-clonal as they contained more unique tags than expected if crypts were maintained by single immortal stem cells. The complex epigenetic patterns were more consistent with a crypt niche model wherein multiple stem cells were present and replaced through periodic symmetric divisions. Methylation tags provide evidence that normal human crypts are long-lived, accumulate random methylation errors, and contain multiple stem cells that go through "bottlenecks" during life.


Subject(s)
Colon/cytology , CpG Islands , DNA Methylation , Homeodomain Proteins/genetics , MyoD Protein/genetics , Proteoglycans/genetics , Stem Cells/cytology , Transcription Factors/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Aging/genetics , Base Sequence , Biglycan , Extracellular Matrix Proteins , Female , Homeobox Protein Nkx-2.5 , Humans , Infant , Male , Middle Aged , Molecular Sequence Data , Mosaicism
6.
Pac Symp Biocomput ; : 215-25, 2001.
Article in English | MEDLINE | ID: mdl-11262942

ABSTRACT

We investigated whether or not evolutionary change in DNA sequence data was homogeneous across different classes of base pairs. DNA sequences for eight protein-coding mitochondrial genes were obtained for 38 vertebrate taxa from GenBank. Each nucleotide site in the alignment was classified according to a number of covariates, including its codon position, genetic code degeneracy, and hydrophobicity. The evolutionary transition matrix for each base was estimated by tracing implied character changes under parsimony on a known phylogenetic tree. Canonical variates analyses of the inferred transition matrices were performed for each gene to determine whether or not different classes of bases behaved similarly. We found five distinct clusters of transition matrices that could be roughly defined by combinations of codon position and degeneracy. This pattern was consistent among all genes. A stochastic model of rate variation based on the interaction of the covariates was developed to assess the statistical significance of the clusters. The five-group classification was found to explain significantly more sequence variation than did a codon-only classification, a codon degeneracy classification, or a codon and degeneracy classification. The same five-group classification was found for all genes tested, suggesting a common process underlying the molecular evolution of the mitochondrial genome. These results confirm that there are classes of base pairs that evolve differently, and suggest that models of sequence evolution that incorporate covariate information may be useful in developing nucleotide substitution models that more accurately reflect evolutionary history.


Subject(s)
DNA, Mitochondrial/chemistry , DNA, Mitochondrial/genetics , Models, Genetic , Phylogeny , Animals , Databases, Factual , Evolution, Molecular , Genetic Variation , Humans , Likelihood Functions , Models, Statistical , Stochastic Processes
7.
Genetics ; 156(3): 1427-36, 2000 Nov.
Article in English | MEDLINE | ID: mdl-11063714

ABSTRACT

We describe a Markov chain Monte Carlo approach for assessing the role of site-to-site rate variation in the analysis of within-population samples of DNA sequences using the coalescent. Our framework is a Bayesian one. We discuss methods for assessing the goodness-of-fit of these models, as well as problems concerning the separate estimation of effective population size and mutation rate. Using a mitochondrial data set for illustration, we show that ancestral inference concerning coalescence times can be dramatically affected if rate variation is ignored.


Subject(s)
Biological Evolution , Genetic Variation , Models, Genetic , Models, Statistical , Algorithms , Bayes Theorem , DNA/genetics , DNA, Mitochondrial/genetics , Markov Chains , Mutation , Phylogeny , Probability , Reproducibility of Results , Time
8.
Genetics ; 156(1): 401-9, 2000 Sep.
Article in English | MEDLINE | ID: mdl-10978303

ABSTRACT

We develop a Markov chain Monte Carlo approach for estimating the distribution of the age of a mutation that is assumed to have arisen just once in the history of the population of interest. We assume that in addition to the presence or absence of this mutation in a sample of chromosomes, we have DNA sequence data from a region completely linked to the mutant site. We apply our method to a mitochondrial data set in which the DNA sequence data come from hypervariable region I and the mutation of interest is the 9-bp region V deletion.


Subject(s)
Models, Genetic , Polymorphism, Genetic , DNA, Mitochondrial/genetics , Emigration and Immigration , Genetic Linkage , Humans , Indians, North American/genetics , Markov Chains , Monte Carlo Method , Mutation , Sequence Deletion , Time Factors
9.
J Comput Biol ; 7(1-2): 47-57, 2000.
Article in English | MEDLINE | ID: mdl-10890387

ABSTRACT

We provide both theoretical and simulation results on the progress of an STS mapping project in the presence of clone length inhomogeneity. For an example in which the genome comprises alternating regions of clones with short and long average length, the main conclusion is that the efficiency of the project is clearly decreased in the presence of such inhomogeneity. The case of deterministic clone length gives the worst progress. The general simulation algorithm we propose shows that strategies that space the anchors as regularly as possible do best: fewer contigs of larger average length are expected. The simulation algorithm can be used to study many statistical properties of the progress of any anchoring project.


Subject(s)
Human Genome Project , Sequence Tagged Sites , Algorithms , Biometry , Cloning, Molecular , Computer Simulation
10.
Proc Natl Acad Sci U S A ; 97(3): 1236-41, 2000 Feb 01.
Article in English | MEDLINE | ID: mdl-10655514

ABSTRACT

It is difficult to observe human tumor progression as precursor lesions are systematically removed. Alternatives to direct observations, commonly used to reveal the hidden past of species and populations, are sequence comparisons or molecular clocks. Noncoding microsatellite (MS) loci were employed as molecular tumor clocks in 13 human mutator phenotype (MSI(+)) colorectal tumors. Quantitative analysis revealed that specific patterns of somatic MS mutations accumulate with division after loss of mismatch repair (MMR). Tumors had unique patterns of MS mutation, and, therefore, based on this model, each tumor had its own unique history. Loss of MMR occurred very early relative to terminal clonal expansion, with an estimated average of 2,300 divisions since loss of MMR and 280 divisions since expansion. Contrary to the classical adenoma-cancer sequence, MSI(+) adenomas were nearly as old as cancers (2,000 versus 2,400 divisions since loss of MMR). Negative clinical examinations preceded six tumors, independently documenting an absence of visible precursors during early MSI(+) adenoma or cancer progression. These findings further extend a window beyond visible progression since loss of MMR appears to start a genetic phase involving clone sizes or phenotypes below a threshold of clinical detection. This previously occult prologue before visible neoplasia is longer and therefore likely more important than generally appreciated.


Subject(s)
Adenocarcinoma/genetics , Adenoma/genetics , Colonic Neoplasms/genetics , Colonic Polyps/genetics , Colorectal Neoplasms/genetics , Rectal Neoplasms/genetics , Adenocarcinoma/pathology , Adenoma/pathology , Cell Division , Clone Cells/pathology , Colonic Neoplasms/pathology , Colonic Polyps/pathology , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , DNA Mutational Analysis , DNA Repair/genetics , DNA, Neoplasm/genetics , Disease Progression , Humans , Male , Microsatellite Repeats , Precancerous Conditions/genetics , Rectal Neoplasms/pathology
11.
Mutat Res ; 406(2-4): 115-20, 1999 Aug.
Article in English | MEDLINE | ID: mdl-10479728

ABSTRACT

DNA sequence polymorphisms were sought in the coding region and at the exon-intron boundaries of the human XPF gene, which plays a role in nucleotide excision repair. Based on a survey of 38 individuals, we found six single nucleotide polymorphisms, one in the 5' non-coding region of the XPF gene, and five in the 2751 bp coding region. At each site, the frequency of the rarer allele varies from about 0.01 to over 0.38. Except for the 5' non-coding and one coding sequence polymorphism, the rarer alleles for the remaining four polymorphisms were found only in heterozygotes. Of the five polymorphisms in the coding region, one is silent, one results in a conserved amino acid difference, and the remaining three result in non-conserved amino acid differences. Because of its biological function in nucleotide excision repair, functionally significant XPF gene polymorphisms are candidates for influencing cancer susceptibility and overall genetic stability. Nucleotide sequence diversity estimates for XPF are similar to those for the lipoprotein lipase and beta-globin genes.


Subject(s)
DNA Repair/genetics , DNA-Binding Proteins/genetics , Alleles , DNA/chemistry , DNA/genetics , DNA Mutational Analysis , Gene Frequency , Humans , Male , Point Mutation , Polymorphism, Genetic , Polymorphism, Single-Stranded Conformational
12.
Genetics ; 152(3): 1091-101, 1999 Jul.
Article in English | MEDLINE | ID: mdl-10388827

ABSTRACT

With 10 segregating sites (simple nucleotide polymorphisms) in the last intron (1089 bp) of the ZFX gene we have observed 11 haplotypes in 336 chromosomes representing a worldwide array of 15 human populations. Two haplotypes representing 77% of all chromosomes were distributed almost evenly among four continents. Five of the remaining haplotypes were detected in Africa and 4 others were restricted to Eurasia and the Americas. Using the information about the ancestral state of the segregating positions (inferred from human-great ape comparisons), we applied coalescent analysis to estimate the age of the polymorphisms and the resulting haplotypes. The oldest haplotype, with the ancestral alleles at all the sites, was observed at low frequency only in two groups of African origin. Its estimated age of 740 to 1100 kyr corresponded to the time to the most recent common ancestor. The two most frequent worldwide distributed haplotypes were estimated at 550 to 840 and 260 to 400 kyr, respectively, while the age of the continentally restricted polymorphisms was 120 to 180 kyr and smaller. Comparison of spatial and temporal distribution of the ZFX haplotypes suggests that modern humans diverged from the common ancestral stock in the Middle Paleolithic era. Subsequent range expansion prevented substantial gene flow among continents, separating African groups from populations that colonized Eurasia and the New World.


Subject(s)
DNA-Binding Proteins/genetics , Genealogy and Heraldry , Haplotypes , Introns , Polymorphism, Genetic , Humans , Kruppel-Like Transcription Factors , Male , Models, Genetic , Time Factors , Transcription Factors , X Chromosome
13.
Am J Pathol ; 154(6): 1815-24, 1999 Jun.
Article in English | MEDLINE | ID: mdl-10362806

ABSTRACT

Colorectal cancer progression involves changes in phenotype and genotype. Although usually illustrated as a linear process, more complex underlying pathways have not been excluded. The object of this paper is to apply modern quantitative principles of molecular evolution to multistep tumor progression. To reconstruct progression lineages, the genotypes of two adjacent adenoma-cancer pairs were determined by serial dilution and polymerase chain reaction at 28-30 microsatellite (MS) loci and then traced back to their most recent common ancestor. The tumors were mismatch repair deficient, and therefore relatively large numbers of MS mutations should accumulate during progression. As expected, the MS genotypes were similar (correlation coefficients >0.9) between different parts of the same adenoma or cancer, but very different (correlation coefficients <0.2) between unrelated metachronous adenoma-cancer pairs. Unexpectedly, the genotypes of the adjacent adenoma-cancer pairs were also very different (correlation coefficients of 0.30 and 0.36), consistent with early adenoma-cancer divergence rather than direct linear progression. More than 60% of the divisions occurred after this early adenoma-cancer divergence. Therefore, the tumor phylogenies were not consistent with sequential stepwise selection along a single most "fit" and frequent lineage from adenoma to cancer. Instead, one effective early progression strategy creates and maintains multiple evolving candidate lineages, which are subsequently selected for terminal clonal expansion.


Subject(s)
Adenoma/genetics , Cell Transformation, Neoplastic/genetics , Colorectal Neoplasms/genetics , Neoplasms, Multiple Primary/genetics , Neoplasms, Second Primary/genetics , Adult , Cell Division/genetics , Cell Lineage/genetics , Clone Cells , Disease Progression , Genotype , Humans , Male , Microsatellite Repeats , Polymerase Chain Reaction , Time
14.
Hum Mol Genet ; 8(2): 173-83, 1999 Feb.
Article in English | MEDLINE | ID: mdl-9931325

ABSTRACT

Trinucleotide repeat disease alleles can undergo 'dynamic' mutations in which repeat number may change when a gene is transmitted from parent to offspring. By typing >3500 sperm, we determined the size distribution of Huntington's disease (HD) germline mutations produced by 26 individuals from the Venezuelan cohort with CAG/CTG repeat numbers ranging from 37 to 62. Both the mutation frequency and mean change in allele size increased with increasing somatic repeat number. The mutation frequencies averaged 82% and, for individuals with at least 50 repeats, 98%. The extraordinarily high mutation frequency levels are most consistent with a mutation process that occurs throughout germline mitotic divisions, rather than resulting from a single meiotic event. In several cases, the mean change in repeat number differed significantly among individuals with similar somatic allele sizes. This individual variation could not be attributed to age in a simple way or to 'cis' sequences, suggesting the influence of genetic background or other factors. A familial effect is suggested in one family where both the father and son gave highly unusual spectra compared with other individuals matched for age and repeat number. A statistical model based on incomplete processing of Okazaki fragments during DNA replication was found to provide an excellent fit to the data but variation in parameter values among individuals suggests that the molecular mechanism might be more complex.


Subject(s)
Genes/genetics , Germ-Line Mutation , Huntington Disease/genetics , Mitosis/genetics , Adolescent , Adult , Aged , Alleles , Cohort Studies , DNA/genetics , Family Health , Humans , Male , Middle Aged , Models, Biological , Spermatozoa/metabolism , Trinucleotide Repeat Expansion/genetics , Trinucleotide Repeats/genetics
15.
Genetics ; 145(2): 505-18, 1997 Feb.
Article in English | MEDLINE | ID: mdl-9071603

ABSTRACT

The paper is concerned with methods for the estimation of the coalescence time (time since the most recent common ancestor) of a sample of intraspecies DNA sequences. The methods take advantage of prior knowledge of population demography, in addition to the molecular data. While some theoretical results are presented, a central focus is on computational methods. These methods are easy to implement, and, since explicit formulae tend to be either unavailable or unilluminating, they are also more useful and more informative in most applications. Extensions are presented that allow for the effects of uncertainty in our knowledge of population size and mutation rates, for variability in population sizes, for regions of different mutation rate, and for inference concerning the coalescence time of the entire population. The methods are illustrated using recent data from the human Y chromosome.
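As a minimal illustration of the quantity being estimated, the coalescence time of a sample can be simulated under the standard constant-size coalescent. This is a sketch only: it is not the paper's method, which also uses the molecular data and prior demographic knowledge. The function names and the choice of coalescent time units are assumptions made for the example.

```python
import random


def simulate_tmrca(n, rng):
    """Draw one time to the most recent common ancestor for n sequences.

    Time is measured in coalescent units (an assumption of this sketch).
    While k lineages remain, the waiting time to the next coalescence
    is exponential with rate k * (k - 1) / 2.
    """
    total = 0.0
    for k in range(n, 1, -1):
        total += rng.expovariate(k * (k - 1) / 2)
    return total


def expected_tmrca(n):
    """Closed-form mean of the same quantity: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


def mean_tmrca(n, replicates=20000, seed=1):
    """Monte Carlo estimate of the mean coalescence time."""
    rng = random.Random(seed)
    draws = (simulate_tmrca(n, rng) for _ in range(replicates))
    return sum(draws) / replicates
```

Comparing `mean_tmrca(n)` with `expected_tmrca(n)` is a quick sanity check before adding complications such as variable population size.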


Subject(s)
Algorithms , DNA , Databases, Factual , Humans , Time Factors
16.
Am J Hum Genet ; 59(4): 772-80, 1996 Oct.
Article in English | MEDLINE | ID: mdl-8808591

ABSTRACT

The human mitochondrial mutation mtDNA4977 is a 4,977-bp deletion that originates between two 13-bp direct repeats. We grew 220 colonies of cells, each from a single human cell. For each colony, we counted the number of cells and amplified the DNA by PCR to test for the presence of a deletion. To estimate the mutation rate, we used a model that describes the relationship between the mutation rate and the probability that a colony of a given size will contain no mutants, taking into account such factors as possible mitochondrial turnover and mistyping due to PCR error. We estimate that the mutation rate for mtDNA4977 in cultured human cells is 5.95 × 10^-8 per mitochondrial genome replication. This method can be applied to specific chromosomal, as well as mitochondrial, mutations.
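The core idea, relating the mutation rate to the probability that a colony contains no mutants, can be sketched in a deliberately simplified form. The sketch assumes that a colony with R genome replications is mutant-free with probability exp(-mu * R), and it ignores the mitochondrial turnover and PCR mistyping that the abstract says the actual model accounts for. All names, and the toy data in the usage note, are hypothetical.

```python
import math


def log_likelihood(mu, colonies):
    """Log-likelihood of rate mu.

    colonies: list of (replications, has_deletion) pairs, where
    replications is the assumed number of genome replications in the colony.
    """
    total = 0.0
    for replications, has_deletion in colonies:
        log_p0 = -mu * replications  # log P(colony has no mutant)
        if has_deletion:
            total += math.log(-math.expm1(log_p0))  # log(1 - p0)
        else:
            total += log_p0
    return total


def estimate_rate(colonies, low=1e-12, high=1e-2, iterations=200):
    """Maximise the log-likelihood by ternary search on log(mu)."""
    lo, hi = math.log(low), math.log(high)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(math.exp(m1), colonies) < log_likelihood(math.exp(m2), colonies):
            lo = m1
        else:
            hi = m2
    return math.exp((lo + hi) / 2)


def p0_estimate(n_colonies, n_without_mutants, replications):
    """Closed form when every colony has the same number of replications."""
    return -math.log(n_without_mutants / n_colonies) / replications
```

With made-up data such as `[(1e6, False)] * 94 + [(1e6, True)] * 6`, `estimate_rate` should agree with `p0_estimate(100, 94, 1e6)`; the search-based version is only needed when colony sizes differ.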


Subject(s)
DNA, Mitochondrial/genetics , Gene Deletion , Adult , Cell Line , Clone Cells , Female , Humans , Models, Genetic , Polymerase Chain Reaction
18.
Hum Mol Genet ; 4(9): 1519-26, 1995 Sep.
Article in English | MEDLINE | ID: mdl-8541834

ABSTRACT

The CAG triplet repeat region of the Huntington's disease gene was amplified in 923 single sperm from three affected and two normal individuals. Average-size alleles (15-18 repeats) showed only three contraction mutations among 475 sperm (0.6%). A 30 repeat normal allele showed an 11% mutation frequency. The mutation frequency of a 36 repeat intermediate allele was 53% with 8% of all gametes having expansions which brought the allele size into the HD disease range (≥38 repeats). Disease alleles (38-51 repeats) showed a very high mutation frequency (92-99%). As repeat number increased there was a marked elevation in the frequency of expansions, in the mean number of repeats added per expansion and the size of the largest observed expansion. Contraction frequencies also appeared to increase with allele size but decreased as repeat number exceeded 36. Our sperm typing data are of a discrete nature rather than consisting of smears of PCR product from pooled sperm. This allowed the observed mutation frequency spectra to be compared to the distribution calculated using discrete stochastic models based on current molecular ideas of the expansion process. An excellent fit was found when the model specified that a random number of repeats are added during the progression of the polymerase through the repeated region.


Subject(s)
Gene Frequency , Huntington Disease/genetics , Mutation , Spermatozoa/metabolism , Trinucleotide Repeats , Alleles , Base Sequence , DNA Primers , Humans , Male , Molecular Sequence Data
19.
Math Biosci ; 127(1): 77-98, 1995 May.
Article in English | MEDLINE | ID: mdl-7734858

ABSTRACT

The infinitely-many-sites process is often used to model the sequence variability observed in samples of DNA sequences. Despite its popularity, the sampling theory of the process is rather poorly understood. We describe the tree structure underlying the model and show how this may be used to compute the probability of a sample of sequences. We show how to produce the unrooted genealogy from a set of sites in which the ancestral labeling is unknown and from this the corresponding rooted genealogies. We derive recursions for the probability of the configuration of sequences (equivalently, of trees) in both the rooted and unrooted cases. We give a computational method based on Monte Carlo recursion that provides approximations to sampling probabilities for samples of any size. Among several applications, this algorithm may be used to find maximum likelihood estimators of the substitution rate, both when the ancestral labeling of sites is known and when it is unknown.
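A tree structure of this kind exists for a sample only if the data are compatible with the infinitely-many-sites assumption. A common way to check this when the ancestral labeling is unknown is the four-gamete test, sketched below. The test is a standard tool and is not taken from the abstract; the encoding of sequences as strings of '0' and '1' is an assumption for the example.

```python
from itertools import combinations


def compatible_with_infinite_sites(sequences):
    """Four-gamete test for binary sequences of equal length.

    Returns True when no pair of sites shows all four combinations
    00, 01, 10 and 11 across the sample, which is the condition for an
    unrooted genealogy to exist without recurrent mutation.
    """
    n_sites = len(sequences[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(s[i], s[j]) for s in sequences}
        if len(gametes) == 4:
            return False
    return True
```

For instance, the sample `["00", "01", "10"]` passes, whereas adding `"11"` makes the two sites incompatible.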


Subject(s)
Genealogy and Heraldry , Models, Genetic , Biological Evolution , DNA/genetics , Genetics, Population , Mathematics
20.
J Math Biol ; 33(6): 602-18, 1995.
Article in English | MEDLINE | ID: mdl-7608640

ABSTRACT

Population geneticists have long been interested in the behavior of rare variants. The definition of a rare variant has been the subject of some debate, centered mainly on whether alleles with small relative frequency should be considered rare, or whether alleles with small numbers should be. We study the behavior of the counts of rare alleles in samples taken from a population genetics model that allows for selection and infinitely-many-alleles mutation structure. We show that in large samples the counts of rare alleles--those represented once, twice, ...--are approximately distributed as a Poisson process, with a parameter that depends on the total mutation rate, but not on the selection parameters. This result is applied to the problem of estimating the fraction of neutral mutations.
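For orientation, the neutral special case of such a Poisson limit has a well-known form, written out below. The notation is an assumption, since the abstract gives none: C_j(n) denotes the number of alleles represented exactly j times in a sample of size n, and θ is the scaled total mutation rate. The abstract's claim is that the selection parameters do not enter the limit for rare alleles.

```latex
\[
  \bigl(C_1(n), C_2(n), \dots, C_k(n)\bigr)
  \;\xrightarrow{\;d\;}\;
  (Z_1, Z_2, \dots, Z_k),
  \qquad
  Z_j \sim \mathrm{Poisson}\!\left(\frac{\theta}{j}\right)
  \text{ independently, as } n \to \infty .
\]
```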


Subject(s)
Alleles , Genetics, Population , Models, Genetic , Models, Theoretical , Gene Frequency , Mutation , Poisson Distribution , Selection, Genetic