1.
Article in English | MEDLINE | ID: mdl-24334393

ABSTRACT

Principal component (PC) plots have become widely used to summarize genetic variation of individuals in a sample. The similarity between genetic distance in PC plots and geographical distance has been shown to be quite impressive. However, in most situations, individual ancestral origins are not precisely known or they are heterogeneously distributed; hence, they can hardly be linked to a geographical area. We have developed GeneOnEarth, a user-friendly web-based tool to help geneticists understand whether a linear isolation-by-distance model may apply to a genetic data set, that is, whether genetic distances among a set of individuals resemble geographical distances among their origins. Its main goal is to allow users to first apply a by-view Procrustes method to visually learn whether this model holds. To do that, the user can choose the exact geographical area from an online 2D or 3D world map by using, respectively, Google Maps or Google Earth, and rotate, flip, and resize the images. GeneOnEarth can also compute the optimal rotation angle using Procrustes analysis and assess statistical evidence of similarity when a different rotation angle has been chosen by the user. An online version of GeneOnEarth is available for testing and use at http://bios.ugr.es/GeneOnEarth.
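
The rotation step mentioned in the abstract can be illustrated with a minimal sketch. It is not GeneOnEarth's code: the function names, the toy coordinates, and the similarity statistic (a normalized inner product after rotation) are assumptions made for illustration, and reflections (the "flip" operation) and any permutation-based significance test are left out.

```python
import numpy as np

def center(points):
    """Subtract the centroid so that only shape and orientation remain."""
    return points - points.mean(axis=0)

def rotate(points, angle_deg):
    """Rotate 2D row vectors counter-clockwise by angle_deg degrees."""
    t = np.radians(angle_deg)
    rot = np.array([[np.cos(t), -np.sin(t)],
                    [np.sin(t),  np.cos(t)]])
    return points @ rot.T

def optimal_rotation_angle(pcs, geo):
    """Least-squares angle (degrees) that rotates centered pcs onto centered geo."""
    x, y = center(pcs), center(geo)
    cross = np.sum(x[:, 0] * y[:, 1] - x[:, 1] * y[:, 0])
    dot = np.sum(x * y)
    return np.degrees(np.arctan2(cross, dot))

def similarity(pcs, geo, angle_deg):
    """Normalized agreement between rotated PCs and geography (1.0 = perfect)."""
    x, y = rotate(center(pcs), angle_deg), center(geo)
    return np.sum(x * y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Hypothetical toy data: "PC coordinates" are the geographic points,
# rotated by -30 degrees and rescaled.
geo = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 3.0], [5.0, 5.0]])
pcs = 0.01 * rotate(geo, -30.0)
angle = optimal_rotation_angle(pcs, geo)
```

Calling `similarity(pcs, geo, user_angle)` with an angle other than the optimum gives a lower value, which is the kind of comparison a user-chosen rotation would need; how GeneOnEarth assesses statistical evidence is not specified in the abstract.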


Subject(s)
Genomics/methods , Phylogeography/methods , Principal Component Analysis , Search Engine , Computer Simulation , HapMap Project , Humans , Models, Biological
2.
J Bioinform Comput Biol ; 11(2): 1250014, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23600811

ABSTRACT

It is already known that power in multimarker transmission/disequilibrium tests may improve with the number of markers, as some associations may require several markers to be captured. However, a mechanism such as haplotype grouping must be used to keep complexity from growing with the number of markers. 2G, a state-of-the-art transmission/disequilibrium test, implements this mechanism to its maximum extent by grouping haplotypes into only two groups, high- and low-risk haplotypes, so that the test has only one degree of freedom regardless of the number of markers. The test checks whether those haplotypes more often transmitted from parents to offspring are truly high-risk haplotypes. In this paper we use haplotype similarity as prior knowledge to classify haplotypes as high- or low-risk, and start with those haplotypes on which the prior will have lower impact, i.e., those with the largest differences between transmission and non-transmission counts. If their counts are very different, the prior knowledge has little effect and haplotypes are classified as low- or high-risk as 2G does. We show a substantial gain in power achieved by this approach, in both simulated and real data sets.
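
The two-group idea can be sketched as follows. This is a simplified illustration, not the 2G implementation and not the similarity-prior method of the paper: haplotypes are labelled high-risk when they are transmitted more often than not, and a McNemar-type statistic with one degree of freedom is computed on the grouped data. The function names and the input format (one `(transmitted, untransmitted)` pair per heterozygous parent) are assumptions. Because the groups here are learned from the same data that are tested, the statistic is biased upward; a real test has to correct for that, which this sketch does not attempt.

```python
from collections import Counter

def classify_haplotypes(pairs):
    """Return the haplotypes transmitted more often than left untransmitted."""
    transmitted = Counter(t for t, _ in pairs)
    untransmitted = Counter(u for _, u in pairs)
    haplotypes = set(transmitted) | set(untransmitted)
    return {h for h in haplotypes if transmitted[h] > untransmitted[h]}

def grouped_tdt(pairs, high_risk):
    """McNemar-type statistic (1 df) on high-risk vs low-risk transmissions."""
    # b: parent transmits a high-risk haplotype and keeps a low-risk one.
    b = sum(1 for t, u in pairs if t in high_risk and u not in high_risk)
    # c: parent transmits a low-risk haplotype and keeps a high-risk one.
    c = sum(1 for t, u in pairs if t not in high_risk and u in high_risk)
    if b + c == 0:
        return 0.0  # no informative parents
    return (b - c) ** 2 / (b + c)

# Hypothetical toy data: 20 heterozygous parents carrying h1 and h2.
pairs = [("h1", "h2")] * 15 + [("h2", "h1")] * 5
high = classify_haplotypes(pairs)
stat = grouped_tdt(pairs, high)
```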


Subject(s)
Algorithms , Genetic Markers , Haplotypes , Linkage Disequilibrium , Bayes Theorem , Computational Biology , Computer Simulation , Genetic Predisposition to Disease , Genome-Wide Association Study/statistics & numerical data , Humans , Models, Genetic , Polymorphism, Single Nucleotide
3.
Hum Genet ; 128(3): 325-44, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20603721

ABSTRACT

Multimarker transmission/disequilibrium tests (TDTs) are powerful association and linkage tests used to perform genome-wide filtering in the search for disease susceptibility loci. In contrast to case/control studies, they have a low rate of false positives due to population stratification and admixture. However, the length of a region found in association with a disease is usually very large because of linkage disequilibrium (LD). Here, we define a multimarker proportional TDT (mTDT(P)) designed to improve locus specificity in complex diseases that has good power compared to the most powerful multimarker TDTs. The test is a simple generalization of a multimarker TDT in which haplotype frequencies are used to weight the effect that each haplotype has on the whole measure. Two concepts underlie the features of the metric: the 'common disease, common variant' hypothesis and the decrease in LD with chromosomal distance. Because of this decrease, the frequency of haplotypes in strong LD with common disease variants decreases with increasing distance from the disease susceptibility locus. Thus, our haplotype proportional test has higher locus specificity than common multimarker TDTs that assume a uniform distribution of haplotype probabilities. Because of the common variant hypothesis, risk haplotypes at a given locus are relatively frequent, and a metric that weights partial results for each haplotype by its frequency will be as powerful as the most powerful multimarker TDTs. Simulations and real data sets demonstrate that the test has good power compared with the best tests but has remarkably higher locus specificity, so that the association rate decreases at a higher rate with distance from a disease susceptibility or disease protective locus.
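
The contrast between uniform and frequency-proportional weighting can be illustrated with a small sketch. The formula below is not the published mTDT(P) statistic, which the abstract does not give; it only shows, under assumed names and toy counts, how weighting each haplotype's transmission component by its estimated frequency lets common haplotypes dominate the combined measure.

```python
def haplotype_components(transmitted, untransmitted):
    """Per-haplotype TDT-style component (t - u)^2 / (t + u)."""
    components = {}
    for h in transmitted:
        t, u = transmitted[h], untransmitted[h]
        components[h] = (t - u) ** 2 / (t + u) if t + u > 0 else 0.0
    return components

def uniform_measure(components):
    """Average of the components with equal weight for every haplotype."""
    return sum(components.values()) / len(components)

def proportional_measure(components, transmitted, untransmitted):
    """Average of the components weighted by estimated haplotype frequency."""
    total = sum(transmitted.values()) + sum(untransmitted.values())
    return sum(
        components[h] * (transmitted[h] + untransmitted[h]) / total
        for h in components
    )

# Hypothetical counts: A is common and over-transmitted, B is rare.
transmitted = {"A": 30, "B": 3}
untransmitted = {"A": 10, "B": 1}
comp = haplotype_components(transmitted, untransmitted)
uniform = uniform_measure(comp)
proportional = proportional_measure(comp, transmitted, untransmitted)
```

In this toy case the rare haplotype B contributes half of the uniform measure but less than a tenth of the proportional one.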


Subject(s)
Genome-Wide Association Study/methods , Linkage Disequilibrium , Biostatistics , Databases, Genetic , Female , Genetic Markers , Genetic Predisposition to Disease , Genetics, Population , Genome-Wide Association Study/statistics & numerical data , Haplotypes , Humans , Male , Models, Genetic , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polymorphism, Single Nucleotide
4.
PLoS One ; 4(1): e4137, 2009.
Article in English | MEDLINE | ID: mdl-19125193

ABSTRACT

BACKGROUND: IL-2 receptor (IL2R) alpha is the specific component of the high affinity IL2R system involved in the immune response and in the control of autoimmunity. METHODS AND RESULTS: Here we perform a replication and fine mapping of the IL2RA gene region analyzing 3 SNPs previously associated with multiple sclerosis (MS) and 5 SNPs associated with type 1 diabetes (T1D) in a collection of 798 MS patients and 927 matched Caucasian controls from the south of Spain. We observed association with MS in 6 of 8 SNPs. The association of rs1570538, at the 3'-UTR end of the gene, previously reported to be weakly associated with MS, is replicated here (P = 0.032). The most associated T1D SNP (rs41295061) was not associated with MS in the present study. However, rs35285258, belonging to another independent group of SNPs associated with T1D, showed the maximal association in this study but with a different risk allele. We replicated the association of only one (rs2104286) of the two IL2RA SNPs identified in the recently performed genome-wide association study of MS. CONCLUSIONS: These findings confirm and extend the association of this gene with MS and reveal a genetic heterogeneity of the associated polymorphisms and risk alleles between MS and T1D, suggesting different immunopathological roles of IL2RA in these two diseases.


Subject(s)
Diabetes Mellitus, Type 1/genetics , Interleukin-2 Receptor alpha Subunit/genetics , Multiple Sclerosis/genetics , Polymorphism, Single Nucleotide , Alleles , Chromosome Mapping , Genetic Predisposition to Disease , Genotype , Humans , Linkage Disequilibrium , Young Adult
5.
J Biomed Inform ; 41(3): 432-41, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18337189

ABSTRACT

Compared with expert systems for specific disease diagnosis, knowledge-based systems to assist decision making in triage usually try to cover a much wider domain but, because of time restrictions, can use only a smaller set of variables, many of them subjective, so accurate models are difficult to build. In this paper, we first study the criteria that most affect the performance of systems for triage assistance. Such criteria include whether principled approaches from machine learning can be used to increase accuracy and robustness and to represent uncertainty, whether data and model integration can be performed, and whether temporal evolution can be modeled to implement retriage or represent medication responses. Following the most important criteria, we explore current systems and identify some missing features that, if added, may lead to more accurate triage systems.


Subject(s)
Artificial Intelligence , Decision Support Systems, Clinical/organization & administration , Software Design , Software , Triage/methods , Triage/organization & administration
6.
BMC Genet ; 9: 6, 2008 Jan 14.
Article in English | MEDLINE | ID: mdl-18194558

ABSTRACT

BACKGROUND: One of the challenges of the analysis of pooling-based genome-wide association studies is to identify authentic associations among potentially thousands of false positive associations. RESULTS: We present a hierarchical and modular approach to the analysis of genome-wide genotype data that incorporates quality control, linkage disequilibrium, physical distance and gene ontology to identify authentic associations among those found by statistical association tests. The method is developed for the allelic association analysis of pooled DNA samples, but it can be easily generalized to the analysis of individually genotyped samples. We evaluate the approach using data sets from diverse genome-wide association studies, including fetal hemoglobin levels in sickle cell anemia and a sample of centenarians, and show that the approach is highly reproducible and allows for discovery at different levels of synthesis. CONCLUSION: Results from the integration of Bayesian tests and other machine learning techniques with linkage disequilibrium data suggest that we do not need to use overly stringent thresholds to reduce the number of false positive associations. This method yields increased power even with relatively small samples. In fact, our evaluation shows that the method can reach almost 70% sensitivity with samples of only 100 subjects.


Subject(s)
DNA/genetics , Genome, Human , Genotype , Bayes Theorem , Computational Biology , Fetal Hemoglobin/genetics , Gene Frequency , Genetic Markers , Humans , Linkage Disequilibrium , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , Reproducibility of Results , Sensitivity and Specificity
7.
Blood ; 110(7): 2727-35, 2007 Oct 01.
Article in English | MEDLINE | ID: mdl-17600133

ABSTRACT

Modeling the complexity of sickle cell disease pathophysiology and severity is difficult. Using data from 3380 patients accounting for all common genotypes of sickle cell disease, Bayesian network modeling of 25 clinical events and laboratory tests was used to estimate sickle cell disease severity, which was represented as a score predicting the risk of death within 5 years. The reliability of the model was supported by analysis of 2 independent patient groups. In 1 group, the severity score was related to disease severity based on the opinion of expert clinicians. In the other group, the severity score was related to the presence and severity of pulmonary hypertension and the risk of death. Along with previously known risk factors for mortality, like renal insufficiency and leukocytosis, the network identified laboratory markers of the severity of hemolytic anemia and its associated clinical events as contributing risk factors. This model can be used to compute a personalized disease severity score allowing therapeutic decisions to be made according to the prognosis. The severity score could serve as an estimate of overall disease severity in genotype-phenotype association studies, and the model provides an additional method to study the complex pathophysiology of sickle cell disease.


Subject(s)
Anemia, Sickle Cell/epidemiology , Anemia, Sickle Cell/pathology , Models, Biological , Adolescent , Adult , Anemia, Sickle Cell/classification , Blood Platelets/cytology , Child , Child, Preschool , Humans , Leukocyte Count , Leukocytes , Platelet Count , Risk Factors
8.
BMC Genet ; 8: 36, 2007 Jun 25.
Article in English | MEDLINE | ID: mdl-17592642

ABSTRACT

BACKGROUND: The maximum likelihood estimator of D', a standard measure of linkage disequilibrium, is biased toward disequilibrium, and the bias is particularly evident in small samples and rare haplotypes. RESULTS: This paper proposes a Bayesian estimation of D' to address this problem. The reduction of the bias is achieved by using a prior distribution on the pair-wise associations between single nucleotide polymorphisms (SNPs) that increases the likelihood of equilibrium with increasing physical distances between pairs of SNPs. We show how to compute the Bayesian estimate using a stochastic estimation based on MCMC methods, and also propose a numerical approximation to the Bayesian estimates that can be used to estimate patterns of LD in large datasets of SNPs. CONCLUSION: Our Bayesian estimator of D' corrects the bias toward disequilibrium that affects the maximum likelihood estimator. A consequence of this feature is a more objective view about the extent of linkage disequilibrium in the human genome, and a more realistic number of tagging SNPs to fully exploit the power of genome-wide association studies.
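
The bias described above is easy to reproduce with the standard definition of D'. The sketch below computes the plug-in (maximum likelihood) estimate from four haplotype counts and, for comparison, a shrunken estimate obtained from posterior-mean haplotype frequencies under a Dirichlet prior centered on linkage equilibrium. The Dirichlet prior, the `prior_strength` parameter, and the function names are assumptions for illustration; the paper's prior depends on physical distance and is estimated by MCMC or a numerical approximation, neither of which is reproduced here.

```python
def d_prime(p_ab, p_a, p_b):
    """|D'| from the AB haplotype frequency and the allele frequencies of A and B."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max if d_max > 0 else 0.0

def d_prime_ml(counts):
    """Plug-in estimate from haplotype counts ordered (AB, Ab, aB, ab)."""
    n_ab, n_Ab, n_aB, _ = counts
    n = sum(counts)
    return d_prime(n_ab / n, (n_ab + n_Ab) / n, (n_ab + n_aB) / n)

def d_prime_shrunk(counts, prior_strength):
    """D' of posterior-mean frequencies under an equilibrium-centered Dirichlet prior.

    prior_strength is a hypothetical pseudo-count total; a distance-dependent
    prior would make it grow with the physical distance between the two SNPs.
    """
    n_ab, n_Ab, n_aB, _ = counts
    n = sum(counts)
    p_a, p_b = (n_ab + n_Ab) / n, (n_ab + n_aB) / n
    # Posterior mean of the AB frequency: observed count plus equilibrium pseudo-count.
    p_ab = (n_ab + prior_strength * p_a * p_b) / (n + prior_strength)
    return d_prime(p_ab, p_a, p_b)

# Hypothetical small sample in which the Ab haplotype is never observed.
counts = (10, 0, 5, 5)
ml = d_prime_ml(counts)
shrunk = d_prime_shrunk(counts, prior_strength=4.0)
```

With one haplotype unobserved, the plug-in estimate reports complete disequilibrium regardless of how small the sample is, while the shrunken estimate moves toward equilibrium by an amount set by the prior strength.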


Subject(s)
Bayes Theorem , Linkage Disequilibrium , Haplotypes , Human Genome Project , Humans , Likelihood Functions , Models, Genetic , Polymorphism, Single Nucleotide
9.
Bioinformatics ; 22(16): 1933-4, 2006 Aug 15.
Article in English | MEDLINE | ID: mdl-16782726

ABSTRACT

SUMMARY: BMapBuilder builds maps of pairwise linkage disequilibrium (LD) in either two or three dimensions. The optimized resolution allows for graphical display of LD for single nucleotide polymorphisms (SNPs) in a whole chromosome. AVAILABILITY: The program is coded in Java, which runs on all relevant operating systems, including Windows, Mac and Unix/Linux, and is available from http://bios.ugr.es/BMapBuilder.
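
The data behind a pairwise LD map is a SNP-by-SNP matrix of an LD measure. The sketch below computes such a matrix with r² from phased haplotypes coded 0/1. It is written in Python with NumPy purely for illustration and is unrelated to BMapBuilder's Java code; the choice of r² as the measure, the function name, and the toy haplotypes are assumptions, and the graphical display is omitted.

```python
import numpy as np

def pairwise_r2(haplotypes):
    """r^2 between all SNP pairs; rows are haplotypes, columns are SNPs (0/1)."""
    h = np.asarray(haplotypes, dtype=float)
    p = h.mean(axis=0)                 # frequency of allele 1 at each SNP
    p_joint = (h.T @ h) / h.shape[0]   # frequency of the 1-1 haplotype per pair
    d = p_joint - np.outer(p, p)       # disequilibrium coefficient D
    denom = np.outer(p * (1 - p), p * (1 - p))
    with np.errstate(divide="ignore", invalid="ignore"):
        # Monomorphic SNPs have no defined r^2 and are reported as NaN.
        return np.where(denom > 0, d ** 2 / denom, np.nan)

# Hypothetical toy data: 4 haplotypes, 3 SNPs; SNPs 0 and 1 are identical.
haps = [[0, 0, 1],
        [0, 0, 1],
        [1, 1, 0],
        [1, 1, 1]]
r2 = pairwise_r2(haps)
```

The resulting symmetric matrix could then be passed to any heat-map routine to obtain a two-dimensional display.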


Subject(s)
Chromosome Mapping/methods , Computational Biology/methods , Linkage Disequilibrium , Humans , Polymorphism, Single Nucleotide , Programming Languages , Software