1.
Genet Epidemiol ; 31 Suppl 1: S43-50, 2007.
Article in English | MEDLINE | ID: mdl-18046764

ABSTRACT

The complexity of data available in human genetics continues to grow at an explosive rate. With that growth, the challenges to understanding the meaning of the underlying information also grow. A currently popular approach to dissecting such information falls under the broad category of data mining. This can apply to any approach that tries to extract relevant information from large amounts of data, but often refers to methods that deal, in a non-linear fashion, with very large numbers of variables that cannot be simultaneously handled by more conventional statistical methods. To explore the usefulness of some of these approaches, 13 groups applied a variety of strategies to the first dataset provided to GAW 15 participants. With the extensive microarray and SNP data provided for 14 CEPH families, these groups explored multistage analyses, machine learning methods, network construction, and other techniques to try to answer questions about gene-gene interaction, functional similarities, co-regulated gene expression and the mapping of gene expression determinants, among others. In general, the methods offered strategies to provide a better understanding of the complex pathways involved in gene expression and function. These are still "works in progress," often exploratory in nature, but they provide insights into ways in which the data might be interpreted. Despite the still preliminary nature of some of these methods and the diversity of the approaches, some common themes emerged. The collection of papers and methods offers a starting point for further exploration of complex interactions in human genetic data now readily available.


Subjects
DNA/genetics , Medical Genetics , RNA/genetics , Bayes Theorem , Genetic Linkage , Genomic Imprinting , Genotype , Humans , Oligonucleotide Array Sequence Analysis , Single Nucleotide Polymorphism
3.
Genet Epidemiol ; 29 Suppl 1: S103-9, 2005.
Article in English | MEDLINE | ID: mdl-16342179

ABSTRACT

Group 14 used data-mining strategies to evaluate a number of issues, including appropriate diagnosis, haplotype estimation, genetic linkage and association studies, and type I error. Methods ranged from exploratory analyses, to machine learning strategies (neural networks, supervised learning, and tree-based methods), to false discovery rate control of type I errors. The general motivations were to find the "story" in the data and to summarize information from a multitude of measures. Several methods illustrated strategies for better trait definition, using summarization of related traits. In the few studies that sought to identify genes for alcoholism, there was little agreement among the different strategies, likely reflecting the complexities of the disease. Nevertheless, Group 14 found that these methods offered strategies to gain a better understanding of the complex pathways by which disease develops.
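The abstract mentions false discovery rate control of type I errors without naming a procedure. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg`, the level `q`, and the example p-values are hypothetical and are not taken from the Group 14 papers.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Assumed procedure (not confirmed by the abstract): Benjamini-Hochberg.
    Sort the m p-values, find the largest rank k with p_(k) <= q * k / m,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank  # keep the largest rank that satisfies the bound
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Hypothetical p-values from ten association tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example, q=0.05))
```

Because the procedure is step-up, a p-value that misses its own threshold is still rejected when a larger p-value meets its threshold, which is why the loop records the largest qualifying rank instead of stopping at the first failure.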


Subjects
Alcoholism/genetics , Artificial Intelligence , Cytogenetic Analysis/methods , Neural Networks (Computer) , Alcoholism/diagnosis , Chromosome Mapping/methods , Decision Trees , Genetic Markers/genetics , Haplotypes , Humans
4.
BMC Genet ; 6 Suppl 1: S131, 2005 Dec 30.
Article in English | MEDLINE | ID: mdl-16451590

ABSTRACT

BACKGROUND: Alcoholism is a serious public health problem. It has both genetic and environmental causes. In an effort to gain understanding of the underlying genetic susceptibility to alcoholism, a long-term study has been undertaken. The Collaborative Study on the Genetics of Alcoholism (COGA) provides a rich source of genetic and phenotypic data. One ongoing problem is the difficulty of reliably diagnosing alcoholism, despite many known risk factors and measurements. We have applied a well-known pattern-matching method, neural network analysis, to phenotypic data provided to participants in Genetic Analysis Workshop 14 by COGA. The aim is to train the network to recognize complex phenotypic patterns that are characteristic of those with alcoholism as well as those who are free of symptoms. Our results indicate that this approach may be helpful in the diagnosis of alcoholism. RESULTS: Training and testing of input/output pairs of risk factors by means of a "feed-forward back-propagation" neural network resulted in reliability of about 94% in predicting the presence or absence of alcoholism based on 36 input phenotypic risk factors. Pruning the neural network to remove relatively uninformative factors resulted in a reduced network of 14 input factors that was still 95% reliable. Some of the factors selected by the pruning steps have been identified as traits that show either linkage or association to potential candidate regions. CONCLUSION: The complex, multivariate picture formed by known risk factors for alcoholism can be incorporated into a neural network analysis that reliably predicts the presence or absence of alcoholism about 94-95% of the time. Several characteristics that were identified by a pruned neural network have previously been shown to be important in this disease based on more traditional linkage and association studies. Neural networks therefore provide one less traditional approach to both identifying alcoholic individuals and determining the most informative risk factors.
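The workflow the abstract describes (train a feed-forward network by back-propagation on 36 inputs, then prune to 14 inputs and retrain) might be sketched roughly as follows. Everything here is an assumption made for illustration: the data are synthetic and are not the COGA data, the single hidden layer of 8 units, the learning rate, and the weight-magnitude pruning criterion are hypothetical choices, and the abstract does not say how the authors' network was configured or pruned.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_network(X, y, n_hidden=8, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer feed-forward network by back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=n_hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)   # forward pass: hidden activations
        p = sigmoid(H @ W2 + b2)   # forward pass: predicted probability
        # Backward pass for mean cross-entropy loss.
        d_out = (p - y) / n
        d_hid = np.outer(d_out, W2) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return W1, b1, W2, b2


def predict(params, X):
    W1, b1, W2, b2 = params
    return (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) >= 0.5).astype(int)


def prune_inputs(params, keep):
    """Assumed criterion: keep the inputs with the largest total |weight|."""
    saliency = np.abs(params[0]).sum(axis=1)
    return np.sort(np.argsort(saliency)[-keep:])


# Synthetic stand-in data: 36 inputs, of which only the first 5 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 36))
y = (X[:, :5].sum(axis=1) > 0).astype(float)

full = train_network(X, y)
kept = prune_inputs(full, keep=14)
reduced = train_network(X[:, kept], y)

# These are training-set accuracies; a real evaluation needs held-out data.
print("accuracy with 36 inputs:", (predict(full, X) == y).mean())
print("accuracy with 14 inputs:", (predict(reduced, X[:, kept]) == y).mean())
print("inputs kept:", kept.tolist())
```

Retraining on the reduced input set, as done here, is one simple way to check whether the discarded inputs were needed; other pruning schemes remove individual weights or hidden units instead.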


Subjects
Alcoholism/diagnosis , Genetic Predisposition to Disease , Neural Networks (Computer) , Alcohol Drinking/genetics , Humans , Phenotype , Risk Factors
5.
Genet Epidemiol ; 25 Suppl 1: S57-63, 2003.
Article in English | MEDLINE | ID: mdl-14635170

ABSTRACT

The Framingham Heart Study data, as well as a related simulated data set, were generously provided to the participants of the Genetic Analysis Workshop 13 in order that newly developed and emerging statistical methodologies could be tested on that well-characterized data set. The impetus driving the development of novel methods is to elucidate the contributions of genes, environment, and interactions between and among them, as well as to allow comparison between and validation of methods. The seven papers that comprise this group used data-mining methodologies (tree-based methods, neural networks, discriminant analysis, and Bayesian variable selection) in an attempt to identify the underlying genetics of cardiovascular disease and related traits in the presence of environmental and genetic covariates. Data-mining strategies are gaining popularity because they are extremely flexible and may have greater efficiency and potential in identifying the factors involved in complex disorders. While the methods grouped together here constitute a diverse collection, some papers asked similar questions with very different methods, while others used the same underlying methodology to ask very different questions. This paper briefly describes the data-mining methodologies applied to the Genetic Analysis Workshop 13 data sets and the results of those investigations.


Subjects
Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics , Genetic Linkage , Mathematical Computing , Genetic Models , Statistical Models , Bayes Theorem , Discriminant Analysis , Genetic Predisposition to Disease , Humans , Metabolic Syndrome/epidemiology , Metabolic Syndrome/genetics , Neural Networks (Computer) , Risk Factors
6.
BMC Genet ; 4 Suppl 1: S67, 2003 Dec 31.
Article in English | MEDLINE | ID: mdl-14975135

ABSTRACT

BACKGROUND: The Framingham Heart Study was initiated in 1948 as a long-term longitudinal study to identify risk factors associated with cardiovascular disease (CVD). Over the years the scope of the study has expanded to include offspring and other family members of the original cohort, marker data useful for gene mapping and information on other diseases. As a result, it is a rich resource for many areas of research going beyond the original goals. As part of the Genetic Analysis Workshop 13, we used data from the study to evaluate the ability of neural networks to use CVD risk factors as training data for predictions of normal and high blood pressure. RESULTS: Applying two different strategies to the coding of CVD risk data as risk factors (one longitudinal and one independent of time), we found that neural networks could not be trained to clearly separate individuals into normal and high blood pressure groups. When training was successful, validation was not, suggesting over-fitting of the model. When the number of parameters was reduced, training was not as good. An analysis of the input data showed that the neural networks were, in fact, finding consistent patterns, but that these patterns were not correlated with the presence or absence of high blood pressure. CONCLUSION: Neural network analysis, applied to risk factors for CVD in the Framingham data, did not lead to a clear classification of individuals into groups with normal and high blood pressure. Thus, although high blood pressure may itself be a risk factor for CVD, it does not appear to be clearly predictable using observations from a set of other CVD risk factors.


Subjects
Coronary Artery Disease/epidemiology , Coronary Artery Disease/genetics , Hypertension/diagnosis , Neural Networks (Computer) , Adult , Cohort Studies , Coronary Artery Disease/complications , Female , Humans , Hypertension/epidemiology , Hypertension/genetics , Inheritance Patterns/genetics , Longitudinal Studies , Male , Middle Aged , Statistical Models , Predictive Value of Tests , Risk Factors , Time