Results 1 - 16 of 16
1.
Eur J Hum Genet ; 24(4): 521-8, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26306643

ABSTRACT

A wealth of biospecimen samples is stored in modern, globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is the ability to search and access, in an integrated manner, phenotypic, clinical and other information about samples currently stored at biobanks. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: first, we created harmonised variables; then we annotated, and made searchable, information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values, and we show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase.


Subjects
Biological Specimen Banks , Databases, Factual , Information Storage and Retrieval/methods , Information Storage and Retrieval/ethics , Information Storage and Retrieval/standards , Privacy
2.
Nat Genet ; 44(10): 1084-9, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22941192

ABSTRACT

Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many expression quantitative trait locus (eQTL) studies, typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis effect on expression cannot be accounted for by common cis variants, a finding that reveals the contribution of low-frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene, and we identify several replicating trans variants that act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.


Subjects
Chromosome Mapping , Gene Expression Regulation , Transcription, Genetic , Adult , Aged , Aged, 80 and over , Female , Gene-Environment Interaction , Genetic Linkage , Humans , Lymphocytes/metabolism , Middle Aged , Models, Genetic , Organ Specificity , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Skin/metabolism , Subcutaneous Fat/metabolism
3.
Stud Health Technol Inform ; 180: 569-73, 2012.
Article in English | MEDLINE | ID: mdl-22874255

ABSTRACT

World-wide availability of biobank samples is a great desideratum for biomedical researchers. We describe the use case of biobank information retrieval that requires the semantic descriptions of biobank samples and of clinical information. In addition we sketch the foundations of an ontology for biobanks, as a basis on which distributed biobank indexing and retrieval systems can be built. We advocate that a detailed and robust representation of this kind of information improves and allows complex queries that will certainly arise to explore the full potential of biobanks.


Subjects
Biological Specimen Banks/organization & administration , Database Management Systems , Databases, Factual , Information Storage and Retrieval/methods , Semantics , User-Computer Interface , Europe
4.
PLoS Genet ; 7(9): e1002270, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21931564

ABSTRACT

We have performed a metabolite quantitative trait locus (mQTL) study of the ¹H nuclear magnetic resonance spectroscopy (¹H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprising female twins donating samples longitudinally. Sample metabolite concentrations were quantified by ¹H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10⁻¹¹

Subjects
Genome-Wide Association Study , Metabolic Networks and Pathways/genetics , Metabolome/genetics , Quantitative Trait Loci/genetics , Selection, Genetic , Acetyltransferases/genetics , Acetyltransferases/metabolism , Dimethylamines/blood , Dimethylamines/metabolism , Female , Haplotypes , Humans , Isobutyrates/metabolism , Isobutyrates/urine , Magnetic Resonance Spectroscopy , Methylamines/metabolism , Methylamines/urine , Polymorphism, Single Nucleotide
5.
Mol Syst Biol ; 7: 525, 2011 Aug 30.
Article in English | MEDLINE | ID: mdl-21878913

ABSTRACT

¹H Nuclear Magnetic Resonance spectroscopy (¹H NMR) is increasingly used to measure metabolite concentrations in sets of biological samples for top-down systems biology and molecular epidemiology. For such purposes, knowledge of the sources of human variation in metabolite concentrations is valuable, but currently sparse. We conducted and analysed a study to create such a resource. In our unique design, identical and non-identical twin pairs donated plasma and urine samples longitudinally. We acquired ¹H NMR spectra on the samples, and statistically decomposed variation in metabolite concentration into familial (genetic and common-environmental), individual-environmental, and longitudinally unstable components. We estimate that stable variation, comprising familial and individual-environmental factors, accounts on average for 60% (plasma) and 47% (urine) of biological variation in ¹H NMR-detectable metabolite concentrations. Clinically predictive metabolic variation is likely nested within this stable component, so our results have implications for the effective design of biomarker-discovery studies. We provide a power-calculation method which reveals that sample sizes of a few thousand should offer sufficient statistical precision to detect ¹H NMR-based biomarkers quantifying predisposition to disease.


Subjects
Biomarkers , Gene-Environment Interaction , Metabolome/genetics , Nuclear Magnetic Resonance, Biomolecular/methods , Systems Biology/methods , White People/genetics , Aged , Algorithms , Biomarkers/blood , Biomarkers/urine , Databases, Genetic , Female , Genetic Variation , Humans , Middle Aged , Models, Statistical , Research Design , Sample Size , Twins, Dizygotic/genetics , Twins, Monozygotic/genetics
6.
Bioinformatics ; 27(4): 589-91, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21169373

ABSTRACT

SUMMARY: The Sample avAILability system (SAIL) is a web-based application for searching, browsing and annotating biological sample collections or biobank entries. By providing individual-level information on the availability of specific data types (phenotypes, genetic or genomic data) and samples within a collection, rather than the actual measurement data, it facilitates resource integration. A flexible data structure enables collection owners to provide descriptive information on their samples using existing or custom vocabularies. Users can query for available samples by various parameters, combining them via logical expressions. The system can be scaled to hold data from millions of samples with thousands of variables. AVAILABILITY: SAIL is available under the Affero GPL open source license: https://github.com/sail.
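
The idea of querying availability flags rather than measured values can be illustrated with a minimal sketch. This is not SAIL's code or data model: the record layout, the helper names (`has`, `all_of`, `any_of`, `negate`, `query`) and the variable names are all hypothetical, chosen only to show how per-sample availability can be combined with logical expressions without exposing any measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical availability record: stores which variables exist, never their values."""
    sample_id: str
    collection: str
    available: frozenset


def has(variable):
    """Predicate: data for `variable` are available for the sample."""
    return lambda rec: variable in rec.available


def all_of(*preds):
    """Logical AND over predicates."""
    return lambda rec: all(p(rec) for p in preds)


def any_of(*preds):
    """Logical OR over predicates."""
    return lambda rec: any(p(rec) for p in preds)


def negate(pred):
    """Logical NOT of a predicate."""
    return lambda rec: not pred(rec)


def query(records, pred):
    """Return the IDs of samples whose availability satisfies the predicate."""
    return [r.sample_id for r in records if pred(r)]


# Illustrative records only; cohort and variable names are invented.
RECORDS = [
    SampleRecord("s1", "cohort_a", frozenset({"bmi", "fasting_glucose", "gwas_genotypes"})),
    SampleRecord("s2", "cohort_a", frozenset({"bmi"})),
    SampleRecord("s3", "cohort_b", frozenset({"fasting_glucose", "gwas_genotypes"})),
    SampleRecord("s4", "cohort_b", frozenset({"bmi", "gwas_genotypes"})),
]

# Genotyped samples that also have BMI or fasting glucose recorded.
wanted = all_of(has("gwas_genotypes"), any_of(has("bmi"), has("fasting_glucose")))
matches = query(RECORDS, wanted)
```

Because each record holds only set membership, a query of this kind can answer "how many samples could I use?" during project planning without any phenotype value leaving the source collection.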


Subjects
Computational Biology/methods , Database Management Systems , Genomics/methods , Phenotype , Software , Internet , Meta-Analysis as Topic
7.
Nucleic Acids Res ; 38(Web Server issue): W78-83, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20519200

ABSTRACT

R spider is a web-based tool for the analysis of a gene list using the systematic knowledge of core pathways and reactions in human biology accumulated in the Reactome and KEGG databases. R spider implements a network-based statistical framework, which provides a global understanding of gene relations in the supplied gene list, and fully exploits the Reactome and KEGG knowledge bases. R spider provides a user-friendly dialog-driven web interface for several model organisms and supports most available gene identifiers. R spider is freely available at http://mips.helmholtz-muenchen.de/proj/rspider.


Subjects
Databases, Genetic , Gene Regulatory Networks , Metabolic Networks and Pathways/genetics , Protein Interaction Mapping , Signal Transduction/genetics , Software , Computer Graphics , Databases, Protein , Humans , Internet , Proteins/genetics , Systems Integration , User-Computer Interface
8.
Bioinformatics ; 25(20): 2768-9, 2009 Oct 15.
Article in English | MEDLINE | ID: mdl-19633095

ABSTRACT

UNLABELLED: SIMBioMS is a web-based open source software system for managing data and information in biomedical studies. It provides a solution for the collection, storage, management and retrieval of information about research subjects and biomedical samples, as well as experimental data obtained using a range of high-throughput technologies, including gene expression, genotyping, proteomics and metabonomics. The system can easily be customized and has proven to be successful in several large-scale multi-site collaborative projects. It is compatible with emerging functional genomics data standards and provides data import and export in accepted standard formats. Protocols for transferring data to durable archives at the European Bioinformatics Institute have been implemented. AVAILABILITY: The source code, documentation and initialization scripts are available at http://simbioms.org.


Subjects
Computational Biology/methods , Database Management Systems , Information Management/methods , Information Storage and Retrieval/methods , Software , Databases, Factual
9.
Nucleic Acids Res ; 37(Database issue): D868-72, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19015125

ABSTRACT

ArrayExpress (http://www.ebi.ac.uk/arrayexpress) consists of three components: the ArrayExpress Repository, a public archive of functional genomics experiments and supporting data; the ArrayExpress Warehouse, a database of gene expression profiles and other bio-measurements; and the ArrayExpress Atlas, a new summary database and meta-analytical tool of ranked gene expression across multiple experiments and different biological conditions. The Repository contains data from over 6000 experiments comprising approximately 200,000 assays, and the database doubles in size every 15 months. The majority of the data are array based, but other data types are included, most recently ultra-high-throughput sequencing transcriptomics and epigenetic data. The Warehouse and Atlas allow users to query for differentially expressed genes by gene names and properties, experimental conditions and sample properties, or a combination of both. In this update, we describe the ArrayExpress developments over the last two years.


Subjects
Databases, Genetic , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Genomics
10.
Nat Genet ; 41(1): 77-81, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19060907

ABSTRACT

To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06-0.08) mmol/l in fasting glucose levels (P = 3.2 × 10⁻⁵⁰) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 × 10⁻¹⁵). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05-1.12), per G allele P = 3.3 × 10⁻⁷) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 × 10⁻⁵⁷) and GCK (rs4607517, P = 1.0 × 10⁻²⁵) loci.


Subjects
Blood Glucose/genetics , Fasting/blood , Polymorphism, Single Nucleotide/genetics , Receptor, Melatonin, MT2/genetics , Receptors, Melatonin/genetics , Case-Control Studies , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/physiopathology , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Meta-Analysis as Topic , Quantitative Trait Loci/genetics
11.
BMC Bioinformatics ; 8: 52, 2007 Feb 09.
Article in English | MEDLINE | ID: mdl-17291344

ABSTRACT

BACKGROUND: One of the crucial aspects of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and biomedical samples. An efficient link between sample data and experiment results is absolutely imperative for a successful outcome of a biomedical study. Currently available software solutions are largely limited to large-scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such a LIMS can indeed bring laboratory information management to a higher level, but often implies a substantial investment of time, effort and funds, which are not always available. There is a clear need for lightweight open source systems for patient and sample information management. RESULTS: We present a web-based tool for submission, management and retrieval of sample and research subject data. The system secures confidentiality by separating anonymized sample information from individuals' records. It is simple and generic, and can be customised for various biomedical studies. Information can be both entered and accessed using the same web interface. User groups and their privileges can be defined. The system is open-source and is supplied with an on-line tutorial and necessary documentation. It has proven to be successful in a large international collaborative project. CONCLUSION: The presented system closes the gap between the need and the availability of lightweight software solutions for managing information in biomedical studies involving human research subjects.


Subjects
Database Management Systems , Databases, Factual , Information Storage and Retrieval/methods , Medical Records Systems, Computerized , Software , User-Computer Interface , Artificial Intelligence , Biomedical Engineering/methods , Biomedical Research/methods , Clinical Trials as Topic/methods , Programming Languages
12.
Nat Rev Genet ; 7(8): 593-605, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16847461

ABSTRACT

High-throughput technologies are generating large amounts of complex data that have to be stored in databases, communicated to various data analysis tools and interpreted by scientists. Data representation and communication standards are needed to implement these steps efficiently. Here we give a classification of various standards related to systems biology and discuss various aspects of standardization in life sciences in general. Why are some standards more successful than others, what are the prerequisites for a standard to succeed and what are the possible pitfalls?


Subjects
Systems Biology/standards , Animals , Humans , Systems Biology/methods , Systems Biology/trends , United Kingdom
13.
Nucleic Acids Res ; 33(Database issue): D201-5, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608177

ABSTRACT

InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).


Subjects
Databases, Protein , Proteins/chemistry , Proteins/classification , Sequence Analysis, Protein , Databases, Protein/trends , Humans , Protein Structure, Tertiary , Sequence Alignment , Systems Integration
14.
Genome Biol ; 4(1): R6, 2003.
Article in English | MEDLINE | ID: mdl-12540298

ABSTRACT

BACKGROUND: Microarray experiments are generating datasets that can help in reconstructing gene networks. One of the most important problems in network reconstruction is finding, for each gene in the network, which genes can affect it and how. We use a supervised learning approach to address this question by building decision-tree-related classifiers, which predict gene expression from the expression data of other genes. RESULTS: We present algorithms that work for continuous expression levels and do not require a priori discretization. We apply our method to publicly available data for the budding yeast cell cycle. The obtained classifiers can be presented as simple rules defining gene interrelations. In most cases the extracted rules confirm the existing knowledge about cell-cycle gene expression, while hitherto unknown relationships can be treated as new hypotheses. CONCLUSIONS: All the relations between the considered genes are consistent with the facts reported in the literature. This indicates that the approach presented here is valid and that the resulting rules can be used as elements for building and explaining gene networks.
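
To make the idea of a tree-based predictor on continuous expression levels concrete, here is a minimal sketch of a single split (a regression "stump"). It is a generic textbook construction, not the algorithm from the paper: the function names, the squared-error criterion and the toy gene names and values are all assumptions made for illustration. The threshold is chosen directly from the continuous values, so no a priori discretization is needed.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(predictors, target):
    """Find the (gene, threshold) split that best predicts the target gene.

    predictors: dict mapping gene name -> list of expression values
    target: list of expression values of the gene being predicted
    Returns (error, gene, threshold) minimising the summed squared error
    of the two sides of the split, or None if no split is possible.
    """
    best = None
    for gene, xs in predictors.items():
        levels = sorted(set(xs))
        # Candidate thresholds are midpoints between adjacent observed levels.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [y for x, y in zip(xs, target) if x <= threshold]
            right = [y for x, y in zip(xs, target) if x > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[0]:
                best = (error, gene, threshold)
    return best


# Invented toy data: the target is high when GENE_A is low, and vice versa.
predictors = {
    "GENE_A": [0.1, 0.2, 0.9, 1.0],
    "GENE_B": [0.5, 0.4, 0.6, 0.5],
}
target = [2.0, 2.0, 0.5, 0.5]
error, gene, threshold = best_split(predictors, target)
```

A split of this kind reads directly as a rule ("if GENE_A is above the threshold, the target is low"), which is the sort of simple, inspectable relation the abstract describes; a full tree would apply the same search recursively to each side.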


Subjects
Algorithms , Gene Expression Profiling/methods , Cyclins/genetics , Gene Expression Regulation, Fungal , Reproducibility of Results , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins
15.
Nucleic Acids Res ; 31(1): 315-8, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12520011

ABSTRACT

InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).


Subjects
Databases, Protein , Proteins/chemistry , Animals , Computer Graphics , Protein Processing, Post-Translational , Protein Structure, Tertiary , Proteins/genetics , Proteins/metabolism , Repetitive Sequences, Amino Acid , User-Computer Interface
16.
Brief Bioinform ; 3(3): 225-35, 2002 Sep.
Article in English | MEDLINE | ID: mdl-12230031

ABSTRACT

The exponential increase in the submission of nucleotide sequences to the nucleotide sequence database by genome sequencing centres has resulted in a need for rapid, automatic methods for classification of the resulting protein sequences. There are several signature and sequence cluster-based methods for protein classification, each resource having distinct areas of optimum application owing to the differences in the underlying analysis methods. In recognition of this, InterPro was developed as an integrated documentation resource for protein families, domains and functional sites, to rationalise the complementary efforts of the individual protein signature database projects. The member databases - PRINTS, PROSITE, Pfam, ProDom, SMART and TIGRFAMs - form the InterPro core. Related signatures from each member database are unified into single InterPro entries. Each InterPro entry includes a unique accession number, functional descriptions and literature references, and links are made back to the relevant member database(s). Release 4.0 of InterPro (November 2001) contains 4,691 entries, representing 3,532 families, 1,068 domains, 74 repeats and 15 sites of post-translational modification (PTMs) encoded by different regular expressions, profiles, fingerprints and hidden Markov models (HMMs). Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (2,141,621 InterPro hits from 586,124 SWISS-PROT and TrEMBL protein sequences). The database is freely accessible for text- and sequence-based searches.


Subjects
Computational Biology , Databases, Protein , Proteins , Algorithms , Humans , Information Services , Internet , Proteins/chemistry , Proteins/classification , Software