Results 1 - 7 of 7
1.
Nucleic Acids Res ; 45(W1): W484-W489, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28531339

ABSTRACT

Considerable effort has been devoted to systematically retrieving information on genes and proteins, as well as the relationships between them. Despite the importance of chemical compounds and drugs as a central bio-entity in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologies are currently accessible. Here we present LimTox (Literature Mining for Toxicology), a web-based online biomedical search tool with special focus on adverse hepatobiliary reactions. It integrates a range of text mining, named entity recognition and information extraction components. LimTox relies on machine-learning, rule-based, pattern-based and term lookup strategies. This system processes scientific abstracts, a set of full text articles and medical agency assessment reports. Although the main focus of LimTox is on adverse liver events, it also enables basic searches for other organ level toxicity associations (nephrotoxicity, cardiotoxicity, thyrotoxicity and phospholipidosis). This tool supports specialized search queries for: chemical compounds/drugs, genes (with additional emphasis on key enzymes in drug metabolism, namely P450 cytochromes, CYPs) and biochemical liver markers. The LimTox website is free and open to all users and there is no login requirement. LimTox can be accessed at: http://limtox.bioinfo.cnio.es.
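The abstract lists term lookup as one of the strategies LimTox combines. As a rough illustration only, a dictionary-based lookup might be sketched as below. The lexicon, the function name `find_toxicity_terms`, and the organ labels are assumptions made for this example and are not taken from LimTox; "hepatotoxicity" in particular is an assumed entry, since the abstract speaks only of adverse liver events.

```python
import re

# Hypothetical lexicon: toxicity term -> organ. Not LimTox's actual vocabulary.
ORGAN_TERMS = {
    "hepatotoxicity": "liver",      # assumed entry, not named in the abstract
    "nephrotoxicity": "kidney",
    "cardiotoxicity": "heart",
    "thyrotoxicity": "thyroid",
}


def find_toxicity_terms(text, lexicon=ORGAN_TERMS):
    """Return (matched text, organ, character offset) for each lexicon hit."""
    # Longest terms first so a longer entry wins over any shorter prefix.
    alternatives = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return [
        (match.group(0), lexicon[match.group(0).lower()], match.start())
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sentence = "Drug X caused nephrotoxicity and cardiotoxicity in rats."
    for term, organ, offset in find_toxicity_terms(sentence):
        print(f"{offset:3d}  {term}  ->  {organ}")
```

A plain lookup of this kind only finds exact lexicon entries, which is presumably why the abstract mentions machine-learning, rule-based and pattern-based components alongside it.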


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Software , Cytochrome P-450 Enzyme System , Data Mining , Genes , Internet , Liver/drug effects
2.
DNA Res ; 20(1): 93-108, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23297299

ABSTRACT

Olive breeding programmes are focused on selecting for traits such as a short juvenile period, a plant architecture suited to mechanical harvesting, or oil characteristics, including fatty acid composition and phenolic and volatile compounds, to suit new markets. Understanding the molecular basis of these characteristics and improving the efficiency of such breeding programmes require the development of genomic information and tools. However, despite its economic relevance, genomic information on olive or closely related species is still scarce. We have applied Sanger and 454 pyrosequencing technologies to generate close to 2 million reads from 12 cDNA libraries obtained from the Picual, Arbequina, and Lechin de Sevilla cultivars and seedlings from a segregating progeny of a Picual × Arbequina cross. The libraries include fruit mesocarp and seeds at three relevant developmental stages, young stems and leaves, active juvenile and adult buds as well as dormant buds, and juvenile and adult roots. The reads were assembled by library or tissue and then assembled together into 81 020 unigenes with an average size of 496 bases. Here, we report their assembly and their functional annotation.


Subject(s)
Genome, Plant , Molecular Sequence Annotation , Olea/genetics , Transcriptome , Breeding , Databases, Genetic , Expressed Sequence Tags , Fruit/chemistry , Gene Library , Olive Oil , Plant Oils/chemistry , Seeds/genetics , Sequence Analysis, DNA
3.
Hum Mutat ; 32(2): E1999-2017, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21280140

ABSTRACT

The tumor suppressor gene, SMARCA4 (or BRG1), which encodes the ATPase component of the chromatin remodeling complex SWI/SNF, is commonly inactivated by mutations and deletions in lung cancer cell lines. However, SMARCA4 alterations appear to be rare in lung primary tumors. Ultra-deep sequencing technologies provide a promising alternative to achieve a sensitivity superior to that of current sequencing strategies. Here we used ultra-deep pyrosequencing to screen for mutations over the entire SMARCA4 coding region in 12 lung tumors without detectable BRG1 protein. While automatic-fluorescence-based sequencing detected one somatic mutation (p.K586X), the pyrosequencing revealed additional variants, thus increasing the sensitivity. One of the variants, which affected a consensus splice site, was confirmed by individual cloning of PCR products, ruling out the possibility of PCR or pyrosequencing artifacts. This mutation, confirmed to be somatic, was present at a frequency of ten percent, suggesting normal cell contamination in the tumor. Our analysis also allowed us to determine the sensitivity and to identify some limitations of the technology. In conclusion, in addition to cell lines, SMARCA4 is biallelically inactivated in a significant proportion of lung primary tumors, thereby constituting one of the most important genes contributing to the development of this type of cancer.


Subject(s)
DNA Helicases/genetics , High-Throughput Nucleotide Sequencing/methods , Lung Neoplasms/diagnosis , Nuclear Proteins/genetics , Sequence Analysis, DNA/methods , Transcription Factors/genetics , Cell Line, Tumor , Humans , Lung Neoplasms/genetics , Mutation
4.
Nucleic Acids Res ; 36(Web Server issue): W364-7, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18467422

ABSTRACT

Many biological experiments and their subsequent analysis yield lists of genes or proteins that can potentially be important to the prognosis or diagnosis of certain diseases (e.g. cancer). Nowadays, information about the function of those genes or proteins may be already gathered in some databases, but it is essential to understand if some of the members of those lists have a function in common or if they belong to the same metabolic pathway. To help researchers filter those genes or proteins that have such information in common, we have developed PaLS (pathway and literature strainer, http://pals.bioinfo.cnio.es). PaLS takes a list or a set of lists of gene or protein identifiers and shows which ones share certain descriptors. Four publicly available databases have been used for this purpose: PubMed, which links genes with those articles that make reference to them; Gene Ontology, an annotated ontology of terms related to the cellular component, biological process or molecular function where those genes or proteins are involved; KEGG pathways and Reactome pathways. Those descriptors among these four sources of information that are shared by more members of the list (or lists) are highlighted by PaLS.
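The core idea described above, finding descriptors shared by many members of a gene list, can be illustrated with a small sketch. This is not PaLS's implementation: the function name `shared_descriptors`, the `min_fraction` threshold, and the toy annotation table are all assumptions made for the example.

```python
from collections import defaultdict

# Toy annotation table (hypothetical identifiers and descriptors).
TOY_ANNOTATIONS = {
    "geneA": {"GO:term1", "KEGG:pathway1"},
    "geneB": {"GO:term1", "Reactome:pathway2"},
    "geneC": {"GO:term1", "KEGG:pathway1"},
    "geneD": {"PubMed:article1"},
}


def shared_descriptors(annotations, gene_list, min_fraction=0.5):
    """Return (descriptor, genes) pairs shared by at least min_fraction of the list.

    Results are sorted by the number of genes sharing the descriptor
    (descending), then alphabetically by descriptor.
    """
    genes = list(dict.fromkeys(gene_list))  # drop duplicates, keep input order
    members = defaultdict(list)
    for gene in genes:
        for descriptor in set(annotations.get(gene, ())):
            members[descriptor].append(gene)
    threshold = min_fraction * len(genes)
    hits = [(d, g) for d, g in members.items() if len(g) >= threshold]
    hits.sort(key=lambda item: (-len(item[1]), item[0]))
    return hits


if __name__ == "__main__":
    query = ["geneA", "geneB", "geneC", "geneD"]
    for descriptor, genes in shared_descriptors(TOY_ANNOTATIONS, query):
        print(f"{descriptor}: {', '.join(genes)}")
```

In this toy run the descriptor annotated on three of the four genes ranks first, which mirrors the abstract's description of highlighting descriptors shared by more members of the list.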


Subject(s)
Genes/physiology , Proteins/metabolism , Software , Computer Graphics , Databases, Factual , Gene Expression Profiling , Internet , Metabolic Networks and Pathways , PubMed , User-Computer Interface , Vocabulary, Controlled
5.
Nucleic Acids Res ; 35(Web Server issue): W75-80, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17488846

ABSTRACT

Asterias (http://www.asterias.info) is an open-source, web-based suite for the analysis of gene expression and aCGH data. Asterias implements validated statistical methods, and most of the applications use parallel computing, which permits taking advantage of multicore CPUs and computing clusters. Access to, and further analysis of, additional biological information and annotations (PubMed references, Gene Ontology terms, KEGG and Reactome pathways) are available either for individual genes (from clickable links in tables and figures) or sets of genes. These applications cover array normalization, imputation and preprocessing, differential gene expression analysis, class and survival prediction, and aCGH analysis. The source code is available, allowing for extension and reuse of the software. The links and analysis of additional functional information, parallelization of computation and open-source availability of the code make Asterias a unique suite that can exploit features specific to web-based environments.


Subject(s)
Computational Biology/methods , Gene Expression Profiling , Internet , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , Animals , Automation , Genomics , Humans , Programming Languages , Software , User-Computer Interface
6.
BMC Bioinformatics ; 8: 9, 2007 Jan 10.
Article in English | MEDLINE | ID: mdl-17214880

ABSTRACT

BACKGROUND: Researchers involved in the annotation of large numbers of gene, clone or protein identifiers are usually required to perform a one-by-one conversion for each identifier. When the field of research is one such as microarray experiments, this number may be around 30,000. RESULTS: To help researchers map accession numbers and identifiers among clones, genes, proteins and chromosomal positions, we have designed and developed IDconverter and IDClight. They are two user-friendly, freely available web server applications that also provide additional functional information by mapping the identifiers on to pathways, Gene Ontology terms, and literature references. Both tools are high-throughput oriented and include identifiers for the most common genomic databases. These tools have been compared to other similar tools, showing that they are among the fastest and the most up-to-date. CONCLUSION: These tools provide a fast and intuitive way of enriching the information coming out of high-throughput experiments like microarrays. They can be valuable both to wet-lab researchers and to bioinformaticians.
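To make the contrast with one-by-one conversion concrete, a batch mapping step might look like the sketch below. It is an illustration under stated assumptions, not the IDconverter or IDClight code: the function `convert_ids`, the table layout, and every identifier in `TOY_TABLE` are invented for the example.

```python
# Hypothetical mapping table: source identifier -> identifiers in other namespaces.
TOY_TABLE = {
    "CLONE_0001": {"gene": "GENE_A", "protein": "PROT_A"},
    "CLONE_0002": {"gene": "GENE_B"},
    "CLONE_0003": {"gene": "GENE_C", "protein": "PROT_C"},
}


def convert_ids(identifiers, mapping_table, target):
    """Convert a whole list of identifiers to the target namespace in one pass.

    Returns (converted, unmapped): a dict of successful conversions and a
    list of identifiers with no entry for the requested target.
    """
    converted = {}
    unmapped = []
    for identifier in identifiers:
        record = mapping_table.get(identifier)
        if record is None or target not in record:
            unmapped.append(identifier)
        else:
            converted[identifier] = record[target]
    return converted, unmapped


if __name__ == "__main__":
    batch = ["CLONE_0001", "CLONE_0002", "CLONE_9999"]
    mapped, missing = convert_ids(batch, TOY_TABLE, target="protein")
    print("mapped:", mapped)
    print("unmapped:", missing)
```

Reporting the unmapped identifiers separately matters at the scale the abstract mentions (around 30,000 identifiers per experiment), where silent drops would be hard to spot.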


Subject(s)
Database Management Systems , Databases, Protein , Genes , Proteins/chemistry , Proteins/classification , Software , Terminology as Topic , Algorithms , Amino Acid Sequence , Information Storage and Retrieval , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis/methods , Proteins/metabolism
7.
Cancer Inform ; 3: 1-9, 2007 Feb 03.
Article in English | MEDLINE | ID: mdl-19455230

ABSTRACT

The analysis of expression and CGH arrays plays a central role in the study of complex diseases, especially cancer, including finding markers for early diagnosis and prognosis, choosing an optimal therapy, or increasing our understanding of cancer development and metastasis. Asterias (http://www.asterias.info) is an integrated collection of freely-accessible web tools for the analysis of gene expression and aCGH data. Most of the tools use parallel computing (via MPI) and run on a server with 60 CPUs for computation; compared to a desktop or server-based but not parallelized application, parallelization provides speed ups of factors up to 50. Most of our applications allow the user to obtain additional information for user-selected genes (chromosomal location, PubMed ids, Gene Ontology terms, etc.) by using clickable links in tables and/or figures. Our tools include: normalization of expression and aCGH data (DNMAD); converting between different types of gene/clone and protein identifiers (IDconverter/IDClight); filtering and imputation (preP); finding differentially expressed genes related to patient class and survival data (Pomelo II); searching for models of class prediction (Tnasas); using random forests to search for minimal models for class prediction or for large subsets of genes with predictive capacity (GeneSrF); searching for molecular signatures and predictive genes with survival data (SignS); detecting regions of genomic DNA gain or loss (ADaCGH). The capability to send results between different applications, access to additional functional information, and parallelized computation make our suite unique and exploit features only available to web-based applications.
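The figures quoted above (speed-ups of up to 50 on a server with 60 CPUs) can be put in context with two standard quantities. The sketch below assumes, purely for illustration, that the maximum speed-up was obtained using all 60 CPUs and that Amdahl's law is an adequate model; the abstract states neither.

```python
def parallel_efficiency(speedup, n_cpus):
    """Efficiency E = S / p: the fraction of ideal linear speed-up achieved."""
    return speedup / n_cpus


def amdahl_parallel_fraction(speedup, n_cpus):
    """Parallel fraction f implied by Amdahl's law.

    Amdahl's law: S = 1 / ((1 - f) + f / p)
    Solving for f: f = (1 - 1/S) / (1 - 1/p)
    """
    return (1 - 1 / speedup) / (1 - 1 / n_cpus)


if __name__ == "__main__":
    s, p = 50, 60  # values quoted in the abstract; using all 60 CPUs is an assumption
    print(f"efficiency: {parallel_efficiency(s, p):.3f}")
    print(f"implied parallel fraction: {amdahl_parallel_fraction(s, p):.4f}")
```

Under those assumptions the efficiency is about 0.83 and the implied parallel fraction is about 0.997, meaning nearly all of the work in the best case would have to be parallelizable.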
