Results 1 - 13 of 13
3.
Bioinformatics ; 36(6): 1889-1895, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31647529

ABSTRACT

MOTIVATION: There is an increasing amount of transcriptomic and genomic data available for planarians with the advent of both traditional and single-cell RNA sequencing technologies. As a result, exploring, visualizing and making sense of all these data in order to understand planarian regeneration and development can be challenging. RESULTS: In this work, we present PlanExp, a web application to explore and visualize gene expression data from different RNA-seq experiments (both traditional and single-cell RNA-seq) for the planarian Schmidtea mediterranea. PlanExp provides tools for creating different interactive plots, such as heatmaps and scatterplots, and links them with the current sequence annotations both at the genome and the transcript level thanks to its integration with the PlanNET web application. PlanExp also provides a full gene/protein network editor, a prediction of genetic interactions from single-cell RNA-seq data, and a network expression mapper that will help researchers to close the gap between systems biology and planarian regeneration. AVAILABILITY AND IMPLEMENTATION: PlanExp is freely available at https://compgen.bio.ub.edu/PlanNET/planexp. The source code is available at https://compgen.bio.ub.edu/PlanNET/downloads. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Planarians/genetics , Animals , RNA-Seq , Sequence Analysis, RNA , Software , Exome Sequencing
4.
Bioinformatics ; 35(14): 2523-2524, 2019 07 15.
Article in English | MEDLINE | ID: mdl-30500875

ABSTRACT

MOTIVATION: Protein-protein interactions (PPIs) are very important to build models for understanding many biological processes. Although several databases hold many of these interactions, exploring them, selecting those relevant for a given subject and contextualizing them can be a difficult task for researchers. Extracting PPIs directly from the scientific literature can be very helpful for providing such context, as the sentences describing these interactions may give researchers useful insights. RESULTS: We have developed PPaxe, a Python module and a web application that allow users to extract PPIs and protein occurrence from a given set of PubMed and PubMedCentral articles. It presents the results of the analysis in different ways to help researchers export, filter and analyze the results easily. AVAILABILITY AND IMPLEMENTATION: PPaxe web demo is freely available at https://compgen.bio.ub.edu/PPaxe. All the software can be downloaded from https://compgen.bio.ub.edu/PPaxe/download, including a command-line version and Docker containers for an easy installation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Databases, Factual , Proteins , PubMed , Publications
6.
Sci Total Environ ; 618: 870-880, 2018 Mar 15.
Article in English | MEDLINE | ID: mdl-29108696

ABSTRACT

The application of next-generation sequencing (NGS) techniques for the identification of viruses present in urban sewage has not been fully explored. This is partially due to a lack of reliable and sensitive protocols for studying viral diversity and to the highly complex analysis required for NGS data processing. One important step towards this goal is finding methods that can efficiently concentrate viruses from sewage samples. Here the application of a virus concentration method based on skimmed milk organic flocculation (SMF) using 10 L of sewage collected in different seasons enabled the detection of many viruses. However, some viruses, such as human adenoviruses, could not always be detected using metagenomics, even when quantitative PCR (qPCR) assessments were positive. A targeted metagenomic assay for adenoviruses was conducted and 59.41% of the obtained reads were assigned to murine adenoviruses. However, up to 20 different human adenoviruses (HAdV) were detected by this targeted assay, the most abundant being HAdV-41 (29.24%) and HAdV-51 (1.63%). To improve metagenomics' sensitivity, two different protocols for virus concentration were comparatively analysed: an ultracentrifugation protocol and a lower-volume SMF protocol. The sewage virome contained 41 viral families, including pathogenic viral species from families Caliciviridae, Adenoviridae, Astroviridae, Picornaviridae, Polyomaviridae, Papillomaviridae and Hepeviridae. The contribution of urine to sewage metavirome seems to be restricted to a few specific DNA viral families, including the polyomavirus and papillomavirus species. In experimental infections with sewage in a rhesus macaque model, infective human hepatitis E and JC polyomavirus were identified. Urban raw sewage consists of the excreta of thousands of inhabitants; therefore, it is a representative sample for epidemiological surveillance purposes.
The knowledge of the metavirome is of significance to public health, highlighting the presence of viral strains that are circulating within a population while acting as a complex matrix for viral discovery.


Subject(s)
Metagenomics , Public Health Surveillance , Sewage/virology , Viruses/isolation & purification , Animals , Humans , Macaca mulatta , Spain , Viruses/genetics
7.
Bioinformatics ; 34(6): 1016-1023, 2018 03 15.
Article in English | MEDLINE | ID: mdl-29186384

ABSTRACT

Motivation: Planarians are emerging as a model organism to study regeneration in animals. However, the scarcity of available protein-protein interaction data hinders advances in understanding the mechanisms underlying their regenerative capabilities. Results: We have developed a protocol to predict protein-protein interactions using sequence homology data and a reference human interactome. This methodology was applied to 11 Schmidtea mediterranea transcriptomic sequence datasets. Then, using Neo4j as our database manager, we developed PlanNET, a web application to explore the multiplicity of networks and the associated sequence annotations. By mapping RNA-seq expression experiments onto the predicted networks, and allowing a transcript-centric exploration of the planarian interactome, we provide researchers with a useful tool to analyse possible pathways and to design new experiments, as well as a reproducible methodology to predict, store, and explore protein interaction networks for non-model organisms. Availability and implementation: The web application PlanNET is available at https://compgen.bio.ub.edu/PlanNET. The source code used is available at https://compgen.bio.ub.edu/PlanNET/downloads. Contact: jabril@ub.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
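
The abstract above does not spell out the prediction protocol. As an illustration only, one common way to use sequence homology and a reference interactome is to transfer each reference interaction to the transcripts that are homologous to its two partners. The sketch below assumes that reading; the function name, the transcript identifiers and the toy data are hypothetical and are not taken from PlanNET.

```python
from typing import Dict, Iterable, Set, Tuple


def transfer_interactions(
    homologs: Dict[str, str],
    reference_edges: Iterable[Tuple[str, str]],
) -> Set[Tuple[str, str]]:
    """Transfer reference interactions to transcripts through homology.

    homologs maps a transcript id to the reference protein it is
    homologous to (hypothetical input, e.g. from a best-hit search).
    reference_edges lists interacting protein pairs in the reference.
    """
    # Group transcripts by the reference protein they map to.
    by_protein: Dict[str, list] = {}
    for transcript, protein in homologs.items():
        by_protein.setdefault(protein, []).append(transcript)

    predicted: Set[Tuple[str, str]] = set()
    for prot_a, prot_b in reference_edges:
        for t_a in by_protein.get(prot_a, []):
            for t_b in by_protein.get(prot_b, []):
                if t_a != t_b:
                    # Store each undirected pair once, in sorted order.
                    predicted.add(tuple(sorted((t_a, t_b))))
    return predicted


# Toy data, invented for illustration.
toy_homologs = {"tr_1": "TP53", "tr_2": "MDM2", "tr_3": "MDM2"}
toy_reference = [("TP53", "MDM2")]
toy_predicted = transfer_interactions(toy_homologs, toy_reference)
```

With the toy data, the single reference interaction yields two predicted transcript pairs, because two transcripts map to the same reference protein. A real protocol would normally also weigh the strength of each homology hit, which this sketch leaves out.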


Subject(s)
Gene Expression Profiling/methods , Planarians/genetics , Protein Interaction Maps , Regeneration , Software , Animals , Humans , Internet , Planarians/physiology , Sequence Analysis, RNA/methods
8.
PLoS One ; 12(10): e0185911, 2017.
Article in English | MEDLINE | ID: mdl-28982120

ABSTRACT

Hepatitis is a general term meaning inflammation of the liver, which can be caused by a variety of viruses. However, a substantial number of cases remain of unknown aetiology. We analysed the serum of patients with clinical signs of hepatitis using a metagenomics approach to characterize their viral species composition. Four pools of patients with hepatitis without identified aetiological agents were evaluated. Additionally, one pool of patients with hepatitis E (HEV) and pools of healthy volunteers were included as controls. A high diversity of anelloviruses, including novel sequences, was found in pools from patients with hepatitis of unknown aetiology. Moreover, viruses recently associated with gastroenteritis, such as sapovirus GV.2 and astrovirus VA3, were also detected only in those pools. In addition, most of the HEV genome was recovered from the HEV pool. Finally, GB virus C and human endogenous retrovirus were found in the HEV and healthy pools. Our study provides an overview of the virome in serum from hepatitis patients, suggesting a potential role of these viruses not previously described in cases of hepatitis. However, further epidemiologic studies are necessary to confirm their contribution to the development of hepatitis.


Subject(s)
Anelloviridae/isolation & purification , Hepatitis, Viral, Human/virology , Mamastrovirus/isolation & purification , Sapovirus/isolation & purification , Viremia/blood , Acute Disease , Anelloviridae/classification , Case-Control Studies , Hepatitis, Viral, Human/blood , High-Throughput Nucleotide Sequencing , Humans , Mamastrovirus/classification , Phylogeny , Viremia/classification
9.
Int J Food Microbiol ; 257: 80-90, 2017 Sep 18.
Article in English | MEDLINE | ID: mdl-28646670

ABSTRACT

Microbial food-borne diseases are still frequently reported despite the implementation of microbial quality legislation to improve food safety. Among all the microbial agents, viruses are the most important causative agents of food-borne outbreaks. The development and application of a new generation of sequencing techniques to test for viral contaminants in fresh produce is an unexplored field that allows for the study of the viral populations that might be transmitted by the fecal-oral route through the consumption of contaminated food. To advance this promising field, parsley was planted and grown under controlled conditions and irrigated using contaminated river water. Viruses polluting the irrigation water and the parsley leaves were studied by using metagenomics. To address possible contamination due to sample manipulation, library preparation, and other sources, parsley plants irrigated with nutritive solution were used as a negative control. In parallel, viruses present in the river water used for plant irrigation were analyzed using the same methodology. It was possible to assign viral taxa to between 2.4% and 74.88% of the total reads sequenced, depending on the sample. Most of the viral reads detected in the river water were related to the plant viral families Tymoviridae (66.13%) and Virgaviridae (14.45%) and the phage viral families Myoviridae (5.70%), Siphoviridae (5.06%), and Microviridae (2.89%). Less than 1% of the viral reads were related to viral families that infect humans, including members of the Adenoviridae, Reoviridae, Picornaviridae and Astroviridae families. On the surface of the parsley plants, most of the viral reads that were detected were assigned to the Dicistroviridae family (41.52%). Sequences related to important viral pathogens, such as the hepatitis E virus, several picornaviruses from species A and B as well as human sapoviruses and GIV noroviruses were detected.
The high diversity of viral sequences found in the parsley plants suggests that irrigation on fecally-tainted food may have a role in the transmission of a wide diversity of viral families. This finding reinforces the idea that the best way to avoid food-borne viral diseases is to introduce good field irrigation and production practices. New strains have been identified that are related to the Picornaviridae and distantly related to the Hepeviridae family. However, the detection of a viral genome alone does not necessarily indicate there is a risk of infection or disease development. Thus, further investigation is crucial for correlating the detection of viral metagenomes in samples with the risk of infection. There is also an urgent need to develop new methods to improve the sensitivity of current Next Generation Sequencing (NGS) techniques in the food safety area.


Subject(s)
DNA Viruses/classification , DNA Viruses/isolation & purification , Food Contamination/analysis , Foodborne Diseases/virology , Petroselinum/virology , RNA Viruses/classification , RNA Viruses/isolation & purification , Water Pollution/analysis , Disease Outbreaks , Feces/virology , Food/virology , Food Safety , Genome, Viral/genetics , High-Throughput Nucleotide Sequencing , Humans , Metagenome/genetics , Metagenomics , RNA Viruses/genetics , Rivers/virology
10.
Bioinformatics ; 16(8): 743-4, 2000 Aug.
Article in English | MEDLINE | ID: mdl-11099262

ABSTRACT

gff2ps is a program for visualizing annotations of genomic sequences. The program takes the annotated features on a genomic sequence in GFF format as input, and produces a visual output in PostScript. While it can be used in a very simple way, it also allows for a great degree of customization through a number of options and/or customization files.
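
To make the input format concrete, here is a minimal sketch of reading one GFF record. It assumes the classic tab-separated layout (sequence name, source, feature, start, end, score, strand, frame, and an optional trailing group field). The `GffFeature` type, the `parse_gff_line` function and the sample record are hypothetical, and nothing here reflects how gff2ps itself parses its input.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: str
    group: str


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one tab-separated GFF record; skip blanks and comments."""
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated fields")
    seqname, source, feature, start, end, score, strand, frame = fields[:8]
    # The ninth field is optional in this layout.
    group = fields[8] if len(fields) > 8 else ""
    return GffFeature(
        seqname, source, feature,
        int(start), int(end),
        None if score == "." else float(score),  # "." means no score
        strand, frame, group,
    )


# Invented sample record, for illustration only.
sample = "seq1\tgenscan\texon\t100\t250\t0.95\t+\t0\tgene_1"
parsed = parse_gff_line(sample)
```

A drawing tool would typically collect such records per sequence and map the start and end coordinates onto page positions.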


Subject(s)
Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Software , Computational Biology , Genome
11.
Genome Res ; 10(10): 1631-42, 2000 Oct.
Article in English | MEDLINE | ID: mdl-11042160

ABSTRACT

One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. 
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
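
The distinction between sensitivity and specificity in this abstract can be illustrated at the nucleotide level. The sketch below assumes the convention usual in gene-finding evaluations, where sensitivity is TP / (TP + FN) and specificity is TP / (TP + FP), the quantity other fields call precision. The abstract itself does not state the formulas, and the function names and coordinates are invented for illustration.

```python
from typing import Iterable, Set, Tuple

Exon = Tuple[int, int]  # inclusive start and end coordinates


def coding_positions(exons: Iterable[Exon]) -> Set[int]:
    """Expand exon intervals into the set of coding nucleotide positions."""
    positions: Set[int] = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_sn_sp(
    predicted: Iterable[Exon], annotated: Iterable[Exon]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) under the assumed convention."""
    pred = coding_positions(predicted)
    true = coding_positions(annotated)
    tp = len(pred & true)   # coding and predicted as coding
    fn = len(true - pred)   # coding but missed
    fp = len(pred - true)   # predicted as coding but not coding
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tp / (tp + fp) if (tp + fp) else 0.0
    return sn, sp


# Toy case: the real exon is found, plus a spurious exon in intergenic DNA.
toy_annotated = [(1, 100)]
toy_predicted = [(1, 100), (201, 300)]
toy_sn, toy_sp = nucleotide_sn_sp(toy_predicted, toy_annotated)
# toy_sn is 1.0 and toy_sp is 0.5
```

In the toy case every true coding nucleotide is recovered, so sensitivity stays at 1.0, while the spurious exon halves specificity. This is the kind of pattern the abstract describes for ab initio prediction on long sequences with added intergenic regions, although the numbers here are not from the study.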


Subject(s)
Computational Biology/methods , DNA/chemistry , DNA/genetics , Genes/genetics , Base Composition , Chromosomes, Artificial/chemistry , Chromosomes, Artificial/genetics , Humans , Reproducibility of Results , Software
12.
Genome Res ; 10(4): 483-501, 2000 Apr.
Article in English | MEDLINE | ID: mdl-10779488

ABSTRACT

Computational methods for automated genome annotation are critical to our community's ability to make full use of the large volume of genomic sequence being generated and released. To explore the accuracy of these automated feature prediction tools in the genomes of higher organisms, we evaluated their performance on a large, well-characterized sequence contig from the Adh region of Drosophila melanogaster. This experiment, known as the Genome Annotation Assessment Project (GASP), was launched in May 1999. Twelve groups, applying state-of-the-art tools, contributed predictions for features including gene structure, protein homologies, promoter sites, and repeat elements. We evaluated these predictions using two standards, one based on previously unreleased high-quality full-length cDNA sequences and a second based on the set of annotations generated as part of an in-depth study of the region by a group of Drosophila experts. Although these standard sets only approximate the unknown distribution of features in this region, we believe that when taken in context the results of an evaluation based on them are meaningful. The results were presented as a tutorial at the conference on Intelligent Systems in Molecular Biology (ISMB-99) in August 1999. Over 95% of the coding nucleotides in the region were correctly identified by the majority of the gene finders, and the correct intron/exon structures were predicted for >40% of the genes. Homology-based annotation techniques recognized and associated functions with almost half of the genes in the region; the remainder were only identified by the ab initio techniques. This experiment also presents the first assessment of promoter prediction techniques for a significant number of genes in a large contiguous region. We discovered that the promoter predictors' high false-positive rates make their predictions difficult to use. 
Integrating gene finding and cDNA/EST alignments with promoter predictions decreases the number of false-positive classifications but discovers less than one-third of the promoters in the region. We believe that by establishing standards for evaluating genomic annotations and by assessing the performance of existing automated genome annotation tools, this experiment establishes a baseline that contributes to the value of ongoing large-scale annotation projects and should guide further research in genome informatics.


Subject(s)
Computational Biology/methods , Drosophila melanogaster/genetics , Genes, Insect , Genome , Alcohol Dehydrogenase/chemistry , Alcohol Dehydrogenase/genetics , Animals , DNA, Complementary , Databases, Factual/trends , Drosophila melanogaster/enzymology , Expressed Sequence Tags , Promoter Regions, Genetic/genetics , Sequence Homology, Amino Acid
13.
Science ; 287(5461): 2185-95, 2000 Mar 24.
Article in English | MEDLINE | ID: mdl-10731132

ABSTRACT

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.


Subject(s)
Drosophila melanogaster/genetics , Genome , Sequence Analysis, DNA , Animals , Biological Transport/genetics , Chromatin/genetics , Cloning, Molecular , Computational Biology , Contig Mapping , Cytochrome P-450 Enzyme System/genetics , DNA Repair/genetics , DNA Replication/genetics , Drosophila melanogaster/metabolism , Euchromatin , Gene Library , Genes, Insect , Heterochromatin/genetics , Insect Proteins/chemistry , Insect Proteins/genetics , Insect Proteins/physiology , Nuclear Proteins/genetics , Protein Biosynthesis , Transcription, Genetic