Results 1 - 17 of 17
1.
Curr Protoc ; 4(6): e1065, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38857087

ABSTRACT

The European Bioinformatics Institute (EMBL-EBI)'s Job Dispatcher framework provides access to a wide range of core databases and analysis tools that are of key importance in bioinformatics. As well as providing web interfaces to these resources, web services are available using REST and SOAP protocols that enable programmatic access and allow their integration into other applications and analytical workflows and pipelines. This article describes the various options available to researchers and bioinformaticians who would like to use our resources via the web interface employing RESTful web services clients provided in Perl, Python, and Java or who would like to use Docker containers to integrate the resources into analysis pipelines and workflows. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC.

Basic Protocol 1: Retrieving data from EMBL-EBI using Dbfetch via the web interface
Alternate Protocol 1: Retrieving data from EMBL-EBI using WSDbfetch via the REST interface
Alternate Protocol 2: Retrieving data from EMBL-EBI using Dbfetch via RESTful web services with Python client
Support Protocol 1: Installing Python REST web services clients
Basic Protocol 2: Sequence similarity search using FASTA search via the web interface
Alternate Protocol 3: Sequence similarity search using FASTA via RESTful web services with Perl client
Support Protocol 2: Installing Perl REST web services clients
Basic Protocol 3: Sequence similarity search using NCBI BLAST+ RESTful web services with Python client
Basic Protocol 4: Sequence similarity search using HMMER3 phmmer REST web services with Perl client and Docker
Support Protocol 3: Installing Docker and running the EMBL-EBI client container
Basic Protocol 5: Protein functional analysis using InterProScan 5 RESTful web services with the Python client and Docker
Alternate Protocol 4: Protein functional analysis using InterProScan 5 RESTful web services with the Java client
Support Protocol 4: Installing Java web services clients
Basic Protocol 6: Multiple sequence alignment using Clustal Omega via web interface
Alternate Protocol 5: Multiple sequence alignment using Clustal Omega with Perl client and Docker
Support Protocol 5: Exploring the RESTful API with OpenAPI User Interface.


Subject(s)
Internet , Software , Computational Biology/methods , User-Computer Interface
2.
Nucleic Acids Res ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38597606

ABSTRACT

The EMBL-EBI Job Dispatcher sequence analysis tools framework (https://www.ebi.ac.uk/jdispatcher) enables the scientific community to perform a diverse range of sequence analyses using popular bioinformatics applications. Free access to the tools and required sequence datasets is provided through user-friendly web applications, as well as via RESTful and SOAP-based APIs. These are integrated into popular EMBL-EBI resources such as UniProt, InterPro, ENA and Ensembl Genomes. This paper overviews recent improvements to Job Dispatcher, including its brand new website and documentation, enhanced visualisations, improved job management, and a rising trend of user reliance on the service from low- and middle-income regions.

3.
Nucleic Acids Res ; 50(W1): W276-W279, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35412617

ABSTRACT

The EMBL-EBI search and sequence analysis tools frameworks provide integrated access to EMBL-EBI's data resources and core bioinformatics analytical tools. EBI Search (https://www.ebi.ac.uk/ebisearch) provides a full-text search engine across nearly 5 billion entries, while the Job Dispatcher tools framework (https://www.ebi.ac.uk/services) enables the scientific community to perform a diverse range of sequence analysis using popular bioinformatics applications. Both allow users to interact through user-friendly web applications, as well as via RESTful and SOAP-based APIs. Here, we describe recent improvements to these services and updates made to accommodate the increasing data requirements during the COVID-19 pandemic.


Subject(s)
Sequence Analysis , Software , Humans , Computational Biology , COVID-19/epidemiology , Internet , Pandemics , Sequence Alignment
4.
Curr Protoc Bioinformatics ; 66(1): e74, 2019 06.
Article in English | MEDLINE | ID: mdl-31039604

ABSTRACT

The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of core databases and analysis tools that are of key importance in bioinformatics. As well as providing web interfaces to these resources, web services are available using REST and SOAP protocols that enable programmatic access and allow their integration into other applications and analytical workflows and pipelines. This article describes the various options available to researchers and bioinformaticians who would like to use our resources via the web interface employing RESTful web service clients provided in Perl, Python, and Java, or would like to use Docker containers to integrate the resources into analysis pipelines and workflows. © 2019 by John Wiley & Sons, Inc.


Subject(s)
Databases, Genetic , Internet , Amino Acid Sequence , Knowledge Bases , Phylogeny , Sequence Alignment , Software , User-Computer Interface
5.
Nucleic Acids Res ; 47(W1): W636-W641, 2019 07 02.
Article in English | MEDLINE | ID: mdl-30976793

ABSTRACT

The EMBL-EBI provides free access to popular bioinformatics sequence analysis applications as well as to a full-featured text search engine with powerful cross-referencing and data retrieval capabilities. Access to these services is provided via user-friendly web interfaces and via established RESTful and SOAP Web Services APIs (https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/EMBL-EBI+Web+Services+APIs+-+Data+Retrieval). Both systems have been developed with the same core principles that allow them to integrate an ever-increasing volume of biological data, making them an integral part of many popular data resources provided at the EMBL-EBI. Here, we describe the latest improvements made to the frameworks which enhance the interconnectivity between public EMBL-EBI resources and ultimately enhance biological data discoverability, accessibility, interoperability and reusability.


Subject(s)
Sequence Analysis , Software , Databases, Nucleic Acid , Databases, Protein , Sequence Alignment , Sequence Analysis, Protein
6.
Lancet ; 385(9975): 1305-14, 2015 Apr 04.
Article in English | MEDLINE | ID: mdl-25529582

ABSTRACT

BACKGROUND: Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount.
METHODS: The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team.
FINDINGS: Around 80,000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation.
INTERPRETATION: Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene-phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial.
FUNDING: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health.


Subject(s)
Developmental Disabilities/diagnosis , Genome, Human/genetics , Adolescent , Child , Child, Preschool , Developmental Disabilities/genetics , Female , Genetic Variation/genetics , Genome-Wide Association Study/methods , Heterozygote , Humans , Incidental Findings , Infant , Infant, Newborn , Information Dissemination , Male , Phenotype , Specimen Handling
7.
Nucleic Acids Res ; 40(Database issue): D98-108, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22116062

ABSTRACT

GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies, and improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.


Subject(s)
Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Arthropods/genetics , Genome, Bacterial , Genome, Helminth , Genome, Protozoan , Internet , Vocabulary, Controlled
8.
Nucleic Acids Res ; 38(Database issue): D457-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843604

ABSTRACT

TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Leishmania/genetics , Trypanosoma/genetics , Animals , Computational Biology/trends , Databases, Protein , Genome, Protozoan , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Protozoan Proteins/genetics , Software , User-Computer Interface
9.
Genome Res ; 19(12): 2231-44, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19745113

ABSTRACT

Candida dubliniensis is the closest known relative of Candida albicans, the most pathogenic yeast species in humans. However, despite both species sharing many phenotypic characteristics, including the ability to form true hyphae, C. dubliniensis is a significantly less virulent and less versatile pathogen. Therefore, to identify C. albicans-specific genes that may be responsible for an increased capacity to cause disease, we have sequenced the C. dubliniensis genome and compared it with the known C. albicans genome sequence. Although the two genome sequences are highly similar and synteny is conserved throughout, 168 species-specific genes are identified, including some encoding known hyphal-specific virulence factors, such as the aspartyl proteinases Sap4 and Sap5 and the proposed invasin Als3. Among the 115 pseudogenes confirmed in C. dubliniensis are orthologs of several filamentous growth regulator (FGR) genes that also have suspected roles in pathogenesis. However, the principal differences in genomic repertoire concern expansion of the TLO gene family of putative transcription factors and the IFA family of putative transmembrane proteins in C. albicans, which represent novel candidate virulence-associated factors. The results suggest that the recent evolutionary histories of C. albicans and C. dubliniensis are quite different. While gene families instrumental in pathogenesis have been elaborated in C. albicans, C. dubliniensis has lost genomic capacity and key pathogenic functions. This could explain why C. albicans is a more potent pathogen in humans than C. dubliniensis.


Subject(s)
Candida albicans , Candida , Fungal Proteins , Genome, Fungal , Genomics , Virulence Factors , Candida/classification , Candida/genetics , Candida/pathogenicity , Candida albicans/genetics , Candida albicans/pathogenicity , Fungal Proteins/genetics , Fungal Proteins/metabolism , Gene Order , Humans , Hyphae/genetics , Hyphae/metabolism , Membrane Proteins/genetics , Membrane Proteins/metabolism , Molecular Sequence Data , Phylogeny , Sequence Analysis, DNA , Species Specificity , Synteny , Transcription Factors/genetics , Transcription Factors/metabolism , Virulence , Virulence Factors/genetics , Virulence Factors/metabolism
10.
Nature ; 460(7253): 352-8, 2009 Jul 16.
Article in English | MEDLINE | ID: mdl-19606141

ABSTRACT

Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.


Subject(s)
Genome, Helminth/genetics , Schistosoma mansoni/genetics , Animals , Biological Evolution , Exons/genetics , Genes, Helminth/genetics , Host-Parasite Interactions/genetics , Introns/genetics , Molecular Sequence Data , Physical Chromosome Mapping , Schistosoma mansoni/drug effects , Schistosoma mansoni/embryology , Schistosoma mansoni/physiology , Schistosomiasis mansoni/drug therapy , Schistosomiasis mansoni/parasitology
11.
Bioinformatics ; 24(23): 2672-6, 2008 Dec 01.
Article in English | MEDLINE | ID: mdl-18845581

ABSTRACT

MOTIVATION: Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences.
RESULTS: Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text.
AVAILABILITY: Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/


Subject(s)
Databases, Genetic , Genomics , Software , Databases, Nucleic Acid
13.
Nat Genet ; 39(7): 839-47, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17572675

ABSTRACT

Leishmania parasites cause a broad spectrum of clinical disease. Here we report the sequencing of the genomes of two species of Leishmania: Leishmania infantum and Leishmania braziliensis. The comparison of these sequences with the published genome of Leishmania major reveals marked conservation of synteny and identifies only approximately 200 genes with a differential distribution between the three species. L. braziliensis, contrary to Leishmania species examined so far, possesses components of a putative RNA-mediated interference pathway, telomere-associated transposable elements and spliced leader-associated SLACS retrotransposons. We show that pseudogene formation and gene loss are the principal forces shaping the different genomes. Genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage.


Subject(s)
Genome , Genomics , Leishmania/genetics , Leishmaniasis/parasitology , Amino Acid Sequence , Animals , Humans , Leishmania braziliensis/genetics , Leishmania infantum/genetics , Leishmania major/genetics , Leishmaniasis, Cutaneous/parasitology , Leishmaniasis, Visceral/parasitology , Molecular Sequence Data
14.
Genome Res ; 17(3): 311-9, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17284678

ABSTRACT

Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome--which appears to be representative of the genome--is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.


Subject(s)
Chromosome Structures/genetics , Eimeria tenella/genetics , Genes, Protozoan/genetics , Animals , Base Sequence , Chromosome Mapping , Computational Biology , Minisatellite Repeats/genetics , Molecular Sequence Data , Polymorphism, Restriction Fragment Length , Sequence Analysis, DNA
15.
Science ; 309(5731): 131-3, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15994557

ABSTRACT

Theileria annulata and T. parva are closely related protozoan parasites that cause lymphoproliferative diseases of cattle. We sequenced the genome of T. annulata and compared it with that of T. parva to understand the mechanisms underlying transformation and tropism. Despite high conservation of gene sequences and synteny, the analysis reveals unequally expanded gene families and species-specific genes. We also identify divergent families of putative secreted polypeptides that may reduce immune recognition, candidate regulators of host-cell transformation, and a Theileria-specific protein domain [frequently associated in Theileria (FAINT)] present in a large number of secreted proteins.


Subject(s)
Genome, Protozoan , Protozoan Proteins/genetics , Theileria annulata/genetics , Theileria parva/genetics , Amino Acid Motifs , Animals , Cattle , Cell Proliferation , Chromosome Mapping , Chromosomes/genetics , Conserved Sequence , Genes, Protozoan , Life Cycle Stages , Lipid Metabolism , Lymphocytes/cytology , Lymphocytes/parasitology , Molecular Sequence Data , Multigene Family , Phylogeny , Protein Sorting Signals/genetics , Protein Structure, Tertiary , Proteome , Protozoan Proteins/chemistry , Protozoan Proteins/physiology , Sequence Analysis, DNA , Species Specificity , Synteny , Telomere/genetics , Theileria annulata/growth & development , Theileria annulata/immunology , Theileria annulata/pathogenicity , Theileria parva/growth & development , Theileria parva/immunology , Theileria parva/pathogenicity
16.
Int J Parasitol ; 35(5): 481-93, 2005 Apr 30.
Article in English | MEDLINE | ID: mdl-15826641

ABSTRACT

Centralisation of tools for analysis of genomic data is paramount in ensuring that research is always carried out on the latest currently available data. As such, World Wide Web sites providing a range of online analyses and displays of data can play a crucial role in guaranteeing consistency of in silico work. In this respect, the protozoan parasite research community is served by several resources, either focussing on data and tools for one species or taking a broader view and providing tools for analysis of data from many species, thereby facilitating comparative studies. In this paper, we give a broad overview of the online resources available. We then focus on the GeneDB project, detailing the features and tools currently available through it. Finally, we discuss data curation and its importance in keeping genomic data 'relevant' to the research community.


Subject(s)
Databases, Genetic , Genome, Protozoan , Genomics , Animals , Computational Biology , Information Storage and Retrieval , Online Systems
17.
Nucleic Acids Res ; 32(Database issue): D339-43, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681429

ABSTRACT

GeneDB (http://www.genedb.org/) is a genome database for prokaryotic and eukaryotic organisms. The resource provides a portal through which data generated by the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be made publicly available. It combines data from finished and ongoing genome and expressed sequence tag (EST) projects with curated annotation, that can be searched, sorted and downloaded, using a single web based resource. The current release stores 11 datasets of which six are curated and maintained by biologists, who review and incorporate information from the scientific literature, public databases and the respective research communities.


Subject(s)
Databases, Genetic , Eukaryotic Cells , Genome , Prokaryotic Cells , Animals , Computational Biology , Expressed Sequence Tags , Genomics , Information Storage and Retrieval , Internet