Search | VHL Regional Portal

1.

Screening of global microbiomes implies ecological boundaries impacting the distribution and dissemination of clinically relevant antimicrobial resistance genes.

Lin, Qiang; Xavier, Basil Britto; Alako, Blaise T F; Mitchell, Alex L; Rajakani, Sahaya Glingston; Glupczynski, Youri; Finn, Robert D; Cochrane, Guy; Malhotra-Kumar, Surbhi.

Commun Biol ; 5(1): 1217, 2022 11 18.

Article in English | MEDLINE | ID: mdl-36400841

ABSTRACT

Understanding the myriad pathways by which antimicrobial-resistance genes (ARGs) spread across biomes is necessary to counteract the global menace of antimicrobial resistance. We screened 17939 assembled metagenomic samples covering 21 biomes, differing in sequencing quality and depth, unevenly across 46 countries, 6 continents, and 14 years (2005-2019) for clinically crucial ARGs, mobile colistin resistance (mcr), carbapenem resistance (CR), and (extended-spectrum) beta-lactamase (ESBL and BL) genes. These ARGs were most frequent in human gut, oral and skin biomes, followed by anthropogenic (wastewater, bioreactor, compost, food), and natural biomes (freshwater, marine, sediment). Mcr-9 was the most prevalent mcr gene, spatially and temporally; blaOXA-233 and blaTEM-1 were the most prevalent CR and BL/ESBL genes, but blaGES-2 and blaTEM-116 showed the widest distribution. Redundancy analysis and Bayesian analysis showed ARG distribution was non-random and best-explained by potential host genera and biomes, followed by collection year, anthropogenic factors and collection countries. Preferential ARG occurrence, and potential transmission, between characteristically similar biomes indicate strong ecological boundaries. Our results provide a high-resolution global map of ARG distribution and importantly, identify checkpoint biomes wherein interventions aimed at disrupting ARGs dissemination are likely to be most effective in reducing dissemination and in the long term, the ARG global burden.

Subjects

Anti-Bacterial Agents; Microbiota; Humans; Anti-Bacterial Agents/pharmacology; Drug Resistance, Bacterial/genetics; Bayes Theorem; Microbiota/genetics; Genes, Bacterial

2.

Quantitative monitoring of nucleotide sequence data from genetic resources in context of their citation in the scientific literature.

Lange, Matthias; Alako, Blaise T F; Cochrane, Guy; Ghaffar, Mehmood; Mascher, Martin; Habekost, Pia-Katharina; Hillebrand, Upneet; Scholz, Uwe; Schorch, Florian; Freitag, Jens; Scholz, Amber Hartman.

Gigascience ; 10(12), 2021 12 29.

Article in English | MEDLINE | ID: mdl-34966925

ABSTRACT

BACKGROUND: Linking nucleotide sequence data (NSD) to scientific publication citations can enhance understanding of NSD provenance, scientific use, and reuse in the community. By connecting publications with NSD records, NSD geographical provenance information, and author geographical information, it becomes possible to assess the contribution of NSD to infer trends in scientific knowledge gain at the global level. FINDINGS: We extracted and linked records from the European Nucleotide Archive to citations in open-access publications aggregated at Europe PubMed Central. A total of 8,464,292 ENA accessions with geographical provenance information were associated with publications. We conducted a data quality review to uncover potential issues in publication citation information extraction and author affiliation tagging and developed and implemented best-practice recommendations for citation extraction. We constructed flat data tables and a data warehouse with an interactive web application to enable ad hoc exploration of NSD use and summary statistics. CONCLUSIONS: The extraction and linking of NSD with associated publication citations enables transparency. The quality review contributes to enhanced text mining methods for identifier extraction and use. Furthermore, the global provision and use of NSD enable scientists worldwide to join literature and sequence databases in a multidimensional fashion. As a concrete use case, we visualized statistics of country clusters concerning NSD access in the context of discussions around digital sequence information under the United Nations Convention on Biological Diversity.

Subjects

Data Mining; Nucleotides; Base Sequence; Databases, Nucleic Acid; Europe

3.

Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences.

Blackwell, Grace A; Hunt, Martin; Malone, Kerri M; Lima, Leandro; Horesh, Gal; Alako, Blaise T F; Thomson, Nicholas R; Iqbal, Zamin.

PLoS Biol ; 19(11): e3001421, 2021 11.

Article in English | MEDLINE | ID: mdl-34752446

ABSTRACT

The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed raw sequence data. In both cases, considerable computational effort is required before biological questions can be addressed. Here, we assembled and characterised 661,405 bacterial genomes retrieved from the European Nucleotide Archive (ENA) in November of 2018 using a uniform standardised approach. Of these, 311,006 did not previously have an assembly. We produced a searchable COmpact Bit-sliced Signature (COBS) index, facilitating the easy interrogation of the entire dataset for a specific sequence (e.g., gene, mutation, or plasmid). Additional MinHash and pp-sketch indices support genome-wide comparisons and estimations of genomic distance. Combined, this resource will allow data to be easily subset and searched, phylogenetic relationships between genomes to be quickly elucidated, and hypotheses rapidly generated and tested. We believe that this combination of uniform processing and variety of search/filter functionalities will make this a resource of very wide utility. In terms of diversity within the data, a breakdown of the 639,981 high-quality genomes emphasised the uneven species composition of the ENA/public databases, with just 20 of the total 2,336 species making up 90% of the genomes. The overrepresented species tend to be acute/common human pathogens, aligning with research priorities at different levels from individual interests to funding bodies and national and global public health agencies.

Subjects

Bacteria/genetics; Biodiversity; DNA, Bacterial/genetics; Data Curation; Base Sequence; Drug Resistance, Bacterial/genetics; Species Specificity

4.

The European Nucleotide Archive in 2020.

Harrison, Peter W; Ahamed, Alisha; Aslam, Raheela; Alako, Blaise T F; Burgin, Josephine; Buso, Nicola; Courtot, Mélanie; Fan, Jun; Gupta, Dipayan; Haseeb, Muhammad; Holt, Sam; Ibrahim, Talal; Ivanov, Eugene; Jayathilaka, Suran; Balavenkataraman Kadhirvelu, Vishnukumar; Kumar, Manish; Lopez, Rodrigo; Kay, Simon; Leinonen, Rasko; Liu, Xin; O'Cathail, Colman; Pakseresht, Amir; Park, Youngmi; Pesant, Stephane; Rahman, Nadim; Rajan, Jeena; Sokolov, Alexey; Vijayaraja, Senthilnathan; Waheed, Zahra; Zyoud, Ahmad; Burdett, Tony; Cochrane, Guy.

Nucleic Acids Res ; 49(D1): D82-D85, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33175160

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.

Subjects

Computational Biology/methods; Databases, Nucleic Acid/trends; Nucleic Acids/genetics; Nucleotides/genetics; Databases, Nucleic Acid/statistics & numerical data; Europe; High-Throughput Nucleotide Sequencing; Humans; Internet; Molecular Sequence Annotation; Nucleic Acids/chemistry; Nucleotides/chemistry; Sequence Analysis, DNA; Sequence Analysis, RNA

5.

BacPipe: A Rapid, User-Friendly Whole-Genome Sequencing Pipeline for Clinical Diagnostic Bacteriology.

Xavier, Basil B; Mysara, Mohamed; Bolzan, Mattia; Ribeiro-Gonçalves, Bruno; Alako, Blaise T F; Harrison, Peter; Lammens, Christine; Kumar-Singh, Samir; Goossens, Herman; Carriço, João A; Cochrane, Guy; Malhotra-Kumar, Surbhi.

iScience ; 23(1): 100769, 2020 Jan 24.

Article in English | MEDLINE | ID: mdl-31887656

ABSTRACT

Despite rapid advances in whole genome sequencing (WGS) technologies, their integration into routine microbiological diagnostics has been hampered by the lack of standardized downstream bioinformatics analysis. We developed a comprehensive and computationally low-resource bioinformatics pipeline (BacPipe) enabling direct analyses of bacterial whole-genome sequences (raw reads or contigs) obtained from second- or third-generation sequencing technologies. A graphical user interface was developed to visualize real-time progression of the analysis. The scalability and speed of BacPipe in handling large datasets was demonstrated using 4,139 Illumina paired-end sequence files of publicly available bacterial genomes (2.9-5.4 Mb) from the European Nucleotide Archive. BacPipe is integrated in EBI-SELECTA, a project-specific portal (H2020-COMPARE), and is available as an independent docker image that can be used across Windows- and Unix-based systems. BacPipe offers a fully automated "one-stop" bacterial WGS analysis pipeline to overcome the major hurdle of WGS data analysis in hospitals and public-health and for infection control monitoring.

6.

The European Nucleotide Archive in 2019.

Amid, Clara; Alako, Blaise T F; Balavenkataraman Kadhirvelu, Vishnukumar; Burdett, Tony; Burgin, Josephine; Fan, Jun; Harrison, Peter W; Holt, Sam; Hussein, Abdulrahman; Ivanov, Eugene; Jayathilaka, Suran; Kay, Simon; Keane, Thomas; Leinonen, Rasko; Liu, Xin; Martinez-Villacorta, Josue; Milano, Annalisa; Pakseresht, Amir; Rahman, Nadim; Rajan, Jeena; Reddy, Kethi; Richards, Edward; Smirnov, Dmitriy; Sokolov, Alexey; Vijayaraja, Senthilnathan; Cochrane, Guy.

Nucleic Acids Res ; 48(D1): D70-D76, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31722421

ABSTRACT

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.

Subjects

Computational Biology; Databases, Nucleic Acid; Genomics; Computational Biology/methods; Europe; Genomics/methods; Molecular Sequence Annotation; Software; User-Computer Interface; Web Browser

7.

The COMPARE Data Hubs.

Amid, Clara; Pakseresht, Nima; Silvester, Nicole; Jayathilaka, Suran; Lund, Ole; Dynovski, Lukasz D; Pataki, Bálint Á; Visontai, Dávid; Xavier, Basil Britto; Alako, Blaise T F; Belka, Ariane; Cisneros, Jose L B; Cotten, Matthew; Haringhuizen, George B; Harrison, Peter W; Höper, Dirk; Holt, Sam; Hundahl, Camilla; Hussein, Abdulrahman; Kaas, Rolf S; Liu, Xin; Leinonen, Rasko; Malhotra-Kumar, Surbhi; Nieuwenhuijse, David F; Rahman, Nadim; Dos S Ribeiro, Carolina; Skiby, Jeffrey E; Schmitz, Dennis; Stéger, József; Szalai-Gindl, János M; Thomsen, Martin C F; Cacciò, Simone M; Csabai, István; Kroneman, Annelies; Koopmans, Marion; Aarestrup, Frank; Cochrane, Guy.

Database (Oxford) ; 2019, 2019 01 01.

Article in English | MEDLINE | ID: mdl-31868882

ABSTRACT

Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory and institutional challenges, as well as lack of suitable platforms which provide an infrastructure for data sharing in structured formats, often lead to data not being shared or at most shared in form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.

Subjects

Databases, Factual; Information Dissemination; Bacteria/classification; Metagenomics; Phylogeny; User-Computer Interface

8.

The European Nucleotide Archive in 2018.

Harrison, Peter W; Alako, Blaise; Amid, Clara; Cerdeño-Tárraga, Ana; Cleland, Iain; Holt, Sam; Hussein, Abdulrahman; Jayathilaka, Suran; Kay, Simon; Keane, Thomas; Leinonen, Rasko; Liu, Xin; Martínez-Villacorta, Josué; Milano, Annalisa; Pakseresht, Nima; Rajan, Jeena; Reddy, Kethi; Richards, Edward; Rosello, Marc; Silvester, Nicole; Smirnov, Dmitriy; Toribio, Ana-Luisa; Vijayaraja, Senthilnathan; Cochrane, Guy.

Nucleic Acids Res ; 47(D1): D84-D88, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30395270

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided from EMBL-EBI, has for more than three decades been responsible for archiving the world's public sequencing data and presenting this important resource to the scientific community to support and accelerate the global research effort. Here, we outline ENA services and content in 2018 and provide an overview of a selection of focus areas of development work: extending data coordination services around ENA, sequence submissions through template expansion, early pre-submission validation tools and our move towards a new browser and retrieval infrastructure.

Subjects

Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; Europe; Genome; Humans; Molecular Sequence Annotation; Search Engine; Software; Transcriptome; User-Computer Interface; Web Browser

9.

EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies.

Mitchell, Alex L; Scheremetjew, Maxim; Denise, Hubert; Potter, Simon; Tarkowska, Aleksandra; Qureshi, Matloob; Salazar, Gustavo A; Pesseat, Sebastien; Boland, Miguel A; Hunter, Fiona M I; Ten Hoopen, Petra; Alako, Blaise; Amid, Clara; Wilkinson, Darren J; Curtis, Thomas P; Cochrane, Guy; Finn, Robert D.

Nucleic Acids Res ; 46(D1): D726-D735, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29069476

ABSTRACT

EBI metagenomics (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the analysis and archiving of sequence data derived from the microbial populations found in a particular environment. Over the past two years, EBI metagenomics has increased the number of datasets analysed 10-fold. In addition to increased throughput, the underlying analysis pipeline has been overhauled to include both new or updated tools and reference databases. Of particular note is a new workflow for taxonomic assignments that has been extended to include assignments based on both the large and small subunit RNA marker genes and to encompass all cellular micro-organisms. We also describe the addition of metagenomic assembly as a new analysis service. Our pilot studies have produced over 2400 assemblies from datasets in the public domain. From these assemblies, we have produced a searchable, non-redundant protein database of over 50 million sequences. To provide improved access to the data stored within the resource, we have developed a programmatic interface that provides access to the analysis results and associated sample metadata. Finally, we have integrated the results of a series of statistical analyses that provide estimations of diversity and sample comparisons.

Subjects

Databases, Genetic; Metagenomics; Microbiota; Algorithms; Base Sequence; Classification/methods; Datasets as Topic; Metagenomics/methods; RNA, Archaeal/genetics; RNA, Bacterial/genetics; RNA, Viral/genetics; Ribotyping; Software; Transcriptome; User-Computer Interface; Web Browser; Workflow

10.

The European Nucleotide Archive in 2017.

Silvester, Nicole; Alako, Blaise; Amid, Clara; Cerdeño-Tarrága, Ana; Clarke, Laura; Cleland, Iain; Harrison, Peter W; Jayathilaka, Suran; Kay, Simon; Keane, Thomas; Leinonen, Rasko; Liu, Xin; Martínez-Villacorta, Josué; Menchi, Manuela; Reddy, Kethi; Pakseresht, Nima; Rajan, Jeena; Rossello, Marc; Smirnov, Dmitriy; Toribio, Ana L; Vaughan, Daniel; Zalunin, Vadim; Cochrane, Guy.

Nucleic Acids Res ; 46(D1): D36-D40, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29140475

ABSTRACT

For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.

Subjects

Databases, Nucleic Acid; Computational Biology; Databases, Nucleic Acid/trends; Europe; High-Throughput Nucleotide Sequencing; Humans; Information Storage and Retrieval; Internet; Molecular Sequence Annotation

11.

European Nucleotide Archive in 2016.

Toribio, Ana Luisa; Alako, Blaise; Amid, Clara; Cerdeño-Tarrága, Ana; Clarke, Laura; Cleland, Iain; Fairley, Susan; Gibson, Richard; Goodgame, Neil; Ten Hoopen, Petra; Jayathilaka, Suran; Kay, Simon; Leinonen, Rasko; Liu, Xin; Martínez-Villacorta, Josué; Pakseresht, Nima; Rajan, Jeena; Reddy, Kethi; Rosello, Marc; Silvester, Nicole; Smirnov, Dmitriy; Vaughan, Daniel; Zalunin, Vadim; Cochrane, Guy.

Nucleic Acids Res ; 45(D1): D32-D36, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899630

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.

Subjects

Databases, Nucleic Acid; Sequence Analysis, DNA; Sequence Analysis, RNA; Genomics; Internet; Molecular Sequence Annotation

12.

Patterns of database citation in articles and patents indicate long-term scientific and industry value of biological data resources.

Bousfield, David; McEntyre, Johanna; Velankar, Sameer; Papadatos, George; Bateman, Alex; Cochrane, Guy; Kim, Jee-Hyub; Graef, Florian; Vartak, Vid; Alako, Blaise; Blomberg, Niklas.

F1000Res ; 5, 2016.

Article in English | MEDLINE | ID: mdl-27092246

ABSTRACT

Data from open access biomolecular data resources, such as the European Nucleotide Archive and the Protein Data Bank are extensively reused within life science research for comparative studies, method development and to derive new scientific insights. Indicators that estimate the extent and utility of such secondary use of research data need to reflect this complex and highly variable data usage. By linking open access scientific literature, via Europe PubMedCentral, to the metadata in biological data resources we separate data citations associated with a deposition statement from citations that capture the subsequent, long-term, reuse of data in academia and industry. We extend this analysis to begin to investigate citations of biomolecular resources in patent documents. We find citations in more than 8,000 patents from 2014, demonstrating substantial use and an important role for data resources in defining biological concepts in granted patents to both academic and industrial innovators. Combined together our results indicate that the citation patterns in biomedical literature and patents vary, not only due to citation practice but also according to the data resource cited. The results guard against the use of simple metrics such as citation counts and show that indicators of data use must not only take into account citations within the biomedical literature but also include reuse of data in industry and other parts of society by including patents and other scientific and technical documents such as guidelines, reports and grant applications.

13.

Biocuration of functional annotation at the European nucleotide archive.

Gibson, Richard; Alako, Blaise; Amid, Clara; Cerdeño-Tárraga, Ana; Cleland, Iain; Goodgame, Neil; Ten Hoopen, Petra; Jayathilaka, Suran; Kay, Simon; Leinonen, Rasko; Liu, Xin; Pallreddy, Swapna; Pakseresht, Nima; Rajan, Jeena; Rosselló, Marc; Silvester, Nicole; Smirnov, Dmitriy; Toribio, Ana Luisa; Vaughan, Daniel; Zalunin, Vadim; Cochrane, Guy.

Nucleic Acids Res ; 44(D1): D58-66, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26615190

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the submission, maintenance and presentation of nucleotide sequence data and related sample and experimental information. In this article we report on ENA in 2015 regarding general activity, notable published data sets and major achievements. This is followed by a focus on sustainable biocuration of functional annotation, an area which has particularly felt the pressure of sequencing growth. The importance of functional annotation, how it can be submitted and the shifting role of the biocurator in the context of increasing volumes of data are all discussed.

Subjects

Databases, Nucleic Acid; Molecular Sequence Annotation; Sequence Analysis, DNA; Sequence Analysis, RNA; Data Curation

14.

Content discovery and retrieval services at the European Nucleotide Archive.

Silvester, Nicole; Alako, Blaise; Amid, Clara; Cerdeño-Tárraga, Ana; Cleland, Iain; Gibson, Richard; Goodgame, Neil; Ten Hoopen, Petra; Kay, Simon; Leinonen, Rasko; Li, Weizhong; Liu, Xin; Lopez, Rodrigo; Pakseresht, Nima; Pallreddy, Swapna; Plaister, Sheila; Radhakrishnan, Rajesh; Rossello, Marc; Senf, Alexander; Smirnov, Dmitriy; Toribio, Ana Luisa; Vaughan, Daniel; Zalunin, Vadim; Cochrane, Guy.

Nucleic Acids Res ; 43(Database issue): D23-9, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25404130

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.

Subjects

Databases, Nucleic Acid; Base Sequence; Genomics; Molecular Sequence Annotation; Sequence Analysis

15.

Assembly information services in the European Nucleotide Archive.

Pakseresht, Nima; Alako, Blaise; Amid, Clara; Cerdeño-Tárraga, Ana; Cleland, Iain; Gibson, Richard; Goodgame, Neil; Gur, Tamer; Jang, Mikyung; Kay, Simon; Leinonen, Rasko; Li, Weizhong; Liu, Xin; Lopez, Rodrigo; McWilliam, Hamish; Oisel, Arnaud; Pallreddy, Swapna; Plaister, Sheila; Radhakrishnan, Rajesh; Rivière, Stephane; Rossello, Marc; Senf, Alexander; Silvester, Nicole; Smirnov, Dmitriy; Squizzato, Silvano; ten Hoopen, Petra; Toribio, Ana Luisa; Vaughan, Daniel; Zalunin, Vadim; Cochrane, Guy.

Nucleic Acids Res ; 42(Database issue): D38-43, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24214989

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.

Subjects

Databases, Nucleic Acid; Genomics; Europe; Internet

16.

Facing growth in the European Nucleotide Archive.

Cochrane, Guy; Alako, Blaise; Amid, Clara; Bower, Lawrence; Cerdeño-Tárraga, Ana; Cleland, Iain; Gibson, Richard; Goodgame, Neil; Jang, Mikyung; Kay, Simon; Leinonen, Rasko; Lin, Xiu; Lopez, Rodrigo; McWilliam, Hamish; Oisel, Arnaud; Pakseresht, Nima; Pallreddy, Swapna; Park, Youngmi; Plaister, Sheila; Radhakrishnan, Rajesh; Rivière, Stephane; Rossello, Marc; Senf, Alexander; Silvester, Nicole; Smirnov, Dmitriy; Ten Hoopen, Petra; Toribio, Ana; Vaughan, Daniel; Zalunin, Vadim.

Nucleic Acids Res ; 41(Database issue): D30-5, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23203883

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to integrated content across the breadth of ENA. Second, we detail our plans for the deployment of CRAM data compression technology in ENA.

Subjects

Base Sequence; Databases, Nucleic Acid; Data Compression; Genomics; High-Throughput Nucleotide Sequencing; Internet; User-Computer Interface

17.

A major role for the Plasmodium falciparum ApiAP2 protein PfSIP2 in chromosome end biology.

Flueck, Christian; Bartfai, Richard; Niederwieser, Igor; Witmer, Kathrin; Alako, Blaise T F; Moes, Suzette; Bozdech, Zbynek; Jenoe, Paul; Stunnenberg, Hendrik G; Voss, Till S.

PLoS Pathog ; 6(2): e1000784, 2010 Feb 26.

Article in English | MEDLINE | ID: mdl-20195509

ABSTRACT

The heterochromatic environment and physical clustering of chromosome ends at the nuclear periphery provide a functional and structural framework for antigenic variation and evolution of subtelomeric virulence gene families in the malaria parasite Plasmodium falciparum. While recent studies assigned important roles for reversible histone modifications, silent information regulator 2 and heterochromatin protein 1 (PfHP1) in epigenetic control of variegated expression, factors involved in the recruitment and organization of subtelomeric heterochromatin remain unknown. Here, we describe the purification and characterization of PfSIP2, a member of the ApiAP2 family of putative transcription factors, as the unknown nuclear factor interacting specifically with cis-acting SPE2 motif arrays in subtelomeric domains. Interestingly, SPE2 is not bound by the full-length protein but rather by a 60 kDa N-terminal domain, PfSIP2-N, which is released during schizogony. Our experimental re-definition of the SPE2/PfSIP2-N interaction highlights the strict requirement of both adjacent AP2 domains and a conserved bipartite SPE2 consensus motif for high-affinity binding. Genome-wide in silico mapping identified 777 putative binding sites, 94% of which cluster in heterochromatic domains upstream of subtelomeric var genes and in telomere-associated repeat elements. Immunofluorescence and chromatin immunoprecipitation (ChIP) assays revealed co-localization of PfSIP2-N with PfHP1 at chromosome ends. Genome-wide ChIP demonstrated the exclusive binding of PfSIP2-N to subtelomeric SPE2 landmarks in vivo but not to single chromosome-internal sites. Consistent with this specialized distribution pattern, PfSIP2-N over-expression has no effect on global gene transcription. Hence, contrary to the previously proposed role for this factor in gene activation, our results provide strong evidence for the first time for the involvement of an ApiAP2 factor in heterochromatin formation and genome integrity. These findings are highly relevant for our understanding of chromosome end biology and variegated expression in P. falciparum and other eukaryotes, and for the future analysis of the role of ApiAP2-DNA interactions in parasite biology.

Subjects

Chromosomal Proteins, Non-Histone/genetics; Chromosomes/genetics; Gene Expression Regulation/genetics; Plasmodium falciparum/genetics; Protozoan Proteins/metabolism; Transcription Factors/metabolism; Blotting, Southern; Blotting, Western; Chromatin Immunoprecipitation; Chromobox Protein Homolog 5; Electrophoretic Mobility Shift Assay; Fluorescent Antibody Technique; Genes, Protozoan; Heterochromatin; Reverse Transcriptase Polymerase Chain Reaction

18.

Plasmodium falciparum heterochromatin protein 1 marks genomic loci linked to phenotypic variation of exported virulence factors.

Flueck, Christian; Bartfai, Richard; Volz, Jennifer; Niederwieser, Igor; Salcedo-Amaya, Adriana M; Alako, Blaise T F; Ehlgen, Florian; Ralph, Stuart A; Cowman, Alan F; Bozdech, Zbynek; Stunnenberg, Hendrik G; Voss, Till S.

PLoS Pathog ; 5(9): e1000569, 2009 Sep.

Article in English | MEDLINE | ID: mdl-19730695

ABSTRACT

Epigenetic processes are the main conductors of phenotypic variation in eukaryotes. The malaria parasite Plasmodium falciparum employs antigenic variation of the major surface antigen PfEMP1, encoded by 60 var genes, to evade acquired immune responses. Antigenic variation of PfEMP1 occurs through in situ switches in mono-allelic var gene transcription, which is PfSIR2-dependent and associated with the presence of repressive H3K9me3 marks at silenced loci. Here, we show that P. falciparum heterochromatin protein 1 (PfHP1) binds specifically to H3K9me3 but not to other repressive histone methyl marks. Based on nuclear fractionation and detailed immuno-localization assays, PfHP1 constitutes a major component of heterochromatin in perinuclear chromosome end clusters. High-resolution genome-wide chromatin immuno-precipitation demonstrates the striking association of PfHP1 with virulence gene arrays in subtelomeric and chromosome-internal islands and a high correlation with previously mapped H3K9me3 marks. These include not only var genes, but also the majority of P. falciparum lineage-specific gene families coding for exported proteins involved in host-parasite interactions. In addition, we identified a number of PfHP1-bound genes that were not enriched in H3K9me3, many of which code for proteins expressed during invasion or at different life cycle stages. Interestingly, PfHP1 is absent from centromeric regions, implying important differences in centromere biology between P. falciparum and its human host. Over-expression of PfHP1 results in an enhancement of variegated expression and highlights the presence of well-defined heterochromatic boundaries. In summary, we identify PfHP1 as a major effector of virulence gene silencing and phenotypic variation. Our results are instrumental for our understanding of this widely used survival strategy in unicellular pathogens.

Subjects

Non-Histone Chromosomal Proteins/genetics , Plasmodium falciparum/genetics , Protozoan Proteins/genetics , Virulence Factors/genetics , Animals , Cell Nucleus/metabolism , Centromere/metabolism , Chromobox Protein Homolog 5 , Non-Histone Chromosomal Proteins/metabolism , Chromosomes , Gene Silencing , Protozoan Genome , Multigene Family , Oligonucleotide Array Sequence Analysis , Phenotype , Plasmodium falciparum/pathogenicity , Protozoan Proteins/metabolism , Reproducibility of Results , Virulence Factors/metabolism

19.

Dynamic histone H3 epigenome marking during the intraerythrocytic cycle of Plasmodium falciparum.

Salcedo-Amaya, Adriana M; van Driel, Marc A; Alako, Blaise T; Trelle, Morten B; van den Elzen, Antonia M G; Cohen, Adrian M; Janssen-Megens, Eva M; van de Vegte-Bolmer, Marga; Selzer, Rebecca R; Iniguez, A Leonardo; Green, Roland D; Sauerwein, Robert W; Jensen, Ole N; Stunnenberg, Hendrik G.

Proc Natl Acad Sci U S A ; 106(24): 9655-60, 2009 Jun 16.

Article in English | MEDLINE | ID: mdl-19497874

ABSTRACT

Epigenome profiling has led to the paradigm that promoters of active genes are decorated with H3K4me3 and H3K9ac marks. To explore the epigenome of Plasmodium falciparum asexual stages, we performed MS analysis of histone modifications and found a general preponderance of H3/H4 acetylation and H3K4me3. ChIP-on-chip profiling of H3, H3K4me3, H3K9me3, and H3K9ac from asynchronous parasites revealed an extensively euchromatic epigenome with heterochromatin restricted to variant surface antigen gene families (VSA) and a number of genes hitherto unlinked to VSA. Remarkably, the vast majority of the genome shows an unexpected pattern of enrichment of H3K4me3 and H3K9ac. Analysis of synchronized parasites revealed significant developmental stage specificity of the epigenome. In rings, H3K4me3 and H3K9ac are homogenous across the genes marking active and inactive genes equally, whereas in schizonts, they are enriched at the 5' end of active genes. This study reveals an unforeseen and unique plasticity in the use of the epigenetic marks and implies the presence of distinct epigenetic pathways in gene silencing/activation throughout the erythrocytic cycle.

Subjects

Erythrocytes/parasitology , Protozoan Genome , Histones/genetics , Plasmodium falciparum/genetics , Animals , Chromatin Immunoprecipitation , Heterochromatin/metabolism , Histones/metabolism , Mass Spectrometry , Oligonucleotide Array Sequence Analysis , Plasmodium falciparum/physiology

20.

TreeDomViewer: a tool for the visualization of phylogeny and protein domain structure.

Alako, Blaise T F; Rainey, Daphne; Nijveen, Harm; Leunissen, Jack A M.

Nucleic Acids Res ; 34(Web Server issue): W104-9, 2006 Jul 01.

Article in English | MEDLINE | ID: mdl-16844970

ABSTRACT

Phylogenetic analysis and examination of protein domains allow accurate genome annotation and are invaluable to study proteins and protein complex evolution. However, two sequences can be homologous without sharing statistically significant amino acid or nucleotide identity, presenting a challenging bioinformatics problem. We present TreeDomViewer, a visualization tool available as a web-based interface that combines phylogenetic tree description, multiple sequence alignment and InterProScan data of sequences and generates a phylogenetic tree projecting the corresponding protein domain information onto the multiple sequence alignment. Thereby it makes use of existing domain prediction tools such as InterProScan. TreeDomViewer adopts an evolutionary perspective on how domain structure of two or more sequences can be aligned and compared, to subsequently infer the function of an unknown homolog. This provides insight into the function assignment of, in terms of amino acid substitution, very divergent but yet closely related family members. Our tool produces an interactive scalable vector graphics image that provides orthological relationship and domain content of proteins of interest at one glance. In addition, PDF, JPEG or PNG formatted output is also provided. These features make TreeDomViewer a valuable addition to the annotation pipeline of unknown genes or gene products. TreeDomViewer is available at http://www.bioinformatics.nl/tools/treedom/.

Subjects

Computer Graphics , Phylogeny , Tertiary Protein Structure , Proteins/classification , Software , Internet , Proteins/genetics , Sequence Alignment , Protein Sequence Analysis , Software Design
