Results 1 - 20 of 83
1.
Sci Data ; 10(1): 758, 2023 11 03.
Article in English | MEDLINE | ID: mdl-37923731

ABSTRACT

Articular cartilage has very limited regenerative capacity in humans. Tissue engineering techniques for cartilage damage repair are limited in the production of hyaline cartilage. Mesenchymal stem/stromal cells (MSCs) are multipotent stem cells and can be differentiated into mature cartilage cells, chondrocytes, which could be used for repairing damaged cartilage. Chondrogenesis is a highly complex, relatively inefficient process lasting over 3 weeks in vitro. Methods: To better understand chondrogenic differentiation, especially the commitment phase, we performed transcriptional profiling of MSC differentiation into chondrocytes at early timepoints, from 15 minutes to 16 hours after induction, and of fully differentiated chondrocytes at 21 days, in triplicate.


Subject(s)
Cell Differentiation , Chondrocytes , Mesenchymal Stem Cells , Humans , Cartilage, Articular , Transcriptome
2.
Methods Mol Biol ; 2231: 3-16, 2021.
Article in English | MEDLINE | ID: mdl-33289883

ABSTRACT

Clustal Omega is a version, completely rewritten and revised in 2011, of the widely used Clustal series of programs for multiple sequence alignment. It can deal with very large numbers (many tens of thousands) of DNA/RNA or protein sequences due to its use of the mBed algorithm for calculating guide-trees. This algorithm allows very large alignment problems to be tackled very quickly, even on personal computers. The accuracy of the program has been considerably improved over earlier Clustal programs, through the use of the HHalign method for aligning profile hidden Markov models. The program is currently run from the command line or online.


Subject(s)
Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Base Sequence , Software
3.
Bioinformatics ; 36(1): 90-95, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31292629

ABSTRACT

MOTIVATION: Secondary structure prediction accuracy (SSPA) in the QuanTest benchmark can be used to measure accuracy of a multiple sequence alignment. SSPA correlates well with the sum-of-pairs (SP) score if the results are averaged over many alignments, but not on an alignment-by-alignment basis. This is due to a sub-optimal selection of reference and non-reference sequences in QuanTest. RESULTS: We develop an improved strategy for selecting reference and non-reference sequences for a new benchmark, QuanTest2. In QuanTest2, SSPA and SP correlate better on an alignment-by-alignment basis than in QuanTest. Guide-trees for QuanTest2 are more balanced with respect to reference sequences than in QuanTest. QuanTest2 scores correlate well with other well-established benchmarks. AVAILABILITY AND IMPLEMENTATION: QuanTest2 is available at http://bioinf.ucd.ie/quantest2.tar and comprises reference and non-reference sequence sets and a scoring script. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Benchmarking , Sequence Alignment , Benchmarking/methods , Protein Structure, Secondary , Sequence Alignment/methods , Software
4.
PLoS Genet ; 14(5): e1007429, 2018 05.
Article in English | MEDLINE | ID: mdl-29852014

ABSTRACT

Riboswitches are non-coding RNA molecules that regulate gene expression by binding to specific ligands. They are primarily found in bacteria. However, one riboswitch type, the thiamin pyrophosphate (TPP) riboswitch, has also been described in some plants, marine protists and fungi. We find that riboswitches are widespread in the budding yeasts (Saccharomycotina), and they are most common in homologs of DUR31, originally described as a spermidine transporter. We show that DUR31 (an ortholog of N. crassa gene NCU01977) encodes a thiamin transporter in Candida species. Using an RFP/riboswitch expression system, we show that the functional elements of the riboswitch are contained within the native intron of DUR31 from Candida parapsilosis, and that the riboswitch regulates splicing in a thiamin-dependent manner when RFP is constitutively expressed. The DUR31 gene has been lost from Saccharomyces, and may have been displaced by an alternative thiamin transporter. TPP riboswitches are also present in other putative transporters in yeasts and filamentous fungi. However, they are rare in thiamin biosynthesis genes THI4 and THI5 in the Saccharomycotina, and have been lost from all genes in the sequenced species in the family Saccharomycetaceae, including S. cerevisiae.


Subject(s)
Candida parapsilosis/genetics , Fungal Proteins/genetics , Membrane Transport Proteins/genetics , Riboswitch/genetics , Thiamine/metabolism , Biological Transport, Active/genetics , Candida parapsilosis/metabolism , Introns/genetics , Neurospora crassa/genetics , Saccharomyces/genetics
5.
Mol Biol Evol ; 35(6): 1390-1406, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29562344

ABSTRACT

The olfactory receptor (OR) gene families, which govern mammalian olfaction, have undergone extensive expansion and contraction through duplication and pseudogenization. Previous studies have shown that broadly defined environmental adaptations (e.g., terrestrial vs. aquatic) are correlated with the number of functional and non-functional OR genes retained. However, to date, no study has examined species-specific gene duplications in multiple phylogenetically divergent mammals to elucidate OR evolution and adaptation. Here, we identify the OR gene families driving adaptation to different ecological niches by mapping the fate of species-specific gene duplications in the OR repertoire of 94 diverse mammalian taxa, using molecular phylogenomic methods. We analyze >70,000 OR gene sequences mined from whole genomes, generated from novel amplicon sequencing data, and collated with data from previous studies, comprising one of the largest OR studies to date. For the first time, we demonstrate statistically significant patterns of OR species-specific gene duplications associated with the presence of a functioning vomeronasal organ. With respect to dietary niche, we uncover a novel link between a large number of duplications in OR family 5/8/9 and herbivory. Our results also highlight differences between social and solitary niches, indicating that a greater OR repertoire expansion may be associated with a solitary lifestyle. This study demonstrates the utility of species-specific duplications in elucidating gene family evolution, revealing how the OR repertoire has undergone expansion and contraction with respect to a number of ecological adaptations in mammals.


Subject(s)
Adaptation, Biological , Biological Evolution , Mammals/genetics , Multigene Family , Receptors, Odorant/genetics , Animals , Ecosystem , Gene Duplication , Species Specificity
6.
PLoS One ; 13(2): e0192898, 2018.
Article in English | MEDLINE | ID: mdl-29444186

ABSTRACT

Metagenomics uses nucleic acid sequencing to characterize species diversity in different niches such as environmental biomes or the human microbiome. Most studies have used 16S rRNA amplicon sequencing to identify bacteria. However, the decreasing cost of sequencing has resulted in a gradual shift away from amplicon analyses and towards shotgun metagenomic sequencing. Shotgun metagenomic data can be used to identify a wide range of species, but have rarely been applied to fungal identification. Here, we develop a sequence classification pipeline, FindFungi, and use it to identify fungal sequences in public metagenome datasets. We focus primarily on animal metagenomes, especially those from pig and mouse microbiomes. We identified fungi in 39 of 70 datasets comprising 71 fungal species. At least 11 pathogenic species with zoonotic potential were identified, including Candida tropicalis. We identified Pseudogymnoascus species from 13 Antarctic soil samples initially analyzed for the presence of bacteria capable of degrading diesel oil. We also show that Candida tropicalis and Candida loboi are likely the same species. In addition, we identify several examples where contaminating DNA was erroneously included in fungal genome assemblies.


Subject(s)
Fungi/genetics , Metagenomics , Animals , Antarctic Regions , Ascomycota/classification , Ascomycota/genetics , Ascomycota/isolation & purification , Candida tropicalis/genetics , Candida tropicalis/pathogenicity , Databases, Genetic , Fungi/classification , Fungi/pathogenicity , Humans , Metagenome , Mice , Microbiota/genetics , Phylogeny , Soil Microbiology , Swine , Zoonoses/microbiology
7.
Protein Sci ; 27(1): 135-145, 2018 01.
Article in English | MEDLINE | ID: mdl-28884485

ABSTRACT

Clustal Omega is a widely used package for carrying out multiple sequence alignment. Here, we describe some recent additions to the package and benchmark some alternative ways of making alignments. These benchmarks are based on protein structure comparisons or predictions and include a recently described method based on secondary structure prediction. In general, Clustal Omega is fast enough to make very large alignments and the accuracy of protein alignments is high when compared to alternative packages. The package is freely available as executables or source code from www.clustal.org or can be run on-line from a variety of sites, especially the EBI www.ebi.ac.uk.


Subject(s)
Proteins/genetics , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , Proteins/chemistry
8.
Cell Death Differ ; 24(11): 1975-1986, 2017 11.
Article in English | MEDLINE | ID: mdl-28885616

ABSTRACT

We have previously reported that myeloid differentiation primary response gene 88 (MyD88) is downregulated during all-trans retinoic acid (RA)-induced differentiation of pluripotent NTera2 human embryonal carcinoma cells (hECCs), whereas its maintained expression is associated with RA differentiation resistance in nullipotent 2102Ep hECCs. MyD88 is the main adapter for toll-like receptor (TLR) signalling, where it determines the secretion of chemokines and cytokines in response to pathogens. In this study, we report that loss of MyD88 is essential for RA-facilitated differentiation of hECCs. Functional analysis using a specific MyD88 peptide inhibitor (PepInh) demonstrated that high MyD88 expression in the self-renewal state inhibits the expression of a specific set of HOX genes. In NTera2 cells, MyD88 is downregulated during RA-induced differentiation, a mechanism that could be broadly replicated by MyD88 PepInh treatment of 2102Ep cells. Notably, MyD88 inhibition transitioned 2102Ep cells into a stable, self-renewing state that appears to be primed for differentiation upon addition of RA. At a molecular level, MyD88 inhibition combined with RA treatment upregulated HOX, RA signalling and TLR signalling genes. These events permit differentiation through a standard downregulation of Oct4-Sox2-Nanog mechanism. In line with its role in regulating secretion of specific proteins, conditioned media experiments demonstrated that differentiated (MyD88 low) NTera2 cell media was sufficient to differentiate NTera2 cells. Protein array analysis indicated that this was owing to secretion of factors known to regulate angiogenesis, neurogenesis and all three branches of TGF-β Superfamily signalling. Collectively, these data offer new insights into RA controlled differentiation of pluripotent cells, with notable parallels to the ground state model of embryonic stem cell self-renewal. These data may provide insights to facilitate improved differentiation protocols for regenerative medicine and differentiation-therapies in cancer treatment.


Subject(s)
Cell Differentiation/drug effects , Embryonal Carcinoma Stem Cells/pathology , Myeloid Differentiation Factor 88/metabolism , Pluripotent Stem Cells/pathology , Tretinoin/pharmacology , Cell Differentiation/genetics , Cell Self Renewal/drug effects , Cell Self Renewal/genetics , Embryonal Carcinoma Stem Cells/drug effects , Embryonal Carcinoma Stem Cells/metabolism , Gene Expression Regulation, Neoplastic/drug effects , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Mesoderm/pathology , Models, Biological , Pluripotent Stem Cells/drug effects , Pluripotent Stem Cells/metabolism , Up-Regulation/drug effects , Up-Regulation/genetics
9.
Genome Med ; 9(1): 15, 2017 02 10.
Article in English | MEDLINE | ID: mdl-28187790

ABSTRACT

BACKGROUND: Retinoid therapy is widely employed in clinical oncology to differentiate malignant cells into their more benign counterparts. However, certain high-risk cohorts, such as patients with MYCN-amplified neuroblastoma, are innately resistant to retinoid therapy. Therefore, we employed a precision medicine approach to globally profile the retinoid signalling response and to determine how an excess of cellular MYCN antagonises these signalling events to prevent differentiation and confer resistance. METHODS: We applied RNA sequencing (RNA-seq) and interaction proteomics coupled with network-based systems level analysis to identify targetable vulnerabilities of MYCN-mediated retinoid resistance. We altered MYCN expression levels in a MYCN-inducible neuroblastoma cell line to facilitate or block retinoic acid (RA)-mediated neuronal differentiation. The relevance of differentially expressed genes and transcriptional regulators for neuroblastoma outcome was then confirmed using existing patient microarray datasets. RESULTS: We determined the signalling networks through which RA mediates neuroblastoma differentiation and the inhibitory perturbations to these networks upon MYCN overexpression. We revealed opposing regulation of RA and MYCN on a number of differentiation-relevant genes, including LMO4, CYP26A1, ASCL1, RET, FZD7 and DKK1. Furthermore, we revealed a broad network of transcriptional regulators involved in regulating retinoid responsiveness, such as Neurotrophin, PI3K, Wnt and MAPK, and epigenetic signalling. Of these regulators, we functionally confirmed that MYCN-driven inhibition of transforming growth factor beta (TGF-β) signalling is a vulnerable node of the MYCN network and that multiple levels of cross-talk exist between MYCN and TGF-β. Co-targeting of the retinoic acid and TGF-β pathways, through RA and kartogenin (KGN; a TGF-β signalling activating small molecule) combination treatment, induced the loss of viability of MYCN-amplified retinoid-resistant neuroblastoma cells. CONCLUSIONS: Our approach provides a powerful precision oncology tool for identifying the driving signalling networks for malignancies not primarily driven by somatic mutations, such as paediatric cancers. By applying global omics approaches to the signalling networks regulating neuroblastoma differentiation and stemness, we have determined the pathways involved in the MYCN-mediated retinoid resistance, with TGF-β signalling being a key regulator. These findings revealed a number of combination treatments likely to improve clinical response to retinoid therapy, including co-treatment with retinoids and KGN, which may prove valuable in the treatment of high-risk MYCN-amplified neuroblastoma.


Subject(s)
Anilides/therapeutic use , N-Myc Proto-Oncogene Protein/drug effects , Neuroblastoma/drug therapy , Phthalic Acids/therapeutic use , Signal Transduction , Transforming Growth Factor beta/drug effects , Tretinoin/therapeutic use , Antineoplastic Agents/therapeutic use , Cell Line, Tumor , Drug Resistance, Neoplasm , Gene Expression Regulation, Neoplastic , Humans , Neuroblastoma/genetics , Neuroblastoma/metabolism , Precision Medicine , Retinoids/therapeutic use
10.
Bioinformatics ; 33(9): 1331-1337, 2017 05 01.
Article in English | MEDLINE | ID: mdl-28093407

ABSTRACT

Motivation: Multiple sequence alignment (MSA) is commonly used to analyze sets of homologous protein or DNA sequences. This has led to the development of many methods and packages for MSA over the past 30 years. Being able to compare different methods has been problematic and has relied on gold standard benchmark datasets of 'true' alignments or on MSA simulations. A number of protein benchmark datasets have been produced which rely on a combination of manual alignment and/or automated superposition of protein structures. These are either restricted to very small MSAs with few sequences or require manual alignment which can be subjective. In both cases, it remains very difficult to properly test MSAs of more than a few dozen sequences. PREFAB and HomFam both rely on using a small subset of sequences of known structure and do not fairly test the quality of a full MSA. Results: In this paper we describe QuanTest, a fully automated and highly scalable test system for protein MSAs which is based on using secondary structure prediction accuracy (SSPA) to measure alignment quality. This is based on the assumption that better MSAs will give more accurate secondary structure predictions when we include sequences of known structure. SSPA, however, measures the quality of an entire alignment, not just the accuracy on a handful of selected sequences. It can be scaled to alignments of any size but here we demonstrate its use on alignments of either 200 or 1000 sequences. This allows the testing of slow accurate programs as well as faster, less accurate ones. We show that the scores from QuanTest are highly correlated with existing benchmark scores. We also validate the method by comparing a wide range of MSA alignment options and by introducing different levels of mis-alignment into MSAs, and examining the effects on the scores. Availability and Implementation: QuanTest is available from http://www.bioinf.ucd.ie/download/QuanTest.tgz. Contact: quan.le@ucd.ie. Supplementary information: Supplementary data are available at Bioinformatics online.
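As a rough illustration of the kind of score described above, the sketch below computes a per-residue secondary structure accuracy: the fraction of positions where a predicted state matches the known state. The abstract does not give QuanTest's exact scoring formula, so the three-state H/E/C encoding and the function name `q3_accuracy` are assumptions made for this example only.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state.

    Both arguments are strings over an assumed three-state alphabet
    (H = helix, E = strand, C = coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # One mismatch (position 6) out of eight residues.
    print(q3_accuracy("HHHEECCC", "HHHEEECC"))
```

In a benchmark of this style, the prediction would be made from the alignment under test for a sequence of known structure, so a better alignment is expected to yield a higher score.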


Subject(s)
Protein Structure, Secondary , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , Algorithms , Amino Acid Sequence , Benchmarking , Computational Biology/methods , Data Accuracy , Sequence Alignment/standards , Sequence Analysis, Protein/standards
11.
PLoS Genet ; 12(11): e1006404, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27806045

ABSTRACT

Mating between different species produces hybrids that are usually asexual and stuck as diploids, but can also lead to the formation of new species. Here, we report the genome sequences of 27 isolates of the pathogenic yeast Candida orthopsilosis. We find that most isolates are diploid hybrids, products of mating between two unknown parental species (A and B) that are 5% divergent in sequence. Isolates vary greatly in the extent of homogenization between A and B, making their genomes a mosaic of highly heterozygous regions interspersed with homozygous regions. Separate phylogenetic analyses of SNPs in the A- and B-derived portions of the genome produce almost identical trees of the isolates with four major clades. However, the presence of two mutually exclusive genotype combinations at the mating type locus, and recombinant mitochondrial genomes diagnostic of inter-clade mating, show that the species C. orthopsilosis does not have a single evolutionary origin but was created at least four times by separate interspecies hybridizations between parents A and B. Older hybrids have lost more heterozygosity. We also identify two isolates with homozygous genomes derived exclusively from parent A, which are pure non-hybrid strains. The parallel emergence of the same hybrid species from multiple independent hybridization events is common in plant evolution, but is much less documented in pathogenic fungi.


Subject(s)
Candida/genetics , Genetic Speciation , Hybridization, Genetic , Phylogeny , Animals , Candida/growth & development , Diploidy , Genome, Fungal , Haplotypes , Heterozygote , Larva/genetics , Mitochondria/genetics , Polymorphism, Single Nucleotide , Saccharomyces cerevisiae/genetics
12.
PLoS One ; 11(9): e0163235, 2016.
Article in English | MEDLINE | ID: mdl-27658249

ABSTRACT

The Candida CTG clade is a monophyletic group of fungal species that translates CTG as serine, and includes the pathogens Candida albicans and Candida parapsilosis. Research has typically focused on identifying protein-coding genes in these species. Here, we use bioinformatic and experimental approaches to annotate known classes of non-coding RNAs in three CTG-clade species, Candida parapsilosis, Candida orthopsilosis and Lodderomyces elongisporus. We also update the annotation of ncRNAs in the C. albicans genome. The majority of ncRNAs identified were snoRNAs. Approximately 50% of snoRNAs (including most of the C/D box class) are encoded in introns. Most are within mono- and polycistronic transcripts with no protein coding potential. Five polycistronic clusters of snoRNAs are highly conserved in fungi. In polycistronic regions, splicing occurs via the classical pathway, as well as by nested and recursive splicing. We identified spliceosomal small nuclear RNAs, the telomerase RNA component, signal recognition particle, RNase P RNA component and the related RNase MRP RNA component in all three genomes. Stem loop IV of the U2 spliceosomal RNA and the associated binding proteins were lost from the ancestor of C. parapsilosis and C. orthopsilosis, following the divergence from L. elongisporus. The RNA component of the MRP is longer in C. parapsilosis, C. orthopsilosis and L. elongisporus than in S. cerevisiae, but is substantially shorter than in C. albicans.

13.
Oncotarget ; 7(37): 60310-60331, 2016 Sep 13.
Article in English | MEDLINE | ID: mdl-27531891

ABSTRACT

Wnt signalling is involved in the formation, metastasis and relapse of a wide array of cancers. However, there is ongoing debate as to whether activation or inhibition of the pathway holds the most promise as a therapeutic treatment for cancer, with conflicting evidence from a variety of tumour types. We show that Wnt/β-catenin signalling is a bi-directional vulnerability of neuroblastoma, malignant melanoma and colorectal cancer, with hyper-activation or repression of the pathway both representing a promising therapeutic strategy, even within the same cancer type. Hyper-activation directs cancer cells to undergo apoptosis, even in cells oncogenically driven by β-catenin. Wnt inhibition blocks proliferation of cancer cells and promotes neuroblastoma differentiation. Wnt and retinoic acid co-treatments synergise, representing a promising combination treatment for MYCN-amplified neuroblastoma. Additionally, we report novel cross-talks between MYCN and β-catenin signalling, which repress normal β-catenin mediated transcriptional regulation. A β-catenin target gene signature could predict patient outcome, as could the expression level of its DNA binding partners, the TCF/LEFs. This β-catenin signature provides a tool to identify neuroblastoma patients likely to benefit from Wnt-directed therapy. Taken together, we show that Wnt/β-catenin signalling is a bi-directional vulnerability of a number of cancer entities, and potentially a more broadly conserved feature of malignant cells.


Subject(s)
Gene Expression Regulation, Neoplastic , N-Myc Proto-Oncogene Protein/genetics , Neuroblastoma/genetics , Wnt Signaling Pathway/genetics , beta Catenin/genetics , Antineoplastic Agents/pharmacology , Bridged Bicyclo Compounds, Heterocyclic/pharmacology , Cell Differentiation/drug effects , Cell Differentiation/genetics , Cell Line, Tumor , Cell Proliferation/drug effects , Cell Proliferation/genetics , Gene Expression Profiling/methods , Humans , N-Myc Proto-Oncogene Protein/metabolism , Neuroblastoma/metabolism , Neuroblastoma/pathology , Proteomics/methods , Pyrimidinones/pharmacology , RNA Interference , Survival Analysis , Tretinoin/pharmacology , Wnt Proteins/antagonists & inhibitors , Wnt Proteins/genetics , Wnt Proteins/metabolism , beta Catenin/metabolism
14.
Biochem Biophys Res Commun ; 474(3): 579-586, 2016 06 03.
Article in English | MEDLINE | ID: mdl-27130823

ABSTRACT

Hepatocyte death is an important contributing factor in a number of diseases of the liver. PHD1 confers hypoxic sensitivity upon transcription factors including the hypoxia inducible factor (HIF) and nuclear factor-kappaB (NF-κB). Reduced PHD1 activity is linked to decreased apoptosis. Here, we investigated the underlying mechanism(s) in hepatocytes. Basal NF-κB activity was elevated in PHD1(-/-) hepatocytes compared to wild type controls. ChIP-seq analysis confirmed enhanced binding of NF-κB to chromatin in regions proximal to the promoters of genes involved in the regulation of apoptosis. Inhibition of NF-κB (but not knock-out of HIF-1 or HIF-2) reversed the anti-apoptotic effects of pharmacologic hydroxylase inhibition. We hypothesize that PHD1 inhibition leads to altered expression of NF-κB-dependent genes resulting in reduced apoptosis. This study provides new information relating to the possible mechanism of therapeutic action of hydroxylase inhibitors that has been reported in pre-clinical models of intestinal and hepatic disease.


Subject(s)
Apoptosis/physiology , Hepatocytes/cytology , Hepatocytes/physiology , Hypoxia-Inducible Factor-Proline Dioxygenases/metabolism , NF-kappa B/metabolism , Procollagen-Proline Dioxygenase/metabolism , Animals , Cell Hypoxia/physiology , Cell Line , Gene Expression Regulation, Enzymologic/physiology , HEK293 Cells , Humans , Mice
15.
Nucleic Acids Res ; 44(W1): W11-5, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27085803

ABSTRACT

Low-throughput experiments and high-throughput proteomic and genomic analyses have created enormous quantities of data that can be used to explore protein function and evolution. The ability to consolidate these data into an informative and intuitive format is vital to our capacity to comprehend these distinct but complementary sources of information. However, existing tools to visualize protein-related data are restricted by their presentation, sources of information, functionality or accessibility. We introduce ProViz, a powerful browser-based tool to aid biologists in building hypotheses and designing experiments by simplifying the analysis of functional and evolutionary features of proteins. Feature information is retrieved in an automated manner from resources describing protein modular architecture, post-translational modification, structure, sequence variation and experimental characterization of functional regions. These features are mapped to evolutionary information from precomputed multiple sequence alignments. Data are displayed in an interactive and information-rich yet intuitive visualization, accessible through a simple protein search interface. This allows users with limited bioinformatic skills to rapidly access data pertinent to their research. Visualizations can be further customized with user-defined data either manually or using a REST API. ProViz is available at http://proviz.ucd.ie/.


Subject(s)
Computational Biology/statistics & numerical data , Datasets as Topic/statistics & numerical data , User-Computer Interface , Amino Acid Sequence , Biomedical Research , Computational Biology/methods , Computer Graphics , Databases, Protein , Evolution, Molecular , Humans , Internet , Sequence Alignment
16.
Bioinformatics ; 32(6): 814-20, 2016 03 15.
Article in English | MEDLINE | ID: mdl-26568625

ABSTRACT

MOTIVATION: Multiple sequence alignments (MSAs) with large numbers of sequences are now commonplace. However, current multiple alignment benchmarks are ill-suited for testing these types of alignments, as test cases either contain a very small number of sequences or are based purely on simulation rather than empirical data. RESULTS: We take advantage of recent developments in protein structure prediction methods to create a benchmark (ContTest) for protein MSAs containing many thousands of sequences in each test case and which is based on empirical biological data. We rank popular MSA methods using this benchmark and verify a recent result showing that chained guide trees increase the accuracy of progressive alignment packages on datasets with thousands of proteins. AVAILABILITY AND IMPLEMENTATION: Benchmark data and scripts are available for download at http://www.bioinf.ucd.ie/download/ContTest.tar.gz. CONTACT: des.higgins@ucd.ie. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Sequence Alignment , Algorithms , Proteins , Software
17.
Oncotarget ; 6(41): 43182-201, 2015 Dec 22.
Article in English | MEDLINE | ID: mdl-26673823

ABSTRACT

Despite intensive study, many mysteries remain about the MYCN oncogene's functions. Here we focus on MYCN's role in neuroblastoma, the most common extracranial childhood cancer. MYCN gene amplification occurs in 20% of cases, but other recurrent somatic mutations are rare. This scarcity of tractable targets has hampered efforts to develop new therapeutic options. We employed a multi-level omics approach to examine MYCN functioning and identify novel therapeutic targets for this largely un-druggable oncogene. We used systems medicine based computational network reconstruction and analysis to integrate a range of omic techniques: sequencing-based transcriptomics, genome-wide chromatin immunoprecipitation, siRNA screening and interaction proteomics, revealing that MYCN controls highly connected networks, with MYCN primarily suppressing the activity of network components. MYCN's oncogenic functions are likely independent of its classical heterodimerisation partner, MAX. In particular, MYCN controls its own protein interaction network by transcriptionally regulating its binding partners. Our network-based approach identified vulnerable therapeutically targetable nodes that function as critical regulators or effectors of MYCN in neuroblastoma. These were validated by siRNA knockdown screens, functional studies and patient data. We identified β-estradiol and MAPK/ERK as having functional cross-talk with MYCN and being novel targetable vulnerabilities of MYCN-amplified neuroblastoma. These results reveal surprising differences between the functioning of endogenous, overexpressed and amplified MYCN, and rationalise how different MYCN dosages can orchestrate cell fate decisions and cancerous outcomes. Importantly, this work describes a systems-level approach to systematically uncovering network based vulnerabilities and therapeutic targets for multifactorial diseases by integrating disparate omic data types.


Subject(s)
Genes, myc/physiology , Neuroblastoma/genetics , Nuclear Proteins/physiology , Oncogene Proteins/physiology , Protein Interaction Maps/physiology , Blotting, Western , Chromatin Immunoprecipitation , Computational Biology/methods , Gene Expression Regulation, Neoplastic/physiology , High-Throughput Nucleotide Sequencing/methods , Humans , N-Myc Proto-Oncogene Protein , Neuroblastoma/metabolism , Neuroblastoma/pathology , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Proteomics/methods , Signal Transduction/physiology
18.
Algorithms Mol Biol ; 10: 26, 2015.
Article in English | MEDLINE | ID: mdl-26457114

ABSTRACT

BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated. Although this effect is more obvious with larger numbers of sequences, it can also be seen with data sets in the order of one hundred sequences. We also outline the means to determine the number of sequences in a data set beyond which the probability of instability will become more pronounced. CONCLUSIONS: This has major ramifications for both the designers of large-scale multiple sequence alignment algorithms, and for the users of these alignments.
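The stability check described above (align, reverse the input order, align again, compare) needs a comparison that ignores the order in which sequences appear. The sketch below is one possible way to do that, assuming each alignment is already available as a mapping from sequence name to gapped string; the function names are hypothetical, and the abstract does not say how the authors compared their alignments.

```python
from itertools import combinations


def residue_columns(gapped):
    """Return the alignment column of each residue, skipping '-' gaps."""
    return [col for col, ch in enumerate(gapped) if ch != "-"]


def aligned_pairs(alignment):
    """Set of (name_a, residue_a, name_b, residue_b) sharing a column.

    `alignment` maps sequence name -> gapped string. Names are sorted
    before pairing, so the result does not depend on input order.
    """
    columns = {name: residue_columns(seq) for name, seq in alignment.items()}
    pairs = set()
    for a, b in combinations(sorted(alignment), 2):
        col_to_residue_b = {col: j for j, col in enumerate(columns[b])}
        for i, col in enumerate(columns[a]):
            j = col_to_residue_b.get(col)
            if j is not None:
                pairs.add((a, i, b, j))
    return pairs


def same_alignment(x, y):
    """True if both alignments pair up exactly the same residues."""
    return aligned_pairs(x) == aligned_pairs(y)


if __name__ == "__main__":
    forward = {"s1": "AC-GT", "s2": "ACCGT", "s3": "A--GT"}
    reordered = {"s3": "A--GT", "s2": "ACCGT", "s1": "AC-GT"}
    shifted = {"s1": "A-CGT", "s2": "ACCGT", "s3": "A--GT"}
    print(same_alignment(forward, reordered))  # same residue pairs
    print(same_alignment(forward, shifted))    # gap moved in s1
```

In practice the two dictionaries would come from running the same alignment program on the original and on the reversed input file, which is outside the scope of this sketch.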

19.
J Mol Biol ; 427(21): 3368-74, 2015 Oct 23.
Article in English | MEDLINE | ID: mdl-26362006

ABSTRACT

Identifying changes in the transcriptional regulation of target genes from high-throughput studies is important for unravelling molecular mechanisms controlled by a given perturbation. When measuring global transcript levels only, the effect of the perturbation [e.g., transcription factor (TF) overexpression or drug treatment] on its target genes is often obscured by delayed feedback and secondary effects until the changes are fully propagated. As a proof of principle, we show that selective measuring of transcripts that are only synthesised after a perturbation [4-thiouridine (4sU) sequencing (4sU-seq)] is a more sensitive method to identify targets and time-dependent transcriptional responses than global transcript profiling. By metabolically labelling RNA in a time-course setup, we could vastly increase the sensitivity of MYCN target gene detection compared to traditional RNA sequencing. The validity of targets identified by 4sU-seq was demonstrated using chromatin immunoprecipitation sequencing and neuroblastoma microarray tumour data. Here, we describe the methodology, both molecular biology and computational aspects, required to successfully apply this 4sU-seq approach.


Subject(s)
Gene Expression Profiling/methods , Neuroblastoma/genetics , Nuclear Proteins/genetics , Oncogene Proteins/genetics , Thiouridine/metabolism , Transcription Factors/metabolism , Base Sequence , Binding Sites , Cell Line, Tumor , Gene Expression Regulation , Humans , N-Myc Proto-Oncogene Protein , Neuroblastoma/metabolism , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods , Systems Biology , Thiouridine/analysis
20.
BMC Bioinformatics ; 16: 269, 2015 Aug 25.
Article in English | MEDLINE | ID: mdl-26303676

ABSTRACT

BACKGROUND: Multiple sequence alignments (MSA) are widely used in sequence analysis for a variety of tasks. Outlier sequences can make downstream analyses unreliable or make the alignments less accurate while they are being constructed. This paper describes a simple method for automatically detecting outliers and accompanying software called OD-seq. It is based on finding sequences whose average distance to the rest of the sequences in a dataset is anomalous. RESULTS: The software can take an MSA, distance matrix or set of unaligned sequences as input. Outlier sequences are found by examining the average distance of each sequence to the rest. Anomalous average distances are then found using the interquartile range of the distribution of average distances or by bootstrapping them. The complexity of any analysis of a distance matrix is normally at least O(N²) for N sequences. This is prohibitive for large N but is reduced here by using the mBed algorithm from Clustal Omega. This reduces the complexity to O(N log(N)), which makes even very large alignments easy to analyse on a single core. We tested the ability of OD-seq to detect outliers using artificial test cases of sequences from Pfam families, seeded with sequences from other Pfam families. Using an MSA as input, OD-seq is able to detect outliers with very high sensitivity and specificity. CONCLUSION: OD-seq is a practical and simple method to detect outliers in MSAs. It can also detect outliers in sets of unaligned sequences, but with reduced accuracy. For medium-sized alignments, of a few thousand sequences, it can detect outliers in a few seconds. Software is available at http://www.bioinf.ucd.ie/download/od-seq.tar.gz.


Subject(s)
ATP-Binding Cassette Transporters , Algorithms , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Humans , Molecular Sequence Data , Sequence Homology, Amino Acid
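
The interquartile-range step described in the OD-seq abstract can be sketched as follows. This is not the OD-seq code: it works on a full O(N²) distance matrix rather than using mBed, it omits the bootstrapping option, and the cutoff multiplier `k=1.5` and the function names `mean_distances` and `iqr_outliers` are assumptions made for illustration.

```python
from statistics import quantiles


def mean_distances(dist):
    """Average distance of each sequence to all others (diagonal excluded)."""
    n = len(dist)
    return [
        sum(row[j] for j in range(n) if j != i) / (n - 1)
        for i, row in enumerate(dist)
    ]


def iqr_outliers(dist, k=1.5):
    """Indices whose average distance exceeds Q3 + k * IQR.

    Assumption: k=1.5 is the conventional box-plot multiplier, not a
    value taken from the abstract.
    """
    means = mean_distances(dist)
    q1, _, q3 = quantiles(means, n=4)
    cutoff = q3 + k * (q3 - q1)
    return [i for i, m in enumerate(means) if m > cutoff]


# Hypothetical 6x6 matrix: sequences 0-4 are mutually close (0.1),
# sequence 5 is far from all of them (0.9).
n = 6
dist = [[0.0 if i == j else (0.9 if 5 in (i, j) else 0.1) for j in range(n)]
        for i in range(n)]
flagged = iqr_outliers(dist)
print(flagged)
```

Only the upper tail is tested here, because an outlier in this setting is a sequence that is unusually far from the rest rather than unusually close.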