1.
Cancers (Basel) ; 16(4)2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38398114

ABSTRACT

Neuroblastoma is the most common extracranial solid tumour in children, accounting for close to 10% of childhood cancer-related deaths. We have demonstrated that activation of NTRK1 by TP53 repression of PTPN6 expression is significantly associated with favourable survival in neuroblastoma. The molecular mechanisms by which this activation elicits molecular changes in cells need to be determined. This is critical to identify dependable biomarkers for the early detection and prognosis of tumours, and for the development of personalised treatment. In this investigation we have identified and validated a gene signature for the prognosis of neuroblastoma using genes differentially expressed upon activation of the NTRK1-PTPN6-TP53 module. A random survival forest model was used to construct a gene signature, which was then assessed across validation datasets using Kaplan-Meier analysis and ROC curves. The analysis demonstrated that high BASP1, CD9, DLG2, FNBP1, FRMD3, IL11RA, IGSF10, IQCE, KCNQ3, and TOX2, and low BSG/CD147, CCDC125, GABRB3, GNB2L1/RACK1, HAPLN4, HEBP2, and HSD17B12 expression was significantly associated with favourable patient event-free survival (EFS). The gene signature was associated with favourable tumour histology and NTRK1-PTPN6-TP53 module activation. Importantly, all genes were significantly associated with favourable EFS in an independent manner. Six of the signature genes, BSG/CD147, GNB2L1/RACK1, TXNDC5, FNBP1, B3GAT1, and IGSF10, play a role in cell differentiation. Our findings strongly suggest that the identified gene signature is a potential prognostic biomarker and therapeutic target for neuroblastoma patients and that it is associated with neuroblastoma cell differentiation through the activation of the NTRK1-PTPN6-TP53 module.
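
The abstract assesses the signature with Kaplan-Meier analysis. As a rough illustration of what that estimator computes, the sketch below builds a survival curve from follow-up times and event flags. The function name `kaplan_meier` and the toy data are assumptions made for this example; they are not taken from the study, which does not describe its software.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. relapse) was observed, 0 if censored
    """
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

In a signature analysis of this kind, one curve would typically be estimated for each expression group (for example, high versus low expression of a gene) and the two compared with a log-rank test.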

2.
Biochem Biophys Rep ; 27: 101081, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34307909

ABSTRACT

SARS-CoV-2 viral contagion has given rise to a worldwide pandemic. Although most children experience minor symptoms from SARS-CoV-2 infection, some have severe complications including Multisystem Inflammatory Syndrome in Children. Neuroblastoma patients may be at higher risk of severe infection as treatment requires immunocompromising chemotherapy and SARS-CoV-2 has demonstrated tropism for nervous cells. To date, there are insufficient epidemiological data on neuroblastoma patients with SARS-CoV-2. Therefore, we evaluated datasets of non-SARS-CoV-2 infected neuroblastoma patients to assess for key genes involved with SARS-CoV-2 infection as possible neuroblastoma prognostic and infection biomarkers. We hypothesized that ACE2, CD147, PPIA and PPIB, which are associated with viral-cell entry, are potential biomarkers for poor prognosis neuroblastoma and SARS-CoV-2 infection. We have analysed three publicly available neuroblastoma gene expression datasets to understand the specific molecular susceptibilities that high-risk neuroblastoma patients have to the virus. Gene Expression Omnibus (GEO) GSE49711 and GEO GSE62564 are the microarray and RNA-Seq data, respectively, from 498 neuroblastoma samples published as part of the Sequencing Quality Control initiative. TARGET contains microarray data from 249 samples and is part of the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative. ACE2, CD147, PPIA and PPIB were identified through their involvement in both SARS-CoV-2 infection and cancer pathogenesis. In-depth statistical analysis using Kaplan-Meier, differential gene expression, and Cox multivariate regression analysis demonstrated that overexpression of ACE2, CD147, PPIA and PPIB is significantly associated with poor-prognosis neuroblastoma samples. These results were seen in the presence of amplified MYCN, unfavourable tumour histology and in patients older than 18 months of age.
Previously, we have shown that high levels of the nerve growth factor receptor NTRK1 together with low levels of the phosphatase PTPN6 and TP53 are associated with increased relapse-free survival of neuroblastoma patients. Interestingly, low levels of expression of ACE2, CD147, PPIA and PPIB are associated with this NTRK1-PTPN6-TP53 module, suggesting that low expression levels of these genes are associated with good prognosis. These findings have implications for clinical care and therapeutic treatment. The upregulation of ACE2, CD147, PPIA and PPIB in poor-prognosis neuroblastoma samples suggests that these patients may be at higher risk of severe SARS-CoV-2 infection. Importantly, our findings reveal ACE2, CD147, PPIA and PPIB as potential biomarkers and therapeutic targets for neuroblastoma.

3.
Front Mol Biosci ; 7: 565530, 2020.
Article in English | MEDLINE | ID: mdl-33102519

ABSTRACT

Cardiovascular disease accounts for millions of deaths each year and is currently the leading cause of mortality worldwide. The aging process is clearly linked to cardiovascular disease, however, the exact relationship between aging and heart function is not fully understood. Furthermore, a holistic view of cardiac aging, linking features of early life development to changes observed in old age, has not been synthesized. Here, we re-purpose RNA-sequencing data previously-collected by our group, investigating gene expression differences between wild-type mice of different age groups that represent key developmental milestones in the murine lifespan. DESeq2's generalized linear model was applied with two hypothesis testing approaches to identify differentially-expressed (DE) genes, both between pairs of age groups and across mice of all ages. Pairwise comparisons identified genes associated with specific age transitions, while comparisons across all age groups identified a large set of genes associated with the aging process more broadly. An unsupervised machine learning approach was then applied to extract common expression patterns from this set of age-associated genes. Sets of genes with both linear and non-linear expression trajectories were identified, suggesting that aging not only involves the activation of gene expression programs unique to different age groups, but also the re-activation of gene expression programs from earlier ages. Overall, we present a comprehensive transcriptomic analysis of cardiac gene expression patterns across the entirety of the murine lifespan.

4.
Front Cell Dev Biol ; 7: 201, 2019.
Article in English | MEDLINE | ID: mdl-31612134

ABSTRACT

Homology between mitochondrial DNA (mtDNA) and nuclear DNA of mitochondrial origin (nuMTs) causes confounding when aligning short sequence reads to the reference human genome, as the true sequence origin cannot be determined. Using a systematic in silico approach, we here report the impact of all potential mitochondrial variants on alignment accuracy and variant calling. A total of 49,707 possible mutations were introduced across the 16,569 bp reference mitochondrial genome (16,569 × 3 alternative alleles), one variant at a time. The resulting in silico fragmentation and alignment to the entire reference genome (GRCh38) revealed preferential mapping of mutated mitochondrial fragments to nuclear loci, as variants increased loci similarity to nuMTs, for a total of 807, 362, and 41 variants at 333, 144, and 27 positions when using 100, 150, and 300 bp single-end fragments. We subsequently modeled these affected variants at 50% heteroplasmy and carried out variant calling, observing bias in the reported allele frequencies in favor of the reference allele. Four variants (chrM:6023A, chrM:4456T, chrM:5147A, and chrM:7521A) including a possible hypertension factor, chrM:4456T, caused 100% loss of coverage at the mutated position (with all 100 bp single-end fragments aligning to homologous, nuclear positions instead of chrM), rendering these variants undetectable when aligning to the entire reference genome. Furthermore, four mitochondrial variants reported to be pathogenic were found to cause significant loss of coverage and select haplogroup-defining SNPs were shown to exacerbate the loss of coverage caused by surrounding variants. Increased fragment length and use of paired-end reads both improved alignment accuracy.
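
The variant count in this abstract follows from substituting each of the three alternative bases at every reference position (16,569 × 3 = 49,707). The sketch below illustrates that enumeration and the extraction of fixed-length fragments overlapping a mutated position. It is a simplified illustration, not the authors' pipeline: the function names are hypothetical, the input is assumed to contain only A, C, G and T, and the sequence is treated as linear although the mitochondrial genome is circular.

```python
def single_nucleotide_variants(reference):
    """Yield (position, ref_base, alt_base, mutated_sequence), one substitution at a time.

    Positions are 1-based. Assumes the reference contains only A, C, G and T.
    """
    for i, ref_base in enumerate(reference):
        for alt in "ACGT":
            if alt != ref_base:
                yield i + 1, ref_base, alt, reference[:i] + alt + reference[i + 1:]


def fragments_covering(sequence, position, length):
    """Return every fragment of the given length that overlaps a 1-based position.

    The sequence is treated as linear, so fragments never wrap around the ends.
    """
    idx = position - 1
    first = max(0, idx - length + 1)
    last = min(idx, len(sequence) - length)
    return [sequence[s:s + length] for s in range(first, last + 1)]


# Hypothetical toy reference, far shorter than the 16,569 bp mitochondrial genome.
toy_reference = "ACGTAC"
toy_variants = list(single_nucleotide_variants(toy_reference))
```

Each list of fragments would then be passed to a read aligner against the full reference genome to see whether the fragments still map to chrM; that alignment step is outside the scope of this sketch.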

5.
mSystems ; 1(3)2016.
Article in English | MEDLINE | ID: mdl-27822537

ABSTRACT

Greater understanding of the functions of host gene products in response to infection is required. While many of these genes enable pathogen clearance, some enhance pathogen growth or contribute to disease symptoms. Many studies have profiled transcriptomic and proteomic responses to infection, generating large data sets, but selecting targets for further study is challenging. Here we propose a novel data-mining approach combining multiple heterogeneous data sets to prioritize genes for further study by using respiratory syncytial virus (RSV) infection as a model pathogen with a significant health care impact. The assumption was that the more frequently a gene is detected across multiple studies, the more important its role is. A literature search was performed to find data sets of genes and proteins that change after RSV infection. The data sets were standardized, collated into a single database, and then panned to determine which genes occurred in multiple data sets, generating a candidate gene list. This candidate gene list was validated by using both a clinical cohort and in vitro screening. We identified several genes that were frequently expressed following RSV infection with no assigned function in RSV control, including IFI27, IFIT3, IFI44L, GBP1, OAS3, IFI44, and IRF7. Drilling down into the function of these genes, we demonstrate a role in disease for the gene for interferon regulatory factor 7, which was highly ranked on the list, but not for IRF1, which was not. Thus, we have developed and validated an approach for collating published data sets into a manageable list of candidates, identifying novel targets for future analysis. IMPORTANCE Making the most of "big data" is one of the core challenges of current biology. There is a large array of heterogeneous data sets of host gene responses to infection, but these data sets do not inform us about gene function and require specialized skill sets and training for their utilization. 
Here we describe an approach that combines and simplifies these data sets, distilling this information into a single list of genes commonly upregulated in response to infection with RSV as a model pathogen. Many of the genes on the list have unknown functions in RSV disease. We validated the gene list with new clinical, in vitro, and in vivo data. This approach allows the rapid selection of genes of interest for further, more-detailed studies, thus reducing time and costs. Furthermore, the approach is simple to use and widely applicable to a range of diseases.
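
The prioritisation idea described above, ranking genes by how many independent data sets report them, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rank_genes`, the study labels and the gene lists are invented for the example, and upper-casing stands in for the much more involved identifier standardisation a real collation would need.

```python
from collections import Counter


def rank_genes(datasets):
    """Rank genes by the number of data sets in which they appear.

    datasets -- mapping of data set name to an iterable of gene symbols
    Returns a list of (gene, count), most frequently reported first,
    with ties broken alphabetically.
    """
    counts = Counter()
    for genes in datasets.values():
        # A set ensures each data set contributes at most one vote per gene.
        counts.update({gene.upper() for gene in genes})
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical gene lists; not the data sets collated in the study.
toy_datasets = {
    "study_a": ["IRF7", "IFI27", "irf7"],
    "study_b": ["IRF7", "OAS3"],
    "study_c": ["IFI27", "IRF7"],
}
toy_ranking = rank_genes(toy_datasets)
```

Counting data sets rather than raw mentions keeps one large study from dominating the ranking, which matches the stated assumption that detection across multiple studies signals importance.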

6.
F1000Res ; 3: 199, 2014.
Article in English | MEDLINE | ID: mdl-25485096

ABSTRACT

Previously, we have described the development of the generic mobile phone data gathering tool, EpiCollect, and an associated web application, providing two-way communication between multiple data gatherers and a project database. This software only allows data collection on the phone using a single questionnaire form that is tailored to the needs of the user (including a single GPS point and photo per entry), whereas many applications require a more complex structure, allowing users to link a series of forms in a linear or branching hierarchy, along with the addition of any number of media types accessible from smartphones and/or tablet devices (e.g., GPS, photos, videos, sound clips and barcode scanning). A much enhanced version of EpiCollect has been developed (EpiCollect+). The individual data collection forms in EpiCollect+ provide more design complexity than the single form used in EpiCollect, and the software allows the generation of complex data collection projects through the ability to link many forms together in a linear (or branching) hierarchy. Furthermore, EpiCollect+ allows the collection of multiple media types as well as standard text fields, increased data validation and form logic. The entire process of setting up a complex mobile phone data collection project to the specification of a user (project and form definitions) can be undertaken at the EpiCollect+ website using a simple 'drag and drop' procedure, with visualisation of the data gathered using Google Maps and charts at the project website. EpiCollect+ is suitable for situations where multiple users transmit complex data by mobile phone (or other Android devices) to a single project web database and is already being used for a range of field projects, particularly public health projects in sub-Saharan Africa. However, many uses can be envisaged from education, ecology and epidemiology to citizen science.

7.
J Virol ; 88(21): 12907-9, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25142605

ABSTRACT

Following the recent availability of high-coverage genomes for Denisovan and Neanderthal hominids, we conducted a screen for endogenized retroviruses, identifying six novel, previously unreported HERV-K(HML2) elements (HERV-K is human endogenous retrovirus K). These elements are absent from the human genome (hg38) and appear to be unique to archaic hominids. These findings provide further evidence supporting the recent activity of the HERV-K(HML2) group, which has been implicated in human disease. They will also provide insights into the evolution of archaic hominids.


Subject(s)
Endogenous Retroviruses/genetics , Endogenous Retroviruses/isolation & purification , Fossils/virology , Hominidae/virology , Animals , Endogenous Retroviruses/classification , Female , Genome
8.
Dev Cell ; 23(2): 265-79, 2012 Aug 14.
Article in English | MEDLINE | ID: mdl-22841499

ABSTRACT

X chromosome inactivation involves multiple levels of chromatin modification, established progressively and in a stepwise manner during early development. The chromosomal protein Smchd1 was recently shown to play an important role in DNA methylation of CpG islands (CGIs), a late step in the X inactivation pathway that is required for long-term maintenance of gene silencing. Here we show that inactive X chromosome (Xi) CGI methylation can occur via either Smchd1-dependent or -independent pathways. Smchd1-dependent CGI methylation, the primary pathway, is acquired gradually over an extended period, whereas Smchd1-independent CGI methylation occurs rapidly after the onset of X inactivation. The de novo methyltransferase Dnmt3b is required for methylation of both classes of CGI, whereas Dnmt3a and Dnmt3L are dispensable. Xi CGIs methylated by these distinct pathways differ with respect to their sequence characteristics and immediate chromosomal environment. We discuss the implications of these results for understanding CGI methylation during development.


Subject(s)
Chromosomal Proteins, Non-Histone/metabolism , CpG Islands , DNA Methylation , X Chromosome Inactivation , Alleles , Animals , Cell Line , Chromosomal Proteins, Non-Histone/genetics , Mice , Protein Isoforms/genetics , Protein Isoforms/metabolism
9.
PLoS One ; 6(9): e25023, 2011.
Article in English | MEDLINE | ID: mdl-21966401

ABSTRACT

BACKGROUND: With the ever-increasing information emerging from the various sequencing and gene annotation projects, there is an urgent need to elucidate the cellular functions of the newly discovered genes. The genetically regulated cell suicide of apoptosis is especially suitable for such endeavours as it is governed by a vast number of factors. METHODOLOGY/PRINCIPAL FINDINGS: We have set up a high-throughput screen in 96-well microtiter plates for genes that induce apoptosis upon their individual transfection into human cells. Upon screening approximately 100,000 cDNA clones we determined 74 genes that initiate this cellular suicide programme. A thorough bioinformatics analysis of these genes revealed that 91% are novel apoptosis regulators. Careful sequence analysis and functional annotation showed that the apoptosis factors exhibit a distinct functional distribution that distinguishes the cell death process from other signalling pathways. While only a minority of classic signal transducers were determined, a substantial number of the genes fall into the transporter- and enzyme-category. The apoptosis factors are distributed throughout all cellular organelles and many signalling circuits, but one distinct signalling pathway connects at least some of the isolated genes. Comparisons with microarray data suggest that several genes are dysregulated in specific types of cancers and degenerative diseases. CONCLUSIONS/SIGNIFICANCE: Many unknown genes for cell death were revealed through our screen, supporting the enormous complexity of cell death regulation. Our results will serve as a repository for other researchers working with genomics data related to apoptosis or for those seeking to reveal novel signalling pathways for cell suicide.


Subject(s)
Apoptosis , Gene Expression Regulation, Neoplastic , Signal Transduction , Animals , Computational Biology/methods , DNA, Complementary/metabolism , Enzyme-Linked Immunosorbent Assay/methods , Gene Expression Profiling , Genomics , Green Fluorescent Proteins/metabolism , HEK293 Cells , Humans , Mice , Models, Genetic , Oligonucleotide Array Sequence Analysis , Robotics , Subcellular Fractions/metabolism
10.
BMC Genomics ; 11: 321, 2010 May 24.
Article in English | MEDLINE | ID: mdl-20497534

ABSTRACT

BACKGROUND: Invasive amoebiasis, caused by infection with the human parasite Entamoeba histolytica remains a major cause of morbidity and mortality in some less-developed countries. Genetically E. histolytica exhibits a number of unusual features including having approximately 20% of its genome comprised of repetitive elements. These include a number of families of SINEs - non-autonomous elements which can, however, move with the help of partner LINEs. In many eukaryotes SINE mobility has had a profound effect on gene expression; in this study we concentrated on one such element - EhSINE1, looking in particular for evidence of recent transposition. RESULTS: EhSINE1s were detected in the newly reassembled E. histolytica genome by searching with a Hidden Markov Model developed to encapsulate the key features of this element; 393 were detected. Examination of their sequences revealed that some had an internal structure showing one to four 26-27 nt repeats. Members of the different classes differ in a number of ways and in particular those with two internal repeats show the properties expected of fairly recently transposed SINEs - they are the most homogeneous in length and sequence, they have the longest (i.e. the least decayed) target site duplications and are the most likely to show evidence (in a cDNA library) of active transcription. Furthermore we were able to identify 15 EhSINE1s (6 pairs and one triplet) which appeared to be identical or very nearly so but inserted into different sites in the genome; these provide good evidence that if mobility has now ceased it has only done so very recently. CONCLUSIONS: Of the many families of repetitive elements present in the genome of E. histolytica we have examined in detail just one - EhSINE1. We have shown that there is evidence for waves of transposition at different points in the past and no evidence that mobility has entirely ceased. 
There are many aspects of the biology of this parasite which are not understood, in particular why it is pathogenic while the closely related species E. dispar is not, the great genetic diversity found amongst patient isolates and the fact, which may be related, that only a small proportion of those infected develop clinical invasive amoebiasis. Mobile genetic elements, with their ability to alter gene expression may well be important in unravelling these puzzles.


Subject(s)
Computational Biology , Entamoeba histolytica/genetics , Base Sequence , Gene Duplication , Genome, Protozoan/genetics , Mutagenesis, Insertional/genetics , Promoter Regions, Genetic/genetics , RNA, Messenger/genetics , Repetitive Sequences, Nucleic Acid/genetics , Transcription, Genetic
11.
Epigenetics Chromatin ; 3(1): 10, 2010 May 07.
Article in English | MEDLINE | ID: mdl-20459652

ABSTRACT

BACKGROUND: X chromosome inactivation, the mechanism used by mammals to equalise dosage of X-linked genes in XX females relative to XY males, is triggered by chromosome-wide localisation of a cis-acting non-coding RNA, Xist. The mechanism of Xist RNA spreading and Xist-dependent silencing is poorly understood. A large body of evidence indicates that silencing is more efficient on the X chromosome than on autosomes, leading to the idea that the X chromosome has acquired sequences that facilitate propagation of silencing. LINE-1 (L1) repeats are relatively enriched on the X chromosome and have been proposed as candidates for these sequences. To determine the requirements for efficient silencing we have analysed the relationship of chromosome features, including L1 repeats, and the extent of silencing in cell lines carrying inducible Xist transgenes located on one of three different autosomes. RESULTS: Our results show that the organisation of the chromosome into large gene-rich and L1-rich domains is a key determinant of silencing efficiency. Specifically genes located in large gene-rich domains with low L1 density are relatively resistant to Xist-mediated silencing whereas genes located in gene-poor domains with high L1 density are silenced more efficiently. These effects are observed shortly after induction of Xist RNA expression, suggesting that chromosomal domain organisation influences establishment rather than long-term maintenance of silencing. The X chromosome and some autosomes have only small gene-rich L1-depleted domains and we suggest that this could confer the capacity for relatively efficient chromosome-wide silencing. CONCLUSIONS: This study provides insight into the requirements for efficient Xist mediated silencing and specifically identifies organisation of the chromosome into gene-rich L1-depleted and gene-poor L1-dense domains as a major influence on the ability of Xist-mediated silencing to be propagated in a continuous manner in cis.

12.
PLoS One ; 4(9): e6968, 2009 Sep 16.
Article in English | MEDLINE | ID: mdl-19756138

ABSTRACT

BACKGROUND: Epidemiologists and ecologists often collect data in the field and, on returning to their laboratory, enter their data into a database for further analysis. The recent introduction of mobile phones that utilise the open source Android operating system, and which include (among other features) both GPS and Google Maps, provide new opportunities for developing mobile phone applications, which in conjunction with web applications, allow two-way communication between field workers and their project databases. METHODOLOGY: Here we describe a generic framework, consisting of mobile phone software, EpiCollect, and a web application located within www.spatialepidemiology.net. Data collected by multiple field workers can be submitted by phone, together with GPS data, to a common web database and can be displayed and analysed, along with previously collected data, using Google Maps (or Google Earth). Similarly, data from the web database can be requested and displayed on the mobile phone, again using Google Maps. Data filtering options allow the display of data submitted by the individual field workers or, for example, those data within certain values of a measured variable or a time period. CONCLUSIONS: Data collection frameworks utilising mobile phones with data submission to and from central databases are widely applicable and can give a field worker similar display and analysis tools on their mobile phone that they would have if viewing the data in their laboratory via the web. We demonstrate their utility for epidemiological data collection and display, and briefly discuss their application in ecological and community data collection. Furthermore, such frameworks offer great potential for recruiting 'citizen scientists' to contribute data easily to central databases through their mobile phone.


Subject(s)
Cell Phone , Computational Biology/instrumentation , Data Collection/instrumentation , Ecology/instrumentation , Epidemiology/instrumentation , Computational Biology/methods , Computers , Data Collection/methods , Database Management Systems , Geography , Humans , Information Dissemination , Internet , Programming Languages , Software
13.
PLoS Genet ; 5(4): e1000446, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19360092

ABSTRACT

Genomic mapping of DNA replication origins (ORIs) in mammals provides a powerful means for understanding the regulatory complexity of our genome. Here we combine a genome-wide approach to identify preferential sites of DNA replication initiation at 0.4% of the mouse genome with detailed molecular analysis at distinct classes of ORIs according to their location relative to the genes. Our study reveals that 85% of the replication initiation sites in mouse embryonic stem (ES) cells are associated with transcriptional units. Nearly half of the identified ORIs map at promoter regions and, interestingly, ORI density strongly correlates with promoter density, reflecting the coordinated organisation of replication and transcription in the mouse genome. Detailed analysis of ORI activity showed that CpG island promoter-ORIs are the most efficient ORIs in ES cells and both ORI specification and firing efficiency are maintained across cell types. Remarkably, the distribution of replication initiation sites at promoter-ORIs exactly parallels that of transcription start sites (TSS), suggesting a co-evolution of the regulatory regions driving replication and transcription. Moreover, we found that promoter-ORIs are significantly enriched in CAGE tags derived from early embryos relative to all promoters. This association implies that transcription initiation early in development sets the probability of ORI activation, unveiling a new hallmark in ORI efficiency regulation in mammalian cells.


Subject(s)
Mammals/genetics , Replication Origin , Transcription, Genetic , Animals , Cell Line , CpG Islands , Embryonic Stem Cells/cytology , Mice , Promoter Regions, Genetic
14.
BMC Bioinformatics ; 9: 501, 2008 Nov 27.
Article in English | MEDLINE | ID: mdl-19038045

ABSTRACT

BACKGROUND: There is accumulating evidence that the milieu of repeat elements and other non-genic sequence features at a given chromosomal locus, here defined as the genome environment, can play an important role in regulating chromosomal processes such as transcription, replication and recombination. The availability of whole-genome sequences has allowed us to annotate the genome environment of any locus in detail. The development of genome wide experimental analyses of gene expression, chromatin modification and chromatin proteins means that it is now possible to identify potential links between chromosomal processes and the underlying genome environment. There is a need for novel bioinformatic tools that facilitate these studies. RESULTS: We developed the Genome Environment Browser (GEB) in order to visualise the integration of experimental data from large scale high throughput analyses with repeat sequence features that define the local genome environment. The browser has incorporated dynamic scales adjustable in real-time, which enables scanning of large regions of the genome as well as detailed investigation of local regions on the same page without the need to load new pages. The interface also accommodates a 2-dimensional display of repetitive features which vary substantially in size, such as LINE-1 repeats. Specific queries for preliminary quantitative analysis of genome features can also be formulated, results of which can be exported for further analysis. CONCLUSION: The Genome Environment Browser is a versatile program which can be easily adapted for displaying all types of genome data with known genomic coordinates. It is currently available at http://web.bioinformatics.ic.ac.uk/geb/.


Subject(s)
Computational Biology/methods , Genomics/methods , Repetitive Sequences, Nucleic Acid/genetics , Software , User-Computer Interface
15.
Bioinformatics ; 22(4): 495-6, 2006 Feb 15.
Article in English | MEDLINE | ID: mdl-16357032

ABSTRACT

SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.
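
The abstract does not spell out SEAN's rules, so the following is only a generic sketch of how candidate SNPs can be read off the columns of a multiple sequence alignment using an allele-abundance threshold. The function name, the `min_minor_count` parameter and the toy alignment are assumptions for illustration and should not be taken as SEAN's actual algorithm.

```python
from collections import Counter


def candidate_snps(alignment, min_minor_count=2):
    """Report alignment columns where a second allele is seen often enough.

    alignment       -- list of equal-length strings; '-' marks a gap
    min_minor_count -- minimum number of sequences supporting the minor allele
    Returns a list of (column, major_allele, minor_allele, minor_count),
    with 1-based column numbers.
    """
    snps = []
    for col in range(len(alignment[0])):
        counts = Counter(row[col] for row in alignment if row[col] in "ACGT")
        if len(counts) < 2:
            continue  # column is invariant, or contains only gaps
        (major, _), (minor, minor_n) = counts.most_common(2)
        if minor_n >= min_minor_count:
            snps.append((col + 1, major, minor, minor_n))
    return snps


# Hypothetical EST cluster alignment.
toy_alignment = ["ACGTA", "ACGTA", "ACTTA", "ACTTA", "AC-TA", "ACGTT"]
toy_snps = candidate_snps(toy_alignment)
```

Requiring the minor allele in at least two sequences is a common way to separate likely polymorphisms from isolated sequencing errors, which are frequent in single-pass EST reads.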


Subject(s)
DNA Mutational Analysis/methods , Expressed Sequence Tags , Polymorphism, Single Nucleotide/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Algorithms , Chromosome Mapping/methods , Cluster Analysis , Computer Graphics , Pattern Recognition, Automated/methods
16.
BMC Evol Biol ; 5: 23, 2005 Mar 18.
Article in English | MEDLINE | ID: mdl-15777474

ABSTRACT

BACKGROUND: Protein interaction networks aim to summarize the complex interplay of proteins in an organism. Early studies suggested that the position of a protein in the network determines its evolutionary rate but there has been considerable disagreement as to what extent other factors, such as protein abundance, modify this reported dependence. RESULTS: We compare the genomes of Saccharomyces cerevisiae and Caenorhabditis elegans with those of closely related species to elucidate the recent evolutionary history of their respective protein interaction networks. Interaction and expression data are studied in the light of a detailed phylogenetic analysis. The underlying network structure is incorporated explicitly into the statistical analysis. The increased phylogenetic resolution, paired with high-quality interaction data, allows us to resolve the way in which protein interaction network structure and abundance of proteins affect the evolutionary rate. We find that expression levels are better predictors of the evolutionary rate than a protein's connectivity. Detailed analysis of the two organisms also shows that the evolutionary rates of interacting proteins are not sufficiently similar to be mutually predictive. CONCLUSION: It appears that meaningful inferences about the evolution of protein interaction networks require comparative analysis of reasonably closely related species. The signature of protein evolution is shaped by a protein's abundance in the organism and its function and the biological process it is involved in. Its position in the interaction networks and its connectivity may modulate this but they appear to have only minor influence on a protein's evolutionary rate.


Subject(s)
Caenorhabditis elegans/genetics , Evolution, Molecular , Protein Interaction Mapping/methods , Saccharomyces cerevisiae/genetics , Animals , Genome , Likelihood Functions , Models, Statistical , Phylogeny , Species Specificity
17.
Genome Res ; 13(9): 2195-202, 2003 Sep.
Article in English | MEDLINE | ID: mdl-12952886

ABSTRACT

GANESH is a software package designed to support the genetic analysis of regions of human and other genomes. It provides a set of components that may be assembled to construct a self-updating database of DNA sequence, mapping data, and annotations of possible genome features. Once one or more remote sources of data for the target region have been identified, all sequences for that region are downloaded, assimilated, and subjected to a (configurable) set of standard database-searching and genome-analysis packages. The results are stored in compressed form in a relational database, and are updated automatically on a regular schedule so that they are always immediately available in their most up-to-date versions. A Java front-end, executed as a stand alone application or web applet, provides a graphical interface for navigating the database and for viewing the annotations. There are facilities for importing and exporting data in the format of the Distributed Annotation System (DAS), enabling a GANESH database to be used as a component of a DAS configuration. The system has been used to construct databases for about a dozen regions of human chromosomes and for three regions of mouse chromosomes.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genome, Human , Genome , Software , Animals , Base Sequence , Calcium-Binding Proteins/classification , Calcium-Binding Proteins/genetics , Computational Biology/trends , Database Management Systems , Databases, Genetic/classification , Databases, Genetic/standards , Databases, Genetic/statistics & numerical data , Eye Proteins , Homeodomain Proteins/classification , Homeodomain Proteins/genetics , Humans , Molecular Sequence Data , PAX6 Transcription Factor , Paired Box Transcription Factors , Proteome/classification , Proteome/genetics , Repressor Proteins , Takifugu/genetics , WT1 Proteins/classification , WT1 Proteins/genetics
18.
Gene ; 283(1-2): 71-82, 2002 Jan 23.
Article in English | MEDLINE | ID: mdl-11867214

ABSTRACT

A variety of loci with interesting patterns of regulation such as imprinted expression, and critical functions such as involvement in tumour necrosis factor pathways, map to a distal portion of mouse chromosome 12. This region also contains disease related loci including the 'Legs at odd angles' mutation (Loa) that we are pursuing in a positional cloning project. To further define the region and prepare for comparative sequencing projects, we have produced genetic, radiation hybrid, physical and transcript maps of the region, with probes providing anchors between the maps. We show a summary of 95 markers and 91 genomic clones that has enabled us to identify 18 transcripts including new genes and candidates for Loa which will help in future studies of gene context and regulation.


Subject(s)
Chromosome Mapping , Chromosomes/genetics , Genomic Imprinting , Animals , Chromosomes, Human, Pair 14/genetics , Contig Mapping , Gene Order , Humans , Mice , Physical Chromosome Mapping , Radiation Hybrid Mapping , Synteny , Transcription, Genetic