Results 1 - 19 of 19
1.
BMC Genomics ; 20(1): 613, 2019 Jul 27.
Article in English | MEDLINE | ID: mdl-31351464

ABSTRACT

BACKGROUND: Histone deacetylases (HDACs) are the proteins responsible for removing the acetyl group from lysine residues of core histones in chromosomes, a crucial component of gene regulation. Eleven known HDACs exist in humans and most other vertebrates. While the basic function of HDACs has been well characterized and new discoveries are still being made, the transcriptional regulation of their corresponding genes is still poorly understood. RESULTS: Here, we conducted a computational analysis of the eleven HDAC promoter sequences in 25 vertebrate species to determine whether transcription factor binding sites (TFBSs) are conserved in HDAC evolution, and if so, whether they provide useful information about HDAC expression and function. Furthermore, we used tissue-specific information of transcription factors to investigate the potential expression patterns of HDACs in different human tissues based on their transcription factor binding sites. We found that the TFBS profiles of most of the HDACs were well conserved in closely related species for all HDAC promoters except HDAC7 and HDAC10. HDAC5 had particularly strong conservation across over half of the species studied, with nearly identical profiles in the primate species. Our comparisons of TFBSs with the tissue specific gene expression profiles of their corresponding TFs showed that most HDACs had the ability to be ubiquitously expressed. A few HDAC promoters exhibited the potential for preferential expression in certain tissues, most notably HDAC11 in gall bladder, while HDAC9 seemed to have less propensity for expression in the nervous system. CONCLUSIONS: In general, we found evolutionary conservation in HDAC promoters that seems to be more prominent for the ubiquitously expressed HDACs. In turn, when conservation did not follow usual phylogeny, human TFBS patterns indicated possible functional relevance. 
While we found that HDACs appear to be uniformly expressed, we confirm that the functional differences in HDACs may be less a matter of location of activity than a question of which proteins and which acetyl groups they may be acting on.


Subject(s)
Conserved Sequence , Histone Deacetylases/genetics , Promoter Regions, Genetic , Animals , Binding Sites , Humans , Transcription Factors , Vertebrates/genetics
2.
BMC Evol Biol ; 17(1): 11, 2017 01 11.
Article in English | MEDLINE | ID: mdl-28077092

ABSTRACT

BACKGROUND: The neurotransmitter L-Glutamate (L-Glu) acting at ionotropic L-Glu receptors (iGluR) conveys fast excitatory signal transmission in the nervous systems of all animals. iGluR-dependent neurotransmission is a key component of the synaptic plasticity that underlies learning and memory. During learning, two subtypes of iGluR, α-Amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptors (AMPAR) and N-methyl-D-aspartate receptors (NMDAR), are dynamically regulated postsynaptically in vertebrates. Invertebrate organisms such as Aplysia californica (Aplysia) are well-studied models for iGluR-mediated function, yet no studies to date have analyzed the evolutionary relationships between iGluR genes in these species and those in vertebrates, to identify genes that may mediate plasticity. We conducted a thorough phylogenetic analysis spanning Bilateria to elucidate these relationships. The expression status of iGluR genes in the Aplysia nervous system was also examined. RESULTS: Our analysis shows that ancestral genes for both NMDAR and AMPAR subtypes were present in the common bilaterian ancestor. NMDAR genes show very high conservation in motifs responsible for forming the conductance pore of the ion channel. The number of NMDAR subunits is greater in vertebrates due to an increased number of splice variants and an increased number of genes, likely due to gene duplication events. AMPAR subunits form an orthologous group, and there is high variability in the number of AMPAR genes in each species due to extensive taxon specific gene gain and loss. qPCR results show that all 12 Aplysia iGluR subunits are expressed in all nervous system ganglia. CONCLUSIONS: Orthologous NMDAR subunits in all species studied suggests conserved function across Bilateria, and potentially a conserved mechanism of neuroplasticity and learning. 
Vertebrates display an increased number of NMDAR genes and splice variants, which may play a role in their greater diversity of physiological responses. Extensive gene gain and loss of AMPAR genes may result in different physiological properties that are taxon specific. Our results suggest a significant role for L-Glu mediated responses throughout the Aplysia nervous system, consistent with L-Glu's role as the primary excitatory neurotransmitter.


Subject(s)
Aplysia/genetics , Phylogeny , Receptors, Ionotropic Glutamate/genetics , Animals , Conserved Sequence , Evolution, Molecular , Invertebrates/genetics , Protein Domains , Receptors, Ionotropic Glutamate/chemistry , Sequence Alignment , Sequence Analysis, Protein , Synaptic Transmission/genetics
3.
mSystems ; 1(2)2016.
Article in English | MEDLINE | ID: mdl-27822522

ABSTRACT

The investigation of host-pathogen interaction interfaces and their constituent factors is crucial for our understanding of an organism's pathogenesis. Here, we explored the interactomes of HIV, hepatitis C virus, influenza A virus, human papillomavirus, herpes simplex virus, and vaccinia virus in a human host by analyzing the combined sets of virus targets and human genes that are required for viral infection. We also considered targets and required genes of bacteriophages lambda and T7 infection in Escherichia coli. We found that targeted proteins and their immediate network neighbors significantly pool with proteins required for infection and essential for cell growth, forming large connected components in both the human and E. coli protein interaction networks. The impact of both viruses and phages on their protein targets appears to extend to their network neighbors, as these are enriched with topologically central proteins that have a significant disruptive topological effect and connect different protein complexes. Moreover, viral and phage targets and network neighbors are enriched with transcription factors, methylases, and acetylases in human viruses, while such interactions are much less prominent in bacteriophages. IMPORTANCE While host-virus interaction interfaces have been previously investigated, relatively little is known about the indirect interactions of pathogen and host proteins required for viral infection and host cell function. Therefore, we investigated the topological relationships of human and bacterial viruses and how they interact with their hosts. We focused on those host proteins that are directly targeted by viruses, those that are required for infection, and those that are essential for both human and bacterial cells (here, E. coli). Generally, we observed that targeted, required, and essential proteins in both hosts interact in a highly intertwined fashion. 
While there exist highly similar topological patterns, we found that human viruses target transcription factors through methylases and acetylases, proteins that played no such role in bacteriophages.

4.
BMC Bioinformatics ; 16: 109, 2015 Apr 01.
Article in English | MEDLINE | ID: mdl-25880655

ABSTRACT

BACKGROUND: Minimum dominating sets (MDSet) of protein interaction networks allow the control of underlying protein interaction networks through their topological placement. While essential proteins are enriched in MDSets, we hypothesize that the statistical properties of biological functions of essential genes are enhanced when we focus on essential MDSet proteins (e-MDSet). RESULTS: Here, we determined minimum dominating sets of proteins (MDSet) in interaction networks of E. coli, S. cerevisiae and H. sapiens, defined as subsets of proteins whereby each remaining protein can be reached by a single interaction. We compared several topological and functional parameters of essential, MDSet, and essential MDSet (e-MDSet) proteins. In particular, we observed that their topological placement allowed e-MDSet proteins to provide a positive correlation between degree and lethality, connect more protein complexes, and have a stronger impact on network resilience than essential proteins alone. In comparison to essential proteins we further found that interactions between e-MDSet proteins appeared more frequently within complexes, while interactions of e-MDSet proteins between complexes were depleted. Finally, these e-MDSet proteins classified into functional groupings that play a central role in survival and adaptability. CONCLUSIONS: The determination of e-MDSet of an organism highlights a set of proteins that enhances the enrichment signals of biological functions of essential proteins. As a consequence, we surmise that e-MDSets may provide a new method of evaluating the core proteins of an organism.


Subject(s)
Escherichia coli Proteins/metabolism , Genes, Essential , Protein Interaction Maps , Proteins/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Computational Biology/methods , Databases, Protein , Gene Expression Regulation , Humans
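
The dominating-set definition in the abstract above (every remaining protein reachable from the set by a single interaction) can be illustrated with a minimal sketch. The abstract does not say how the minimum sets were computed, so the greedy heuristic below is an assumption made for illustration; it returns a valid dominating set that is not guaranteed to be minimum. The toy network and its node names are hypothetical.

```python
def greedy_dominating_set(adjacency):
    """Return a dominating set chosen greedily (an approximation, not guaranteed minimum)."""
    undominated = set(adjacency)
    chosen = set()
    while undominated:
        # Pick the node that covers the most still-undominated nodes
        # (itself plus its neighbours); ties go to the alphabetically first node.
        best = max(
            sorted(adjacency),
            key=lambda n: len(({n} | adjacency[n]) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | adjacency[best]
    return chosen


def is_dominating(adjacency, candidate):
    """True if every node is in the candidate set or adjacent to a member of it."""
    return all(n in candidate or bool(adjacency[n] & candidate) for n in adjacency)


# Hypothetical toy interaction network; node names are placeholders, not real proteins.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}
mdset = greedy_dominating_set(toy_network)
print(sorted(mdset), is_dominating(toy_network, mdset))
```

An exact minimum dominating set generally requires an integer-programming or exhaustive search formulation, since the problem is NP-hard; the greedy version is shown only because it fits in a few lines.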
5.
BMC Genomics ; 14: 917, 2013 Dec 24.
Article in English | MEDLINE | ID: mdl-24365332

ABSTRACT

BACKGROUND: Myzus persicae, the green peach aphid, is a polyphagous herbivore that feeds on hundreds of species of mostly dicot crop plants. Like other phloem-feeding aphids, M. persicae relies on the endosymbiotic bacterium, Buchnera aphidicola (Buchnera Mp), for biosynthesis of essential amino acids and other nutrients that are not sufficiently abundant in their phloem sap diet. Tobacco-specialized M. persicae are typically red and somewhat distinct from other lineages of this species. To determine whether the endosymbiotic bacteria of M. persicae could play a role in tobacco adaptation, we sequenced the Buchnera Mp genomes from two tobacco-adapted and two non-tobacco M. persicae lineages. RESULTS: With a genome size of 643.5 kb and 579 predicted genes, Buchnera Mp is the largest Buchnera genome sequenced to date. No differences in gene content were found between the four sequenced Buchnera Mp strains. Compared to Buchnera APS from the well-studied pea aphid, Acyrthosiphon pisum, Buchnera Mp has 21 additional genes. These include genes encoding five enzymes required for biosynthesis of the modified nucleoside queuosine, the heme pathway enzyme uroporphyrinogen III synthase, and asparaginase. Asparaginase, which is also encoded by the genome of the aphid host, may allow Buchnera Mp to synthesize essential amino acids from asparagine, a relatively abundant phloem amino acid. CONCLUSIONS: Together our results indicate that the obligate intracellular symbiont Buchnera aphidicola does not contribute to the adaptation of Myzus persicae to feeding on tobacco.


Subject(s)
Aphids/microbiology , Buchnera/genetics , Genome, Bacterial , Symbiosis , Adaptation, Biological , Animals , Buchnera/classification , Chromosome Mapping , DNA, Bacterial/genetics , DNA, Bacterial/isolation & purification , Microsatellite Repeats , Plasmids/genetics , Sequence Analysis, DNA , Nicotiana
6.
Adv Bioinformatics ; 2011: 271563, 2011.
Article in English | MEDLINE | ID: mdl-22194743

ABSTRACT

The GenSensor Suite consists of four web tools for elucidating relationships among genes and proteins. GenPath results show which biochemical, regulatory, or other gene set categories are over- or under-represented in an input list compared to a background list. All common gene sets are available for searching in GenPath, plus some specialized sets. Users can add custom background lists. GenInteract builds an interaction gene list from a single gene input and then analyzes this in GenPath. GenPubMed uses a PubMed query to identify a list of PubMed IDs, from which a gene list is extracted and queried in GenPath. GenViewer allows the user to query one gene set against another in GenPath. GenPath results are presented with relevant P- and q-values in an uncluttered, fully linked, and integrated table. Users can easily copy this table and paste it directly into a spreadsheet or document.

7.
PLoS One ; 6(7): e22538, 2011.
Article in English | MEDLINE | ID: mdl-21799891

ABSTRACT

BACKGROUND: Cachexia, or weight loss despite adequate nutrition, significantly impairs quality of life and response to therapy in cancer patients. In cancer patients, skeletal muscle wasting, weight loss and mortality are all positively associated with increased serum cytokines, particularly Interleukin-6 (IL-6), and the presence of the acute phase response. Acute phase proteins, including fibrinogen and serum amyloid A (SAA) are synthesized by hepatocytes in response to IL-6 as part of the innate immune response. To gain insight into the relationships among these observations, we studied mice with moderate and severe Colon-26 (C26)-carcinoma cachexia. METHODOLOGY/PRINCIPAL FINDINGS: Moderate and severe C26 cachexia was associated with high serum IL-6 and IL-6 family cytokines and highly similar patterns of skeletal muscle gene expression. The top canonical pathways up-regulated in both were the complement/coagulation cascade, proteasome, MAPK signaling, and the IL-6 and STAT3 pathways. Cachexia was associated with increased muscle pY705-STAT3 and increased STAT3 localization in myonuclei. STAT3 target genes, including SOCS3 mRNA and acute phase response proteins, were highly induced in cachectic muscle. IL-6 treatment and STAT3 activation both also induced fibrinogen in cultured C2C12 myotubes. Quantitation of muscle versus liver fibrinogen and SAA protein levels indicates that muscle contributes a large fraction of serum acute phase proteins in cancer. CONCLUSIONS/SIGNIFICANCE: These results suggest that the STAT3 transcriptome is a major mechanism for wasting in cancer. Through IL-6/STAT3 activation, skeletal muscle is induced to synthesize acute phase proteins, thus establishing a molecular link between the observations of high IL-6, increased acute phase response proteins and muscle wasting in cancer. 
These results suggest a mechanism by which STAT3 might causally influence muscle wasting by altering the profile of genes expressed and translated in muscle such that amino acids liberated by increased proteolysis in cachexia are synthesized into acute phase proteins and exported into the blood.


Subject(s)
Acute-Phase Reaction/etiology , Acute-Phase Reaction/metabolism , Cachexia/complications , Cachexia/metabolism , Colonic Neoplasms/complications , Muscle, Skeletal/metabolism , STAT3 Transcription Factor/metabolism , Acute-Phase Reaction/blood , Acute-Phase Reaction/genetics , Animals , Cachexia/blood , Cachexia/genetics , Cell Line, Tumor , Cytokines/blood , Female , Gene Expression Regulation, Neoplastic , Humans , Liver/metabolism , Liver/pathology , Mice , Muscle, Skeletal/growth & development , Muscle, Skeletal/pathology , Transcriptome
8.
BMC Bioinformatics ; 12: 217, 2011 May 29.
Article in English | MEDLINE | ID: mdl-21619696

ABSTRACT

BACKGROUND: Conotoxin has been proven to be effective in drug design and could be used to treat various disorders such as schizophrenia, neuromuscular disorders and chronic pain. With the rapidly growing interest in conotoxin, accurate conotoxin superfamily classification tools are desirable to systematize the increasing number of newly discovered sequences and structures. However, despite the significance and extensive experimental investigations on conotoxin, those tools have not been intensively explored. RESULTS: In this paper, we propose to consider suboptimal alignments of words with restricted length. We developed a scoring system based on local alignment partition functions, called free score. The scoring system plays the key role in the feature extraction step of support vector machine classification. In the classification of conotoxin proteins, our method, SVM-Freescore, improves sensitivity and specificity by approximately 5.864% and 3.76%, respectively, over previously reported methods. To test generalization, SVM-Freescore was also applied to classify superfamilies from a curated, high-quality database such as ConoServer. The average computed sensitivity and specificity for the superfamily classification were found to be 0.9742 and 0.9917, respectively. CONCLUSIONS: The SVM-Freescore method is shown to be a useful sequence-based analysis tool for functional and structural characterization of conotoxin proteins. The datasets and the software are available at http://faculty.uaeu.ac.ae/nzaki/SVM-Freescore.htm.


Subject(s)
Algorithms , Artificial Intelligence , Conotoxins/classification , Conus Snail/chemistry , Neuropeptides/classification , Animals , Conotoxins/analysis , Neuropeptides/analysis , Software
9.
Saudi Med J ; 32(4): 353-9, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21483992

ABSTRACT

OBJECTIVE: To identify the mutations underlying a number of inborn errors of metabolism (IEM) disorders among United Arab Emirates (UAE) residents. METHODS: Molecular diagnostic and bioinformatics tools were used to identify the causative mutations of IEM disorders from multi-ethnic patients residing in UAE. The study was conducted in Al-Ain, UAE, between April 2009 and September 2010. This is a case series retrospective study where patients attending the metabolic clinic at Tawam Hospital were recruited. Thirty patients and 26 parents were included. RESULTS: We present evidence in the UAE of 7 new mutations and 19 mutations that have previously been reported in other populations, all causing a number of common IEM disorders, including phenylketonuria, maple syrup urine disease, glycogen storage diseases, beta-ketothiolase deficiency, and Zellweger syndrome among many others. CONCLUSION: Reflecting the diverse ethnic groups residing in the UAE, we found mutations in several different population groups. However, consanguinity is evident in most cases. This report is of utmost importance for taking the necessary steps toward the prevention of inherited disorders, not just in the UAE, but anywhere in the world where these Arab and Asian populations reside, or where consanguinity is a cultural norm.


Subject(s)
Genetics, Population , Metabolism, Inborn Errors/genetics , Mutation , Humans , Retrospective Studies , United Arab Emirates
10.
PLoS One ; 6(2): e16917, 2011 Feb 22.
Article in English | MEDLINE | ID: mdl-21364952

ABSTRACT

Parkinson's disease (PD) has had six genome-wide association studies (GWAS) conducted as well as several gene expression studies. However, only variants in MAPT and SNCA have been consistently replicated. To improve the utility of these approaches, we applied pathway analyses integrating both GWAS and gene expression. The top 5000 SNPs (p<0.01) from a joint analysis of three existing PD GWAS were identified and each assigned to a gene. For gene expression, rather than the traditional comparison of one anatomical region between sets of patients and controls, we identified differentially expressed genes between adjacent Braak regions in each individual and adjusted using average control expression profiles. Over-represented pathways were calculated using a hyper-geometric statistical comparison. An integrated, systems meta-analysis of the over-represented pathways combined the expression and GWAS results using a Fisher's combined probability test. Four of the top seven pathways from each approach were identical. The top three pathways in the meta-analysis, with their corrected p-values, were axonal guidance (p = 2.8E-07), focal adhesion (p = 7.7E-06) and calcium signaling (p = 2.9E-05). These results support that a systems biology (pathway) approach will provide additional insight into the genetic etiology of PD and that these pathways have both biological and statistical support to be important in PD.


Subject(s)
Consensus , Parkinson Disease/genetics , Signal Transduction/genetics , Systems Biology/methods , Systems Integration , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Gene Regulatory Networks , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Genome-Wide Association Study/statistics & numerical data , Humans , MAP Kinase Signaling System/genetics , MAP Kinase Signaling System/physiology , Metabolic Networks and Pathways/genetics , Models, Biological , Polymorphism, Single Nucleotide , Receptor Cross-Talk/physiology , Signal Transduction/physiology
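
The two statistical steps named in the abstract above, a hypergeometric over-representation test and Fisher's combined probability test, can be sketched as follows. This is a minimal illustration of the standard formulas only; the function names and all numbers in the usage lines are hypothetical, and the abstract does not describe the authors' implementation or multiple-testing correction.

```python
import math


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` belong to the pathway."""
    total = math.comb(population, draws)
    upper = min(successes, draws)
    return sum(
        math.comb(successes, k) * math.comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    ) / total


def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k degrees of freedom."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    # Closed-form chi-square survival function, valid for even degrees of freedom.
    return math.exp(-half_stat) * sum(
        half_stat**i / math.factorial(i) for i in range(k)
    )


# Hypothetical numbers: 20000 background genes, 150 in a pathway,
# 500 genes selected, 12 of them in the pathway.
p_gwas = hypergeom_upper_tail(20000, 150, 500, 12)
p_expression = 0.004  # hypothetical p-value from a second, independent analysis
print(p_gwas, fisher_combined([p_gwas, p_expression]))
```

Fisher's method assumes the combined p-values come from independent tests, which is worth checking before pooling evidence from two analyses of overlapping samples.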
11.
BMC Genomics ; 11: 509, 2010 Sep 22.
Article in English | MEDLINE | ID: mdl-20860821

ABSTRACT

BACKGROUND: MicroRNAs are non-coding RNAs that regulate gene expression including differentiation and development by either inhibiting translation or inducing target degradation. The aim of this study is to determine the microRNA expression signature during human pancreatic development and to identify potential microRNA gene targets by calculating correlations between the signature microRNAs and their corresponding mRNA targets, predicted by bioinformatics, in a genome-wide RNA microarray study. RESULTS: The microRNA signature of human fetal pancreatic samples at 10-22 weeks of gestational age (wga) was obtained by PCR-based high throughput screening with Taqman Low Density Arrays. This method led to the identification of 212 microRNAs. The microRNAs were classified into 3 groups: Group I contains 4 microRNAs with an increasing profile; Group II, 35 microRNAs with a decreasing profile; and Group III, 173 microRNAs that remain unchanged. We calculated Pearson correlations between the expression profile of microRNAs and target mRNAs, predicted by TargetScan 5.1 and miRBase algorithms, using genome-wide mRNA expression data. Group I correlated with the decreasing expression of 142 target mRNAs and Group II with the increasing expression of 876 target mRNAs. Most microRNAs correlate with multiple targets, just as mRNAs are targeted by multiple microRNAs. Among the identified targets are the genes and transcription factors known to play an essential role in pancreatic development. CONCLUSIONS: We have determined specific groups of microRNAs in human fetal pancreas that change their degree of expression throughout development. A negative correlative analysis suggests an intertwined network of microRNAs and mRNAs collaborating with each other. 
This study provides information pointing to a potential two-way level of combinatorial control of gene expression, in which microRNAs target multiple mRNAs and, conversely, target mRNAs are regulated in parallel by other microRNAs. This study may further the understanding of gene expression regulation in the human developing pancreas.


Subject(s)
Gene Expression Profiling , Gene Expression Regulation, Developmental , MicroRNAs/genetics , Pancreas/embryology , Pancreas/metabolism , Algorithms , Female , Humans , MicroRNAs/classification , MicroRNAs/metabolism , Polymerase Chain Reaction , RNA, Messenger/genetics , RNA, Messenger/metabolism
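
The negative correlation between a microRNA profile and a predicted target mRNA profile described in the abstract above can be illustrated with a plain Pearson coefficient. This is a minimal sketch of the standard formula; the expression values are hypothetical and the abstract does not state which software or thresholds the authors used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical expression levels at four gestational time points.
mirna_profile = [1.0, 2.0, 3.0, 4.0]   # increasing microRNA
target_profile = [8.0, 6.0, 4.0, 2.0]  # decreasing predicted target mRNA
r = pearson(mirna_profile, target_profile)
print(r)
```

A coefficient near -1 is the pattern expected for a microRNA that represses its target, although correlation alone does not establish a direct regulatory interaction.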
12.
Ann Hum Genet ; 74(2): 110-6, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20201937

ABSTRACT

Rare mutations in more than 20 genes have been suggested to cause dilated cardiomyopathy (DCM), but explain only a small percentage of cases, mainly in familial forms. We hypothesised that more common variants may also play a role in increasing genetic susceptibility to DCM, similar to that observed in other common complex disorders. To test this hypothesis, we performed case-control analyses on all DNA polymorphic variation identified in a resequencing study of six candidate DCM genes (CSRP3, LDB3, MYH7, SCN5A, TCAP, and TNNT2) conducted in 289 unrelated white probands with DCM of unknown cause and 188 unrelated white controls. In univariate analyses, we identified associated common variants at LDB3 site 10779, LDB3 site 57877, MYH7 sites 16384 and 17404, and TCAP sites 140 and 1735. Multivariate analyses to examine the joint effects of multiple gene variants confirmed univariate results for MYH7 and TCAP and identified a block of nine variants in MYH7 that was strongly associated with DCM. Common variants in genes known to be causative of DCM may play a role in genetic susceptibility to DCM. Our results suggest that examination of common genetic variants may be warranted in future studies of DCM and other Mendelian-like disorders.


Subject(s)
Cardiomyopathy, Dilated/genetics , Genetic Predisposition to Disease , Case-Control Studies , Humans , Multivariate Analysis
13.
Inductive Log Program ; 5989: 149-165, 2010.
Article in English | MEDLINE | ID: mdl-25309972

ABSTRACT

Hexoses are simple sugars that play a key role in many cellular pathways, and in the regulation of development and disease mechanisms. Current protein-sugar computational models are based, at least partially, on prior biochemical findings and knowledge. They incorporate different parts of these findings in predictive black-box models. We investigate the empirical support for biochemical findings by comparing Inductive Logic Programming (ILP) induced rules to actual biochemical results. We mine the Protein Data Bank for a representative data set of hexose binding sites, non-hexose binding sites and surface grooves. We build an ILP model of hexose-binding sites and evaluate our results against several baseline machine learning classifiers. Our method achieves an accuracy similar to that of other black-box classifiers while providing insight into the discriminating process. In addition, it confirms wet-lab findings and reveals a previously unreported Trp-Glu amino acids dependency.

14.
Proteins ; 77(1): 121-32, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19415755

ABSTRACT

Glucose is a simple sugar that plays an essential role in many basic metabolic and signaling pathways. Many proteins have binding sites that are highly specific to glucose. The exponential increase of genomic data has revealed the identity of many proteins that seem to be central to biological processes, but whose exact functions are unknown. Many of these proteins seem to be associated with disease processes. Being able to predict glucose-specific binding sites in these proteins will greatly enhance our ability to annotate protein function and may significantly contribute to drug design. We hereby present the first glucose-binding site classifier algorithm. We consider the sugar-binding pocket as a spherical spatio-chemical environment and represent it as a vector of geometric and chemical features. We then perform Random Forests feature selection to identify key features and analyze them using support vector machines classification. Our work shows that glucose binding sites can be modeled effectively using a limited number of basic chemical and residue features. Using a leave-one-out cross-validation method, our classifier achieves an 8.11% error, an 89.66% sensitivity and a 93.33% specificity over our dataset. From a biochemical perspective, our results support the relevance of ordered water molecules and ions in determining glucose specificity. They also reveal the importance of carboxylate residues in glucose binding and the high concentration of negatively charged atoms in direct contact with the bound glucose molecule.


Subject(s)
Computational Biology/methods , Glucose/metabolism , Proteins/chemistry , Proteins/metabolism , Algorithms , Binding Sites , Databases, Protein , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Protein Binding , Software
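
The error, sensitivity and specificity figures quoted in the abstract above are ratios over a confusion matrix, which the following minimal sketch makes explicit. The abstract does not report the underlying counts; the counts used here are hypothetical and were chosen only because they reproduce the quoted percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Error rate, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "error": (fn + fp) / total,        # fraction of all sites misclassified
        "sensitivity": tp / (tp + fn),     # fraction of true binding sites recovered
        "specificity": tn / (tn + fp),     # fraction of non-binding sites rejected
    }


# Hypothetical counts (not taken from the source).
metrics = classification_metrics(tp=26, fn=3, tn=42, fp=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```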
15.
BMC Genomics ; 9: 186, 2008 Apr 22.
Article in English | MEDLINE | ID: mdl-18430246

ABSTRACT

BACKGROUND: Nucleosomes are the basic structural units of eukaryotic chromatin, and they play a significant role in regulating gene expression. Specific DNA sequence patterns are known, from empirical and theoretical studies, to influence DNA bending and flexibility, and have been shown to exclude nucleosomes. A whole genome localization of these patterns, and their analysis, can add important insights on the gene regulation mechanisms that depend upon the structure of chromatin in and around a gene. RESULTS: A whole genome annotation for nucleosome exclusion regions (NXRegions) was carried out on the human genome. Nucleosome exclusion scores (NXScores) were calculated individually for each nucleotide, giving a measure of how likely a specific nucleotide and its immediate neighborhood would impair DNA bending and, consequently, exclude nucleosomes. The resulting annotations were correlated with 19055 gene expression profiles. We developed a new method based on Grubbs' outliers test for ranking genes based on their tissue specificity, and correlated this ranking with NXScores. The results show a strong correlation between tissue specificity of a gene and the propensity of its promoter to exclude nucleosomes (the promoter region was taken as -1500 to +500 bp from the RefSeq-annotated transcription start site). In addition, NXScores correlated well with gene density, gene expression levels, and DNaseI hypersensitive sites. CONCLUSION: We present, for the first time, a whole genome prediction of nucleosome exclusion regions for the human genome (the data are available for download from Additional Materials). Nucleosome exclusion patterns are correlated with various factors that regulate gene expression, which emphasizes the need to include chromatin structural parameters in experimental analysis of gene expression.


Subject(s)
Chromatin/chemistry , Genome, Human , Nucleosomes , Deoxyribonuclease I/metabolism , Gene Expression , Humans , Organ Specificity , Saccharomyces cerevisiae/genetics
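
The abstract above mentions ranking genes by tissue specificity with a method based on Grubbs' outlier test but gives no details. As a rough illustration, the standard Grubbs statistic (the largest absolute deviation from the mean, in units of the sample standard deviation) can be computed per gene across tissues. Using it directly as a specificity score is an assumption made here, not the authors' published procedure, and the expression values are hypothetical.

```python
import math


def grubbs_statistic(values):
    """Return (G, index) where G = max |x - mean| / s and index is the most extreme value."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        return 0.0, 0  # flat profile: no tissue stands out
    deviations = [abs(v - mean) for v in values]
    index = max(range(n), key=deviations.__getitem__)
    return deviations[index] / sd, index


# Hypothetical expression of one gene across five tissues.
expression = [1.0, 1.0, 1.0, 1.0, 9.0]
g, tissue_index = grubbs_statistic(expression)
print(g, tissue_index)
```

A gene expressed in a single tissue yields a large statistic, while a housekeeping-like flat profile yields a small one, so sorting genes by this value gives one plausible tissue-specificity ranking.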
16.
Nucleic Acids Res ; 34(Web Server issue): W560-5, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845070

ABSTRACT

Nucleosomes, a basic structural unit of eukaryotic chromatin, play a significant role in regulating gene expression. We have developed a web tool based on DNA sequences known from empirical and theoretical studies to influence DNA bending and flexibility, and to exclude nucleosomes. NXSensor (available at http://www.sfu.ca/~ibajic/NXSensor/) finds nucleosome exclusion sequences, evaluates their length and spacing, and computes an 'accessibility score' giving the proportion of base pairs likely to be nucleosome-free. Application of NXSensor to the promoter regions of housekeeping (HK) genes and those of tissue-specific (TS) genes revealed a significant difference between the two classes of gene, the former being significantly more open, on average, particularly near transcription start sites (TSSs). NXSensor should be a useful tool in assessing the likelihood of nucleosome formation in regions involved in gene regulation and other aspects of chromatin function.


Subject(s)
DNA-Binding Proteins/metabolism , Nucleosomes/chemistry , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Software , Transcription Factors/metabolism , Binding Sites , DNA/chemistry , Internet , Nucleic Acid Conformation , Nucleosomes/metabolism , User-Computer Interface
17.
Mol Genet Metab ; 87(3): 198-203, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16378742

ABSTRACT

The X-linked form of Opitz syndrome (OS) affects midline structures and produces a characteristic, but heterogeneous, phenotype that may include severe mental retardation, hypertelorism, broad nasal bridge, widow's peak, cleft lip/cleft palate, congenital heart disease, laryngotracheal defects, and hypospadias. The MID1 gene was implicated in OS by linkage to Xp22. It encodes a 667 amino acid protein that contains a RING finger motif, two B-box zinc fingers, a coiled-coil, a fibronectin type III (FNIII) domain, and a B30.2 domain. Several mutations in MID1 are associated with severe OS. Here, we describe an intelligent male with a milder phenotype characterized by hypertelorism, broad nasal bridge, widow's peak, mild hypospadias, pectus excavatum, and a surgically corrected tracheo-esophageal fistula. He has above-average intelligence and no cleft lip/palate or heart disease. We identified a novel mutation in MID1 (P441L) which is in exon 8 and functionally associated with the FNIII domain. While OS phenotypes have been attributed to mutations in the C-terminal part of MID1, little is currently known about the structure-function relationships of MID1 mutations, and how they affect phenotype. We find from a literature review that missense mutations within the FNIII domain of MID1 are associated with a milder presentation of OS than missense mutations elsewhere in MID1. All truncating mutations (frameshift, insertions/deletions) lead to severe OS. We used homology analysis of the MID1 FNIII domain to investigate structure-function changes caused by our missense mutation. This and other missense mutations probably cause disruption of protein-protein interactions, either within MID1 or between MID1 and other proteins. We correlate these protein structure-function findings to the absence of CNS or palatal changes and conclude that the FNIII domain of the MID1 protein may be involved in midline differentiation after neural tube and palatal structures are completed.


Subject(s)
Microtubule Proteins/chemistry , Microtubule Proteins/metabolism , Mutation/genetics , Nuclear Proteins/chemistry , Nuclear Proteins/metabolism , Smith-Lemli-Opitz Syndrome/genetics , Smith-Lemli-Opitz Syndrome/pathology , Transcription Factors/chemistry , Transcription Factors/metabolism , Adult , Amino Acid Sequence , Central Nervous System/physiopathology , Fibronectins/chemistry , Humans , Male , Microtubule Proteins/genetics , Models, Molecular , Molecular Sequence Data , Nuclear Proteins/genetics , Phenotype , Protein Structure, Tertiary , Smith-Lemli-Opitz Syndrome/physiopathology , Structure-Activity Relationship , Transcription Factors/genetics , Ubiquitin-Protein Ligases
18.
Phytochemistry ; 65(1): 7-17, 2004 Jan.
Article in English | MEDLINE | ID: mdl-14697267

ABSTRACT

The cupin superfamily of proteins, named on the basis of a conserved beta-barrel fold ('cupa' is the Latin term for a small barrel), was originally discovered using a conserved motif found within germin and germin-like proteins from higher plants. Previous analysis of cupins had identified some 18 different functional classes that range from single-domain bacterial enzymes such as isomerases and epimerases involved in the modification of cell wall carbohydrates, through to two-domain bicupins such as the desiccation-tolerant seed storage globulins, and multidomain transcription factors including one linked to the nodulation response in legumes. Recent advances in comparative genomics and the resolution of many more 3-D structures have now revealed that the largest subset of the cupin superfamily is the 2-oxoglutarate-Fe(2+)-dependent dioxygenases. The substrates for this subclass of enzyme are many and varied, and in total they account for probably 50-100 different biochemical reactions, including several involved in plant growth and development. Although the majority of enzymatic cupins contain iron as an active site metal, other members contain either copper, zinc, cobalt, nickel or manganese ions as a cofactor, with each cofactor allowing a different type of chemistry to occur within the conserved tertiary structure. This review discusses the range of structures and functions found in this most diverse of superfamilies.


Subject(s)
Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Plant Proteins/chemistry , Plant Proteins/metabolism , Amino Acid Sequence , Bacterial Proteins/genetics , Models, Molecular , Molecular Sequence Data , Plant Proteins/genetics , Protein Structure, Secondary , Sequence Alignment , Sequence Homology, Amino Acid
19.
Plant Biotechnol J ; 1(4): 271-85, 2003 Jul.
Article in English | MEDLINE | ID: mdl-17163904

ABSTRACT

We have compiled two comprehensive gene expression profiles from mature leaf and immature seed tissue of rice (Oryza sativa ssp. japonica cultivar Nipponbare) using Serial Analysis of Gene Expression (SAGE) technology. Analysis revealed a total of 50 519 SAGE tags, corresponding to 15 131 unique transcripts. Of these, the large majority (approximately 70%) occur only once in both libraries. Unexpectedly, the most abundant transcript (approximately 3% of the total) in the leaf library was derived from a type 3 metallothionein gene. The overall frequency profiles of the abundant tag species from both tissues differ greatly and reveal seed tissue as exhibiting a non-typical pattern of gene expression characterized by an overabundance of a small number of transcripts coding for storage proteins. A high proportion (approximately 80%) of the abundant tags (≥ 9) matched entries in our reference rice EST database, with many fewer matches for low-abundance tags. Singleton transcripts that are common to both tissues were collated to generate a summary of low-abundance transcripts that are expressed constitutively in rice tissues. Finally, and most surprisingly, a significant number of tags were found to code for antisense transcripts, a finding that suggests a novel mechanism of gene regulation and may have implications for the use of antisense constructs in transgenic technology.
