1.
Genes (Basel) ; 15(3)2024 02 28.
Article in English | MEDLINE | ID: mdl-38540378

ABSTRACT

Inherited cardiomyopathies represent a highly heterogeneous group of cardiac diseases. DNA variants in genes expressed in cardiomyocytes cause a diverse spectrum of cardiomyopathies, ultimately leading to heart failure, arrhythmias, and sudden cardiac death. We applied massive parallel DNA sequencing using a 72-gene panel for studying inherited cardiomyopathies. We report on variants in 25 families, where pathogenicity was predicted by different computational approaches, databases, and an in-house filtering analysis. All variants were validated using Sanger sequencing. Familial segregation was tested when possible. We identified 41 different variants in 26 genes. Analytically, we identified fifteen variants previously reported in the Human Gene Mutation Database: twelve mentioned as disease-causing mutations (DM) and three as probable disease-causing mutations (DM?). Additionally, we identified 26 novel variants. We classified the forty-one variants as follows: twenty-eight (68.3%) as variants of uncertain significance, eight (19.5%) as likely pathogenic, and five (12.2%) as pathogenic. We genetically characterized families with a cardiac phenotype. The genetic heterogeneity and the multiplicity of candidate variants make a definitive molecular diagnosis challenging, especially when there is a suspicion of incomplete penetrance or digenic-oligogenic inheritance. This is the first systematic study of inherited cardiac conditions in Cyprus, enabling us to develop a genetic baseline and precision cardiology.


Subject(s)
Cardiomyopathies , Multifactorial Inheritance , Humans , Cyprus/epidemiology , Cardiomyopathies/genetics , Mutation , Sequence Analysis, DNA
2.
Biology (Basel) ; 12(9)2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37759625

ABSTRACT

The most common approach in transcriptomics (RNA-seq and microarrays) is differential gene expression analysis (DGEA) [...].

3.
Genes (Basel) ; 14(9)2023 08 25.
Article in English | MEDLINE | ID: mdl-37761826

ABSTRACT

Familial hematuria is a clinical sign of a genetically heterogeneous group of conditions, accompanied by broad inter- and intrafamilial variable expressivity. The most frequent condition is caused by pathogenic (or likely pathogenic) variants in the collagen-IV genes, COL4A3/A4/A5. Pathogenic variants in COL4A5 are responsible for the severe X-linked glomerulopathy, Alport syndrome (AS), while homozygous or compound heterozygous variants in the COL4A3 or the COL4A4 gene cause autosomal recessive AS. AS usually leads to progressive kidney failure before the age of 40 years when left untreated. People who inherit heterozygous COL4A3/A4 variants are at risk of a slowly progressive form of the disease, starting with microscopic hematuria in early childhood, developing Alport spectrum nephropathy. Sometimes, they are diagnosed with benign familial hematuria, and sometimes with autosomal dominant AS. At diagnosis, they often show thin basement membrane nephropathy, reflecting the uniform thin glomerular basement membrane lesion, inherited as an autosomal dominant condition. On a long follow-up, most patients will retain normal or mildly affected kidney function, while a substantial proportion will develop chronic kidney disease (CKD), even kidney failure at an average age of 55 years. A question that remains unanswered is how to distinguish those patients with AS or with heterozygous COL4A3/A4 variants who will manifest a more aggressive kidney function decline, requiring prompt medical intervention. The hypothesis that a subgroup of patients coinherit additional genetic modifiers that exacerbate their clinical course has been investigated by several researchers. Here, we review all publications that describe the potential role of candidate genetic modifiers in patients and include a summary of studies in AS mouse models.


Subject(s)
Nephritis, Hereditary , Renal Insufficiency , Child, Preschool , Humans , Animals , Mice , Middle Aged , Adult , Hematuria/genetics , Nephritis, Hereditary/genetics , Collagen Type IV/genetics
4.
N Biotechnol ; 77: 12-19, 2023 Nov 25.
Article in English | MEDLINE | ID: mdl-37295722

ABSTRACT

Data quality has recently become a critical topic for the research community. European guidelines recommend that scientific data should be made FAIR: findable, accessible, interoperable and reusable. However, as FAIR guidelines do not specify how the stated principles should be implemented, it might not be straightforward for researchers to know how actually to make their data FAIR. This can prevent life-science researchers from sharing their datasets and pipelines, ultimately hindering the progress of research. To address this difficulty, we developed the BIBBOX, which is a platform that supports researchers publishing their datasets and the associated software in a FAIR manner.


Subject(s)
Mobile Applications
5.
Cells ; 12(3)2023 01 21.
Article in English | MEDLINE | ID: mdl-36766730

ABSTRACT

Genes with similar expression patterns in a set of diverse samples may be considered coexpressed. Human Gene Coexpression Analysis 2.0 (HGCA2.0) is a webtool which studies the global coexpression landscape of human genes. The website is based on the hierarchical clustering of 55,431 Homo sapiens genes based on a large-scale coexpression analysis of 3500 GTEx bulk RNA-Seq samples of healthy individuals, which were selected as the best representative samples of each tissue type. HGCA2.0 presents subclades of coexpressed genes to a gene of interest, and performs various built-in gene term enrichment analyses on the coexpressed genes, including gene ontologies, biological pathways, protein families, and diseases, while also being unique in revealing enriched transcription factors driving coexpression. HGCA2.0 has been successful in identifying not only genes with ubiquitous expression patterns, but also tissue-specific genes. Benchmarking showed that HGCA2.0 belongs to the top performing coexpression webtools, as shown by STRING analysis. HGCA2.0 creates working hypotheses for the discovery of gene partners or common biological processes that can be experimentally validated. It offers a simple and intuitive website design and user interface, as well as an API endpoint.


Subject(s)
Gene Expression Profiling , Gene Regulatory Networks , Genes , Humans , RNA-Seq , Transcription Factors , Genes/genetics , Genes/physiology
6.
Biology (Basel) ; 12(2)2023 Jan 21.
Article in English | MEDLINE | ID: mdl-36829450

ABSTRACT

Removal of the 5' cap structure of RNAs (termed decapping) is a pivotal event in the life of cytoplasmic mRNAs mainly catalyzed by a conserved holoenzyme, composed of the catalytic subunit DCP2 and its essential cofactor DCP1. While decapping was initially considered merely a step in the general 5'-3' mRNA decay, recent data suggest a great degree of selectivity that plays an active role in the post-transcriptional control of gene expression, and regulates multiple biological functions. Studies in Caenorhabditis elegans have shown that old age is accompanied by the accumulation of decapping factors in cytoplasmic RNA granules, and loss of decapping activity shortens the lifespan. However, the link between decapping and ageing remains elusive. Here, we present a comparative microarray study that aimed to uncover the differences in the transcriptome of mid-aged dcap-1/DCP1 mutant and wild-type nematodes. Our data indicate that DCAP-1 mediates the silencing of spermatogenic genes during late oogenesis, and suppresses the aberrant uprise of immunity gene expression during ageing. The latter is achieved by destabilizing the mRNA that encodes the transcription factor PQM-1 and impairing its nuclear translocation. Failure to exert decapping-mediated control on PQM-1 has a negative impact on the lifespan, but mitigates the toxic effects of polyglutamine expression that are involved in human disease.

7.
Biology (Basel) ; 11(7)2022 Jul 06.
Article in English | MEDLINE | ID: mdl-36101400

ABSTRACT

Gene coexpression analysis constitutes a widely used practice for gene partner identification and gene function prediction, consisting of many intricate procedures. The analysis begins with the collection of primary transcriptomic data and their preprocessing, continues with the calculation of the similarity between genes based on their expression values in the selected sample dataset and results in the construction and visualisation of a gene coexpression network (GCN) and its evaluation using biological term enrichment analysis. As gene coexpression analysis has been studied extensively, we present most parts of the methodology in a clear manner and the reasoning behind the selection of some of the techniques. In this review, we offer a comprehensive and comprehensible account of the steps required for performing a complete gene coexpression analysis in eukaryotic organisms. We comment on the use of RNA-Seq vs. microarrays, as well as the best practices for GCN construction. Furthermore, we recount the most popular webtools and standalone applications performing gene coexpression analysis, with details on their methods, features and outputs.

8.
STAR Protoc ; 3(1): 101208, 2022 03 18.
Article in English | MEDLINE | ID: mdl-35243384

ABSTRACT

Coexpressed genes tend to participate in related biological processes. Gene coexpression analysis allows the discovery of functional gene partners or the assignment of biological roles to genes of unknown function. In this protocol, we describe the steps necessary to create a gene coexpression tree for Arabidopsis thaliana, using publicly available Affymetrix CEL microarray data. Because the computational analysis described here is highly dependent on sample quality, we detail an automatic quality control approach. For complete details on the use and execution of this protocol, please refer to Zogopoulos et al. (2021).


Subject(s)
Arabidopsis Proteins , Arabidopsis , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Gene Expression Profiling/methods , Genetic Testing , Oligonucleotide Array Sequence Analysis/methods
9.
J Clin Med ; 10(24)2021 Dec 15.
Article in English | MEDLINE | ID: mdl-34945178

ABSTRACT

Long-term persistence and the heterogeneity of humoral response to SARS-CoV-2 have not yet been thoroughly investigated. The aim of this work is to study the production of circulating immunoglobulin class G (IgG) antibodies against SARS-CoV-2 in individuals with past infection in Cyprus. Individuals of the general population, with or without previous SARS-CoV-2 infection, were invited to visit the Biobank at the Center of Excellence in Biobanking and Biomedical Research of the University of Cyprus. Serum IgG antibodies were measured using the SARS-CoV-2 IgG and the SARS-CoV-2 IgG II Quant assays of Abbott Laboratories. Antibody responses to SARS-CoV-2 were also evaluated against participants' demographic and clinical data. All statistical analyses were conducted in Stata 16. The median levels of receptor binding domain (RBD)-specific IgG in 969 unvaccinated individuals, who were reportedly infected between November 2020 and September 2021, were 432.1 arbitrary units (AU)/mL (interquartile range, IQR: 182.4-1147.3). Higher antibody levels were observed in older participants, males, and those who reportedly developed symptoms or were hospitalized. The RBD-specific IgG levels peaked at three months post symptom onset and subsequently decreased up to month six, with a slower decay thereafter. IgG response to the RBD of SARS-CoV-2 is bi-phasic with considerable titer variability. Levels of IgG are significantly associated with several parameters, including age, gender, and severity of symptoms.

10.
Front Digit Health ; 3: 628646, 2021.
Article in English | MEDLINE | ID: mdl-34713101

ABSTRACT

Biobanks have long existed to support research activities, with BBMRI-ERIC formed as a European research infrastructure supporting the coordination of biobanking with 20 country members and one international organization. Although the benefits of biobanks to the research community are well-established, the direct benefit to citizens is limited to the generic benefit of promoting future research. Furthermore, the advent of General Data Protection Regulation (GDPR) legislation raised a series of challenges for scientific research, especially related to biobanking-associated activities and longitudinal research studies. Electronic health record (EHR) registries have long existed in healthcare providers. In some countries, even at the national level, these record the state of the health of citizens through time for the purposes of healthcare and data portability between different providers. The potential of EHRs in research is great and has been demonstrated in many projects that have transformed EHR data into retrospective medical history information on participating subjects directly from their physician's collected records; many key challenges, however, remain. In this paper, we present a citizen-centric framework called eHealthBioR, which would enable biobanks to link to EHR systems, thus enabling not just retrospective but also lifelong prospective longitudinal studies of participating citizens. It will also ensure strict adherence to legal and ethical requirements, enabling greater control that encourages participation. Citizens would benefit from the real and direct control of their data and samples, utilizing technology, to empower them to make informed decisions about providing consent and practicing their rights related to the use of their data, as well as by having access to knowledge and data generated from samples they provided to biobanks.
This is expected to motivate patient engagement in future research and even lead to participatory design methodologies with citizen/patient-centric designed studies. The development of platforms based on the eHealthBioR framework would need to overcome significant challenges. However, it would shift the burden of addressing these to experts in the field while providing solutions that enable, in the long term, lower monetary and time costs of longitudinal studies coupled with the option of lifelong monitoring through EHRs.

11.
iScience ; 24(8): 102848, 2021 Aug 20.
Article in English | MEDLINE | ID: mdl-34381973

ABSTRACT

Gene coexpression analysis refers to the discovery of sets of genes which exhibit similar expression patterns across multiple transcriptomic data sets, such as microarray experiment data of public repositories. Arabidopsis Coexpression Tool (ACT), a gene coexpression analysis web tool for Arabidopsis thaliana, identifies genes which are correlated to a driver gene. Primary microarray data from ATH1 Affymetrix platform were processed with Single-Channel Array Normalization algorithm and combined to produce a coexpression tree which contains ∼21,000 A. thaliana genes. ACT was developed to present subclades of coexpressed genes, as well as to perform gene set enrichment analysis, being unique in revealing enriched transcription factors targeting coexpressed genes. ACT offers a simple and user-friendly interface producing working hypotheses which can be experimentally verified for the discovery of gene partnership, pathway membership, and transcriptional regulation. ACT analyses have been successful in identifying not only genes with coordinated ubiquitous expressions but also genes with tissue-specific expressions.

12.
J Neuromuscul Dis ; 8(s2): S223-S239, 2021.
Article in English | MEDLINE | ID: mdl-34308911

ABSTRACT

BACKGROUND: Molecular interaction networks (MINs) aim to capture the complex relationships between interacting molecules within a biological system. MINs can be constructed from existing knowledge of molecular functional associations, such as protein-protein binding interactions (PPI) or gene co-expression, and these different sources may be combined into a single MIN. A given MIN may be more or less optimal in its representation of the important functional relationships of molecules in a tissue. OBJECTIVE: The aim of this study was to establish whether a combined MIN derived from different types of functional association could better capture muscle-relevant biology compared to its constituent single-source MINs. METHODS: MINs were constructed from functional association databases for both protein-binding and gene co-expression. The networks were then compared based on the capture of muscle-relevant genes and gene ontology (GO) terms, tested in two different ways using established biological network clustering algorithms. The top performing MINs were combined to test whether an optimal MIN for skeletal muscle could be constructed. RESULTS: The STRING PPI network was the best performing single-source MIN among those tested. Combining STRING with interactions from either the MyoMiner or CoXPRESSdb gene co-expression sources resulted in a combined network with improved performance relative to its constituent networks. CONCLUSION: MINs constructed from multiple types of functional association can better represent the functional relationships of molecules in a given tissue. Such networks may be used to improve the analysis and interpretation of functional genomics data in the study of skeletal muscle and neuromuscular diseases. Networks and clusters described by this study, including the combinations of STRING with MyoMiner or with CoXPRESSdb, are available for download from https://www.sys-myo.com/myominer/download.php.


Subject(s)
Muscle, Skeletal/physiology , Protein Interaction Maps , Algorithms , Databases, Factual , Gene Expression Profiling , Genomics , Humans
13.
BMC Med Genomics ; 13(1): 67, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393257

ABSTRACT

BACKGROUND: High-throughput transcriptomics measures mRNA levels for thousands of genes in a biological sample. Most gene expression studies aim to identify genes that are differentially expressed between different biological conditions, such as between healthy and diseased states. However, these data can also be used to identify genes that are co-expressed within a biological condition. Gene co-expression is used in a guilt-by-association approach to prioritize candidate genes that could be involved in disease, and to gain insights into the functions of genes, protein relations, and signaling pathways. Most existing gene co-expression databases are generic, amalgamating data for a given organism regardless of tissue-type. METHODS: To study muscle-specific gene co-expression in both normal and pathological states, publicly available gene expression data were acquired for 2376 mouse and 2228 human striated muscle samples, and separated into 142 categories based on species (human or mouse), tissue origin, age, gender, anatomic part, and experimental condition. Co-expression values were calculated for each category to create the MyoMiner database. RESULTS: Within each category, users can select a gene of interest, and the MyoMiner web interface will return all correlated genes. For each co-expressed gene pair, adjusted p-value and confidence intervals are provided as measures of expression correlation strength. A standardized expression-level scatterplot is available for every gene pair r-value. MyoMiner has two extra functions: (a) a network interface for creating a 2-shell correlation network, based either on the most highly correlated genes or from a list of genes provided by the user with the option to include linked genes from the database and (b) a comparison tool from which the users can test whether any two correlation coefficients from different conditions are significantly different. 
CONCLUSIONS: These co-expression analyses will help investigators to delineate the tissue-, cell-, and pathology-specific elements of muscle protein interactions, cell signaling and gene regulation. Changes in co-expression between pathologic and healthy tissue may suggest new disease mechanisms and help define novel therapeutic targets. Thus, MyoMiner is a powerful muscle-specific database for the discovery of genes that are associated with related functions based on their co-expression. MyoMiner is freely available at https://www.sys-myo.com/myominer.


Subject(s)
Computational Biology/methods , Gene Expression Regulation , Muscle Proteins/genetics , Muscles/metabolism , Muscular Diseases/genetics , Software , Transcriptome , Adolescent , Adult , Animals , Child , Child, Preschool , Female , Gene Regulatory Networks , Humans , Infant , Infant, Newborn , Male , Mice , Middle Aged , Muscles/cytology , Muscular Diseases/metabolism , Muscular Diseases/pathology , Young Adult
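Illustrative note on the MyoMiner comparison tool (entry 13): the abstract does not state which test is used to decide whether two correlation coefficients from different conditions differ. A common choice for independent samples is Fisher's z-transformation, sketched below as an assumption rather than as MyoMiner's actual method. The function name compare_correlations and its arguments are hypothetical.

```python
from math import atanh, erfc, sqrt


def compare_correlations(r1, n1, r2, n2):
    """Test whether two Pearson r-values from independent samples differ.

    Assumption: Fisher's z-transformation; the abstract does not name
    the test that MyoMiner applies.
    Returns (z statistic, two-sided p-value).
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    if not (-1 < r1 < 1 and -1 < r2 < 1):
        raise ValueError("r-values must lie strictly between -1 and 1")
    # atanh(r) is approximately normal with variance 1 / (n - 3).
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (atanh(r1) - atanh(r2)) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical example: a gene pair in 100 healthy vs. 100 diseased samples.
z, p = compare_correlations(0.9, 100, 0.1, 100)
print(f"z = {z:.2f}, p = {p:.3g}")
```

The test assumes the two groups of samples are independent, which matches comparing categories built from different samples.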
14.
Skelet Muscle ; 9(1): 10, 2019 05 03.
Article in English | MEDLINE | ID: mdl-31053169

ABSTRACT

BACKGROUND: The approach of building large collections of gene sets and then systematically testing hypotheses across these collections is a powerful tool in functional genomics, both in the pathway analysis of omics data and to uncover the polygenic effects associated with complex diseases in genome-wide association study. The Molecular Signatures Database includes collections of oncogenic and immunologic signatures enabling researchers to compare transcriptional datasets across hundreds of previous studies and leading to important insights in these fields, but such a resource does not currently exist for neuromuscular research. In previous work, we have shown the utility of gene set approaches to understand muscle cell physiology and pathology. METHODS: Following a systematic survey of public muscle data, we passed gene expression profiles from 4305 samples through a robust pre-processing and standardized data analysis pipeline. Two hundred eighty-two samples were discarded based on a battery of rigorous global quality controls. From among the remaining studies, 578 comparisons of interest were identified by a combination of text mining and manual curation of the study meta-data. For each comparison, significantly dysregulated genes (FDR adjusted p < 0.05) were identified. RESULTS: Lists of dysregulated genes were divided between upregulated and downregulated to give 1156 Muscle Gene Sets (MGS). This resource is available for download ( www.sys-myo.com/muscle_gene_sets ) and is accessible through three commonly used functional genomics platforms (GSEA, EnrichR, and WebGestalt). Basic guidance and recommendations are provided for the use of MGS through these platforms. In addition, consensus muscle gene sets were created to capture the overlap between the results of similar studies, and analysis of these highlighted the potential for novel disease-relevant findings. 
CONCLUSIONS: The MGS resource can be used to investigate the behaviour of any list of genes across previous comparisons of muscle conditions, to compare previous studies to one another, and to explore the functional relationship of muscle dysregulation to the Gene Ontology. Its major intended use is in enrichment testing for functional genomics analysis.


Subject(s)
Genomics/methods , Muscle, Skeletal/physiology , Software , Animals , Chromosome Mapping , Computational Biology , Databases, Genetic , Gene Expression Profiling , Gene Regulatory Networks , Genome-Wide Association Study/statistics & numerical data , Genomics/statistics & numerical data , Humans , Mice , Muscle, Skeletal/innervation , Muscle, Skeletal/pathology , Oligonucleotide Array Sequence Analysis , Transcriptome
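Illustrative note on the "FDR adjusted p < 0.05" threshold (entry 14): the abstract does not say which false discovery rate procedure was used. The sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract only says "FDR adjusted"; BH is one
    common procedure, not necessarily the one the authors used.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from one comparison.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
dysregulated = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, dysregulated)
```

Genes whose adjusted value falls below 0.05 would then be split into upregulated and downregulated lists by the sign of their expression change.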
15.
Hum Mol Genet ; 26(11): 1979-1991, 2017 06 01.
Article in English | MEDLINE | ID: mdl-28334824

ABSTRACT

Repair of skeletal muscle after sarcolemmal damage involves dysferlin and dysferlin-interacting proteins such as annexins. Mice and patients lacking dysferlin exhibit chronic muscle inflammation and adipogenic replacement of the myofibers. Here, we show that similar to dysferlin, lack of annexin A2 (AnxA2) also results in poor myofiber repair and progressive muscle weakening with age. By longitudinal analysis of AnxA2-deficient muscle we find that poor myofiber repair due to the lack of AnxA2 does not result in chronic inflammation or adipogenic replacement of the myofibers. Further, deletion of AnxA2 in dysferlin deficient mice reduced muscle inflammation, adipogenic replacement of myofibers, and improved muscle function. These results identify multiple roles of AnxA2 in muscle repair, which include facilitating myofiber repair, chronic muscle inflammation and adipogenic replacement of dysferlinopathic muscle. They also identify inhibition of AnxA2-mediated inflammation as a novel therapeutic avenue for treating muscle loss in dysferlinopathy.


Subject(s)
Annexin A2/metabolism , Annexin A2/physiology , Adipogenesis , Animals , Annexin A2/genetics , Dysferlin , Inflammation/metabolism , Membrane Proteins/metabolism , Membrane Proteins/physiology , Mice , Mice, Knockout , Muscle, Skeletal/metabolism , Muscular Dystrophies, Limb-Girdle/metabolism , Muscular Dystrophies, Limb-Girdle/therapy , Myofibrils/physiology , Sarcolemma/metabolism
16.
Nat Commun ; 7: 12846, 2016 Sep 26.
Article in English | MEDLINE | ID: mdl-27667448

ABSTRACT

Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.

17.
Nucleic Acids Res ; 43(W1): W571-5, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25883154

ABSTRACT

Given a query list of genes or proteins, CellWhere produces an interactive graphical display that mimics the structure of a cell, showing the local interaction network organized into subcellular locations. This user-friendly tool helps in the formulation of mechanistic hypotheses by enabling the experimental biologist to explore simultaneously two elements of functional context: (i) protein subcellular localization and (ii) protein-protein interactions or gene functional associations. Subcellular localization terms are obtained from public sources (the Gene Ontology and UniProt-together containing several thousand such terms) then mapped onto a smaller number of CellWhere localizations. These localizations include all major cell compartments, but the user may modify the mapping as desired. Protein-protein interaction listings, and their associated evidence strength scores, are obtained from the Mentha interactome server, or power-users may upload a pre-made network produced using some other interactomics tool. The Cytoscape.js JavaScript library is used in producing the graphical display. Importantly, for a protein that has been observed at multiple subcellular locations, users may prioritize the visual display of locations that are of special relevance to their research domain. CellWhere is at http://cellwhere-myology.rhcloud.com.


Subject(s)
Protein Interaction Mapping , Proteins/analysis , Software , Computer Graphics , Genes , Internet , Intracellular Space/chemistry
18.
J Neuromuscul Dis ; 2(3): 205-217, 2015 Sep 02.
Article in English | MEDLINE | ID: mdl-27858742

ABSTRACT

Aging is associated with both muscle weakness and a loss of muscle mass, contributing towards overall frailty in the elderly. Aging skeletal muscle is also characterised by a decreasing efficiency in repair and regeneration, together with a decline in the number of adult stem cells. Commensurate with this are general changes in whole body endocrine signalling, in local muscle secretory environment, as well as in intrinsic properties of the stem cells themselves. The present review discusses the various mechanisms that may be implicated in these age-associated changes, focusing on aspects of cell-cell communication and long-distance signalling factors, such as levels of circulating growth hormone, IL-6, IGF1, sex hormones, and inflammatory cytokines. Changes in the local environment are also discussed, implicating IL-6, IL-4, FGF-2, as well as other myokines, and processes that lead to thickening of the extra-cellular matrix. These factors, involved primarily in communication, can also modulate the intrinsic properties of muscle stem cells, including reduced DNA accessibility and repression of specific genes by methylation. Finally we discuss the decrease in the stem cell pool, particularly the failure of elderly myoblasts to re-quiesce after activation, and the consequences of all these changes on general muscle homeostasis.

19.
Insect Biochem Mol Biol ; 43(2): 189-96, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23262288

ABSTRACT

Chorion proteins of Lepidoptera have a tripartite structure, which consists of a central domain and two, more variable, flanking arms. The central domain is highly conserved and it is used for the classification of chorion proteins into two major classes, A and B. Annotated and unreviewed Lepidopteran chorion protein sequences are available in various databases. A database, named LepChorionDB, was constructed by searching 5 different protein databases using class A and B central domain-specific profile Hidden Markov Models (pHMMs), developed in this work. A total of 413 Lepidopteran chorion proteins from 9 moths and 1 butterfly species were retrieved. These data were enriched and organised in order to populate LepChorionDB, the first relational database, available on the web, containing Lepidopteran chorion proteins grouped in A and B classes. LepChorionDB may provide insights in future functional and evolutionary studies of Lepidopteran chorion proteins and thus, it will be a useful tool for the Lepidopteran scientific community and Lepidopteran genome annotators, since it also provides access to the two pHMMs developed in this work, which may be used to discriminate A and B class chorion proteins. LepChorionDB is freely available at http://bioinformatics.biol.uoa.gr/LepChorionDB.


Subject(s)
Computational Biology/methods , Databases, Protein , Egg Proteins/genetics , Insect Proteins/genetics , Lepidoptera/genetics , Proteome/genetics , Amino Acid Sequence , Animals , Computational Biology/instrumentation , Egg Proteins/chemistry , Egg Proteins/metabolism , Insect Proteins/chemistry , Insect Proteins/metabolism , Internet , Lepidoptera/chemistry , Lepidoptera/metabolism , Molecular Sequence Data , Proteome/chemistry , Proteome/metabolism , Sequence Alignment
20.
BMC Res Notes ; 5: 265, 2012 Jun 06.
Article in English | MEDLINE | ID: mdl-22672625

ABSTRACT

BACKGROUND: Bioinformatics and high-throughput technologies such as microarray studies allow the measurement of the expression levels of large numbers of genes simultaneously, thus helping us to understand the molecular mechanisms of various biological processes in a cell. FINDINGS: We calculate the Pearson Correlation Coefficient (r-value) between probe set signal values from Affymetrix Human Genome Microarray samples and cluster the human genes according to the r-value correlation matrix using the Neighbour Joining (NJ) clustering method. A hyper-geometric distribution is applied on the text annotations of the probe sets to quantify the term overrepresentations. The aim of the tool is the identification of closely correlated genes for a given gene of interest and/or the prediction of its biological function, which is based on the annotations of the respective gene cluster. CONCLUSION: Human Gene Correlation Analysis (HGCA) is a tool to classify human genes according to their coexpression levels and to identify overrepresented annotation terms in correlated gene groups. It is available at: http://biobank-informatics.bioacademy.gr/coexpression/.


Subject(s)
Computational Biology , Gene Expression Profiling/methods , Gene Expression Regulation , High-Throughput Screening Assays/methods , Oligonucleotide Array Sequence Analysis , Transcription, Genetic , Cluster Analysis , DEAD-box RNA Helicases/genetics , Databases, Genetic , Gene Regulatory Networks , HLA Antigens/genetics , Humans , Intramolecular Oxidoreductases/genetics , Lipocalins/genetics , Metallothionein/genetics , Models, Genetic , Models, Statistical , Molecular Sequence Annotation , Promoter Regions, Genetic , Ribosomal Proteins/genetics
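Illustrative note on the two statistics named in the HGCA abstract (entry 20): the Pearson r-value between two probe sets and a hypergeometric test for annotation-term overrepresentation. This is a minimal standard-library sketch, not the HGCA implementation; the function names, argument names, and example numbers are hypothetical, and the Neighbour Joining clustering step is not shown.

```python
from math import comb, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two signal vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length vectors with >= 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return sxy / sqrt(sxx * syy)


def hypergeom_enrichment_p(population, annotated, cluster, hits):
    """P(X >= hits): chance of seeing at least `hits` annotated genes
    in a cluster of size `cluster` drawn without replacement from
    `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, cluster)
    total = comb(population, cluster)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, cluster - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical signals for two probe sets across five samples.
print(pearson_r([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 8.0, 9.8]))
# Hypothetical term: 40 of 20000 genes annotated, 5 of them in a 50-gene cluster.
print(hypergeom_enrichment_p(20000, 40, 50, 5))
```

A small p-value from the second function indicates that the term appears in the cluster more often than random sampling would explain.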