A knowledge-based clustering algorithm driven by Gene Ontology.

Cheng, Jill; Cline, Melissa; Martin, John; Finkelstein, David; Awad, Tarif; Kulp, David; Siani-Rose, Michael A.

J Biopharm Stat ; 14(3): 687-700, 2004 Aug.

Article in English | MEDLINE | ID: mdl-15468759

ABSTRACT

We have developed an algorithm for inferring the degree of similarity between genes by using the graph-based structure of Gene Ontology (GO). We applied this knowledge-based similarity metric to a clique-finding algorithm for detecting sets of related genes with biological classifications. We also combined it with an expression-based distance metric to produce a co-cluster analysis, which accentuates genes with both similar expression profiles and similar biological characteristics and identifies gene clusters that are more stable and biologically meaningful. These algorithms are demonstrated in the analysis of MPRO cell differentiation time series experiments.
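The abstract does not state the exact similarity formula, so the following is only a minimal sketch of the general idea, under stated assumptions. It scores two genes by the Jaccard overlap of their GO ancestor sets (an illustrative stand-in, not the authors' metric) and blends the result with a correlation-based expression distance. The toy GO fragment, the function names, and the `weight` parameter are all hypothetical.

```python
from math import sqrt

# Hypothetical toy GO fragment: each term maps to its parent terms (a DAG).
GO_PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "kinase_activity": ["metabolism", "signaling"],
    "lipid_metabolism": ["metabolism"],
}


def ancestors(term):
    """Return the term plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GO_PARENTS[current])
    return seen


def go_similarity(terms_a, terms_b):
    """Jaccard overlap of the ancestor closures of two annotation lists.

    Assumption: this stands in for the paper's graph-based similarity,
    whose exact definition is not given in the abstract.
    """
    closure_a = set().union(*(ancestors(t) for t in terms_a))
    closure_b = set().union(*(ancestors(t) for t in terms_b))
    union = closure_a | closure_b
    if not union:
        return 0.0
    return len(closure_a & closure_b) / len(union)


def expression_distance(profile_a, profile_b):
    """Map Pearson correlation r to a distance in [0, 1] via (1 - r) / 2."""
    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    norm_a = sqrt(sum((a - mean_a) ** 2 for a in profile_a))
    norm_b = sqrt(sum((b - mean_b) ** 2 for b in profile_b))
    if norm_a == 0 or norm_b == 0:
        return 0.5  # flat profile: correlation undefined, treat as neutral
    return (1 - cov / (norm_a * norm_b)) / 2


def combined_distance(terms_a, profile_a, terms_b, profile_b, weight=0.5):
    """Weighted blend of knowledge-based and expression-based distances."""
    go_distance = 1 - go_similarity(terms_a, terms_b)
    return weight * go_distance + (1 - weight) * expression_distance(
        profile_a, profile_b
    )
```

A pairwise matrix of `combined_distance` values could then be passed to any standard clustering routine. Raising `weight` favors shared annotation over shared expression pattern.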

Subject(s)

Algorithms , Artificial Intelligence , Cluster Analysis , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Cell Differentiation/drug effects , Cell Differentiation/physiology , Humans , Neutrophils/drug effects , Tretinoin/pharmacology

NetAffx Gene Ontology Mining Tool: a visual approach for microarray data analysis.

Cheng, Jill; Sun, Shaw; Tracy, Adam; Hubbell, Earl; Morris, Joseph; Valmeekam, Venu; Kimbrough, Andrew; Cline, Melissa S; Liu, Guoying; Shigeta, Ron; Kulp, David; Siani-Rose, Michael A.

Bioinformatics ; 20(9): 1462-3, 2004 Jun 12.

Article in English | MEDLINE | ID: mdl-14962933

ABSTRACT

SUMMARY: The NetAffx Gene Ontology (GO) Mining Tool is a web-based, interactive tool that permits traversal of the GO graph in the context of microarray data. It accepts a list of Affymetrix probe sets and renders a GO graph as a heat map colored according to significance measurements. The rendered graph is interactive, with nodes linked to public web sites and to lists of the relevant probe sets. The GO Mining Tool provides visualization combining biological annotation with expression data, encompassing thousands of genes in one interactive view. AVAILABILITY: GO Mining Tool is freely available at http://www.affymetrix.com/analysis/query/go_analysis.affx

Subject(s)

Algorithms , Documentation/methods , Information Storage and Retrieval/methods , Natural Language Processing , Oligonucleotide Array Sequence Analysis , Software , User-Computer Interface , Abstracting and Indexing/methods , Computer Graphics , Database Management Systems , Gene Expression Profiling/methods

Gene structure-based splice variant deconvolution using a microarray platform.

Wang, Hui; Hubbell, Earl; Hu, Jing-shan; Mei, Gangwu; Cline, Melissa; Lu, Gang; Clark, Tyson; Siani-Rose, Michael A; Ares, Manuel; Kulp, David C; Haussler, David.

Bioinformatics ; 19 Suppl 1: i315-22, 2003.

Article in English | MEDLINE | ID: mdl-12855476

ABSTRACT

MOTIVATION: Alternative splicing allows a single gene to generate multiple mRNAs, which can be translated into functionally and structurally diverse proteins. One gene can have multiple variants coexisting at different concentrations. Estimating the relative abundance of each variant is important for the study of underlying biological function. Microarrays are standard tools for measuring gene expression, but most designs and analyses have not accounted for splice variants. Thus, splice variant-specific chip designs and analysis algorithms are needed for accurate gene expression profiling. RESULTS: Inspired by Li and Wong (2001), we developed a gene structure-based algorithm to determine the relative abundance of known splice variants. Probe intensities are modeled across multiple experiments using gene structures as constraints. Model parameters are obtained through a maximum likelihood estimation (MLE) framework. The algorithm produces the relative concentration of each variant, as well as an affinity term associated with each probe. The algorithm is validated with a set of controlled spike experiments and with endogenous tissue samples, using a human splice variant array.
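The abstract gives no equations, so the following is only one plausible way to write such a model, under stated assumptions. The symbols (a_i, c_kj, g_ik), the Gaussian noise term, and the normalization are introduced here for illustration and are not taken from the paper. Let y_ij be the intensity of probe i in experiment j, a_i the probe affinity, c_kj the concentration of variant k in experiment j, and g_ik = 1 if probe i targets a region present in variant k according to the gene structure (0 otherwise).

```latex
% Assumed multiplicative model, in the spirit of Li and Wong (2001)
y_{ij} = a_i \sum_{k} g_{ik}\, c_{kj} + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Under Gaussian noise, maximum likelihood reduces to least squares
(\hat{a}, \hat{c}) = \arg\min_{a,\; c \ge 0}
  \sum_{i,j} \Bigl( y_{ij} - a_i \sum_{k} g_{ik}\, c_{kj} \Bigr)^{2}
\quad \text{subject to} \quad \sum_{i=1}^{I} a_i^{2} = I

% With c held fixed and m_{ij} = \sum_k g_{ik} c_{kj}, each affinity has a
% closed-form update (rescale afterwards to restore the normalization)
\hat{a}_i = \frac{\sum_{j} y_{ij}\, m_{ij}}{\sum_{j} m_{ij}^{2}}
```

In this sketch the fixed indicators g_ik are what encode the gene structure as a constraint. The normalization on a removes the scale ambiguity between affinities and concentrations, and alternating between the affinity update and a least-squares fit of c is one common way to solve a bilinear problem of this kind.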

Subject(s)

Algorithms , Alternative Splicing/genetics , Drosophila Proteins , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , DNA Probes/genetics , Equipment Design , Equipment Failure Analysis , Genetic Variation , Humans , Models, Genetic , Models, Statistical , Oligonucleotide Array Sequence Analysis/instrumentation , Tropomyosin/genetics

GPCR-GRAPA-LIB--a refined library of hidden Markov Models for annotating GPCRs.

Shigeta, Ron; Cline, Melissa; Liu, Guoying; Siani-Rose, Michael A.

Bioinformatics ; 19(5): 667-8, 2003 Mar 22.

Article in English | MEDLINE | ID: mdl-12651732

ABSTRACT

GPCR-GRAPA-LIB is a library of HMMs describing G protein-coupled receptor families. These families are initially defined by class of receptor ligand, with divergent families divided into subfamilies using phylogenetic analysis and knowledge of GPCR function. Protein sequences are applied to the models with the GRAPA curve-based selection criteria. RefSeq sequences for Homo sapiens, Drosophila melanogaster, and Caenorhabditis elegans have been annotated using this approach.

Subject(s)

Databases, Protein , GTP-Binding Proteins/chemistry , GTP-Binding Proteins/genetics , Models, Genetic , Models, Statistical , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Documentation , Evolution, Molecular , GTP-Binding Proteins/classification , Molecular Sequence Data , Structure-Activity Relationship

NetAffx: Affymetrix probesets and annotations.

Liu, Guoying; Loraine, Ann E; Shigeta, Ron; Cline, Melissa; Cheng, Jill; Valmeekam, Venu; Sun, Shaw; Kulp, David; Siani-Rose, Michael A.

Nucleic Acids Res ; 31(1): 82-6, 2003 Jan 01.

Article in English | MEDLINE | ID: mdl-12519953

ABSTRACT

NetAffx (http://www.affymetrix.com) details and annotates probesets on Affymetrix GeneChip microarrays. These annotations include (i) static information specific to the probeset composition; (ii) sequence annotations extracted from public databases; and (iii) protein sequence-level annotations derived from public domain programs, as well as libraries of hidden Markov models (HMMs) developed at Affymetrix. For each probeset, NetAffx lists the probe sequences, and the consensus sequence interrogated by the probes; for the larger chip sets, interactive maps display this sequence data in genomic context. Sequence annotations include Gene Ontology (GO) terms and depiction of GO graph relationships; predicted protein domains and motifs; orthologous sequences; links to relevant pathways; and links to public databases including UniGene, LocusLink, SWISS-PROT and OMIM.

Subject(s)

Databases, Genetic , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Animals , Consensus Sequence , Information Storage and Retrieval , Markov Chains , Proteins/chemistry , Sequence Analysis, Protein , Software

Exploring alternative transcript structure in the human genome using blocks and InterPro.

Loraine, Ann E; Helt, Gregg A; Cline, Melissa S; Siani-Rose, Michael A.

J Bioinform Comput Biol ; 1(2): 289-306, 2003 Jul.

Article in English | MEDLINE | ID: mdl-15290774

ABSTRACT

Understanding how alternative splicing affects gene function is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein function. To test this, high-quality exon-intron structures were deduced for over 8000 human genes, including over 1300 (17 percent) that produce multiple transcript variants. A data mining technique (DiffMotif) was developed to identify genes in which transcript variation coincides with changes in conserved motifs between variants. Applying this method, we found that 30 percent of the multi-variant genes in our test set exhibited a differential profile of conserved InterPro and/or BLOCKS motifs across different mRNA variants. To investigate these, a visualization tool (ProtAnnot) that displays amino acid motifs in the context of genomic sequence was developed. Using this tool, genes revealed by the DiffMotif method were analyzed, and when possible, hypotheses regarding the potential role of alternative transcript structure in modulating gene function were developed. Examples of these, including: MEOX1, a homeobox-containing protein; AIRE, involved in auto-immune disease; PLAT, tissue type plasminogen activator; and CD79b, a component of the B-cell receptor complex, are presented. These results demonstrate that amino acid motif databases like BLOCKS and InterPro are useful tools for investigating how alternative transcript structure affects gene function.
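The core comparison described here, flagging genes whose transcript variants carry different sets of conserved motifs, can be sketched as below. This is a minimal illustration and not the DiffMotif implementation. The input layout (gene, then variant, then a set of motif identifiers), the function names, and the placeholder identifiers are all assumptions.

```python
def differential_motifs(variant_motifs):
    """Return motifs found in at least one, but not every, variant of a gene.

    variant_motifs maps a variant name to an iterable of motif identifiers.
    """
    motif_sets = [set(motifs) for motifs in variant_motifs.values()]
    if len(motif_sets) < 2:
        return set()  # a single-variant gene cannot show a differential profile
    shared = set.intersection(*motif_sets)
    present = set.union(*motif_sets)
    return present - shared


def find_differential_genes(genes):
    """Map each gene with a differential motif profile to its varying motifs."""
    flagged = {}
    for gene, variant_motifs in genes.items():
        varying = differential_motifs(variant_motifs)
        if varying:
            flagged[gene] = varying
    return flagged


# Hypothetical toy input; gene and motif names are placeholders, not real IDs.
example_genes = {
    "GENE_A": {"variant_1": {"motif_1", "motif_2"}, "variant_2": {"motif_1"}},
    "GENE_B": {"variant_1": {"motif_3"}, "variant_2": {"motif_3"}},
    "GENE_C": {"variant_1": {"motif_4"}},
}

flagged_genes = find_differential_genes(example_genes)
```

Only `GENE_A` is flagged in this toy input, because `motif_2` appears in one of its variants and not the other. In practice the motif sets would come from running InterPro or BLOCKS searches on each variant's translated sequence.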

Subject(s)

Alternative Splicing/genetics , Chromosome Mapping/methods , Databases, Protein , Genome, Human , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Transcription Factors/genetics , Algorithms , Amino Acid Motifs/genetics , Conserved Sequence , Gene Expression Regulation/genetics , Genetic Variation , Humans , Proteins/chemistry , Proteins/genetics , Structure-Activity Relationship

Structure-based comparison of four eukaryotic genomes.

Cline, Melissa; Liu, Guoying; Loraine, Ann E; Shigeta, Ronald; Cheng, Jill; Mei, Gangwu; Kulp, David; Siani-Rose, Michael A.

Pac Symp Biocomput ; : 127-38, 2002.

Article in English | MEDLINE | ID: mdl-11928469

ABSTRACT

The field of comparative genomics allows us to elucidate the molecular mechanisms necessary for the machinery of an organism by contrasting its genome against those of other organisms. In this paper, we contrast the genome of Homo sapiens against those of C. elegans, Drosophila melanogaster, and S. cerevisiae to gain insight into which structural domains are present in each organism. Previous work has assessed this using sequence-based homology recognition systems such as Pfam [1] and InterPro [2]. Here, we pursue a structure-based assessment, analyzing genomes according to domains in the SCOP structural domain dictionary. Compared to other eukaryotic genomes, we observe additional domains in the human genome relating to signal transduction, immune response, transport, and certain enzymes. Compared to the metazoan genomes, the yeast genome shows an absence of domains relating to immune response, cell-cell interactions, and cell signaling.

Subject(s)

Genome , Genomics/methods , Sequence Analysis, DNA/methods , Animals , Caenorhabditis elegans/genetics , Computer Simulation , Drosophila melanogaster/genetics , Enzymes/genetics , Humans , Models, Genetic , Saccharomyces cerevisiae/genetics , Zinc Fingers/genetics

Protein-based analysis of alternative splicing in the human genome.

Loraine, Ann E; Helt, Gregg A; Cline, Melissa S; Siani-Rose, Michael A.

Proc IEEE Comput Soc Bioinform Conf ; 1: 118-24, 2002.

Article in English | MEDLINE | ID: mdl-15838129

ABSTRACT

Understanding the functional significance of alternative splicing and other mechanisms that generate RNA transcript diversity is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein structure and function. To test this, a data mining technique ("DiffHit") was developed to identify and catalog genes producing protein isoforms which exhibit distinct profiles of conserved protein motifs. We found that out of a test set of over 1,300 alternatively spliced genes with solved genomic structure, over 30% exhibited a differential profile of conserved InterPro and/or Blocks protein motifs across distinct isoforms. These results suggest that motif databases such as Blocks and InterPro are potentially useful tools for investigating how alternative transcript structure affects gene function.

Subject(s)

Alternative Splicing/genetics , Databases, Protein , Genome, Human , Information Storage and Retrieval/methods , Proteome/genetics , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Database Management Systems , Evolution, Molecular , Gene Expression Profiling/methods , Humans , Protein Isoforms/chemistry , Protein Isoforms/genetics , Proteome/chemistry , Sequence Homology, Amino Acid , Transcription, Genetic/genetics
