
Prediction of posttranslational modifications using intact-protein mass spectrometric data.

Holmes, Mark R; Giddings, Michael C.

Anal Chem ; 76(2): 276-82, 2004 Jan 15.

Article in English | MEDLINE | ID: mdl-14719871

ABSTRACT

We present a Web-based application that uses whole-protein masses determined by mass spectrometry to identify putative co- and posttranslational proteolytic cleavages and chemical modifications. The protein cleavage and modification engine (PROCLAME) requires as input an intact mass measurement and a precursor identification based on peptide mass fingerprinting or tandem mass spectrometry. This approach predicts mass-modifying events using a depth-first tree search, bounded by a set of rules controlled by a custom-built fuzzy logic engine, to explore a large number of possible combinations of modifications accounting for the experimental mass. Candidates are saved during a search if they are within a user-specified instrument mass accuracy; the total number of possible candidates searched is based on a specified fuzzy cutoff score. Candidates are scored and ranked using a simple probabilistic model. There is generally not enough information in an intact mass measurement to determine a single unique protein characterization; however, the program provides utility by expediting the identification of sets of putative events consistent with the mass data and ranking them for further investigation. This approach uses a simple, intuitive rule base and lends itself to discovery of unannotated posttranslational events. We have assessed the program with both in silico-generated test data and with published data from an analysis of large ribosomal subunit proteins, both from the yeast S. cerevisiae. Results indicate a high degree of sensitivity and specificity in characterizing proteins whose masses resulted from reasonable proteolysis and covalent modification scenarios. The application is available on the web at http://proclame.unc.edu.
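The bounded depth-first search described in this abstract can be illustrated with a minimal sketch. It is not PROCLAME's implementation. The fuzzy-logic rule base is replaced here by a plain cap on the number of modifications (`max_mods`), proteolytic cleavage and candidate ranking are left out, and the function name, parameters, and the three-entry modification table are all hypothetical. Only the mass shifts themselves are standard monoisotopic values.

```python
from typing import Dict, List, Tuple

# Monoisotopic mass shifts in Da. The values are standard; the choice of
# which modifications to include is an illustrative assumption.
MOD_MASSES: Dict[str, float] = {
    "acetyl": 42.010565,
    "methyl": 14.015650,
    "phospho": 79.966331,
}


def find_candidates(
    base_mass: float,
    observed_mass: float,
    tolerance: float,
    max_mods: int = 4,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Depth-first search over multisets of modifications.

    A candidate is saved when its total mass lies within `tolerance` of
    `observed_mass`. `max_mods` is a simple stand-in for the rule-based
    bound described in the abstract (an assumption of this sketch).
    """
    names = sorted(MOD_MASSES)
    results: List[Tuple[Tuple[str, ...], float]] = []

    def dfs(start: int, chosen: List[str], mass: float) -> None:
        if abs(mass - observed_mass) <= tolerance:
            results.append((tuple(chosen), mass))
        if len(chosen) == max_mods:
            return
        # Iterating from `start` avoids revisiting the same combination
        # in a different order.
        for i in range(start, len(names)):
            new_mass = mass + MOD_MASSES[names[i]]
            # Every shift in the table is positive, so a branch that
            # overshoots the observed mass cannot recover.
            if new_mass > observed_mass + tolerance:
                continue
            chosen.append(names[i])
            dfs(i, chosen, new_mass)
            chosen.pop()

    dfs(0, [], base_mass)
    return results
```

For example, `find_candidates(10000.0, 10121.976896, 0.01)` returns the single combination of one acetylation plus one phosphorylation. A wider tolerance returns more candidates, which is consistent with the abstract's point that an intact mass alone rarely yields a unique characterization.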

Subject(s)

Protein Processing, Post-Translational , Proteins/chemistry , Software Validation , Software , Computational Biology/methods , Fuzzy Logic , Internet , Mass Spectrometry/methods , Probability

Genome-based peptide fingerprint scanning.

Giddings, Michael C; Shah, Atul A; Gesteland, Ray; Moore, Barry.

Proc Natl Acad Sci U S A ; 100(1): 20-5, 2003 Jan 07.

Article in English | MEDLINE | ID: mdl-12518051

ABSTRACT

We have implemented a method that identifies the genomic origins of sample proteins by scanning their peptide-mass fingerprint against the theoretical translation and proteolytic digest of an entire genome. Unlike previously reported techniques, this method requires no predefined ORF or protein annotations. Fixed-size windows along the genome sequence are scored by an equation accounting for the number of matching peptides, the number of missed enzymatic cleavages in each peptide, the number of in-frame stop codons within a window, the adjacency between peptides, and duplicate peptide matches. Statistical significance of matching regions is assessed by comparing their scores to scores from windows matching randomly generated mass data. Tests with samples from Saccharomyces cerevisiae mitochondria and Escherichia coli have demonstrated the ability to produce statistically significant identifications, agreeing with two commonly used programs, PeptIdent and Mascot, in 86% of samples analyzed. This genome fingerprint scanning method has the potential to aid in genome annotation, identify proteins for which annotation is incorrect or missing, and handle cases where sequencing errors have caused framing mistakes in the databases. It might also aid in the identification of proteins in which recoding events such as frameshifting or stop-codon read-through have occurred, elucidating alternative translation mechanisms. The prototype is implemented as a client-server pair, allowing the distribution, among a set of cluster nodes, of a single or multiple genomes for concurrent analysis.
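The abstract does not give the scoring equation or the exact significance test, so the following is only one plausible reading of "comparing their scores to scores from windows matching randomly generated mass data": an empirical p-value computed against a set of null scores. The function name and the add-one correction are assumptions of this sketch, not details taken from the paper.

```python
from typing import Sequence


def empirical_p_value(observed_score: float, null_scores: Sequence[float]) -> float:
    """Fraction of null scores at least as large as the observed score.

    `null_scores` stands for window scores obtained from randomly
    generated mass data (hypothetical input). The add-one correction
    keeps the estimate from being exactly zero with a finite null sample.
    """
    exceed = sum(1 for s in null_scores if s >= observed_score)
    return (exceed + 1) / (len(null_scores) + 1)
```

With four null scores `[1, 2, 3, 4]`, an observed window score of 5 gives a p-value of 0.2, the smallest value that four random samples can support.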

Subject(s)

Fungal Proteins/genetics , Genome, Fungal , Peptide Mapping , Saccharomyces cerevisiae/genetics , Amino Acid Sequence , Codon, Terminator , DNA, Mitochondrial/chemistry , DNA, Mitochondrial/genetics , Open Reading Frames , Spectrometry, Mass, Electrospray Ionization

Artificial neural network prediction of antisense oligodeoxynucleotide activity.

Giddings, Michael C; Shah, Atul A; Freier, Sue; Atkins, John F; Gesteland, Raymond F; Matveeva, Olga V.

Nucleic Acids Res ; 30(19): 4295-304, 2002 Oct 01.

Article in English | MEDLINE | ID: mdl-12364609

ABSTRACT

An mRNA transcript contains many potential antisense oligodeoxynucleotide target sites. Identification of the most efficacious targets remains an important and challenging problem. Building on separate work that revealed a strong correlation between the inclusion of short sequence motifs and the activity level of an oligo, we have developed a predictive artificial neural network system for mapping tetranucleotide motif content to antisense oligo activity. Trained for high-specificity prediction, the system has been cross-validated against a database of 348 oligos from the literature and a larger proprietary database of 908 oligos. In cross-validation tests the system identified effective oligos (i.e. oligos capable of reducing target mRNA expression to <25% that of the control) with 53% accuracy, in contrast to the <10% success rates commonly reported for trial-and-error oligo selection, suggesting a possible 5-fold reduction in the in vivo screening required to find an active oligo. We have implemented a web interface to a trained neural network. Given an RNA transcript as input, the system identifies the most likely oligo targets and provides estimates of the probabilities that oligos targeted against these sites will be effective.
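"Tetranucleotide motif content" can be encoded in several ways, and the abstract does not say which encoding the network uses. The sketch below assumes the simplest one: a 256-element vector of overlapping tetramer counts over the alphabet A, C, G, T, with U folded into T. The function and constant names are hypothetical, and the neural network itself is not shown.

```python
from itertools import product
from typing import Dict, List

ALPHABET = "ACGT"
# All 4**4 = 256 tetranucleotides in a fixed lexicographic order.
TETRAMERS: List[str] = ["".join(p) for p in product(ALPHABET, repeat=4)]
INDEX: Dict[str, int] = {t: i for i, t in enumerate(TETRAMERS)}


def tetramer_counts(oligo: str) -> List[int]:
    """Count overlapping tetranucleotides in an oligo sequence.

    RNA input is accepted by mapping U to T. Windows containing any
    other character (for example N) are skipped.
    """
    seq = oligo.upper().replace("U", "T")
    counts = [0] * len(TETRAMERS)
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
    return counts
```

A vector of this kind could serve as the input layer of a classifier. For the 8-mer `ACGTACGT` it contains five counts in total, two of them for `ACGT`.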

Subject(s)

Neural Networks, Computer , Oligodeoxyribonucleotides, Antisense/genetics , Binding Sites/genetics , Linear Models , Oligodeoxyribonucleotides, Antisense/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism

Computational identification of putative programmed translational frameshift sites.

Shah, Atul A; Giddings, Michael C; Parvaz, Jasmin B; Gesteland, Raymond F; Atkins, John F; Ivanov, Ivaylo P.

Bioinformatics ; 18(8): 1046-53, 2002 Aug.

Article in English | MEDLINE | ID: mdl-12176827

ABSTRACT

MOTIVATION: In an effort to identify potential programmed frameshift sites by statistical analysis, we explore the hypothesis that selective pressure would have rendered such sites underabundant and underrepresented in protein-coding sequences. We developed a computer program to compare the frequencies of k-length subsequences of nucleotides with the frequencies predicted by a zero order Markov chain determined by the codon bias of the same set of sequences. The program was used to calculate and evaluate the distribution of 7-base oligonucleotides in the 6000+ putative protein-coding sequences of S. cerevisiae preliminary to the laboratory testing of the most highly underrepresented oligos for frameshifting efficiency.

RESULTS: Among the most significant results is the finding that the heptanucleotides CUU-AGG-C and CUU-AGU-U, sites of the programmed +1 translational frameshifts required for the production in yeast of actin filament-binding protein ABP140 and telomerase subunit EST3, respectively, rank among the least represented of phase I heptanucleotides in the coding sequences of S. cerevisiae. Laboratory experiments demonstrated that other underrepresented heptanucleotides identified by the program, for example GGU-CAG-A, are also prone to significant translational frameshifting, suggesting the possibility that genes containing other underrepresented heptamers may also encode transframe products.

AVAILABILITY: The program is available for download from http://www.gesteland.genetics.utah.edu/freqAnalysis

SUPPLEMENTARY INFORMATION: Complete results from the analysis of S. cerevisiae are available on http://www.gesteland.genetics.utah.edu/freqAnalysis
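The frequency comparison described under MOTIVATION can be sketched for codon-aligned heptamers, that is, two full codons followed by the first base of a third. This is an illustration under stated assumptions, not the authors' program. It treats codons as independent draws from the observed codon usage (a zero-order model), considers only the in-frame phase, and reports a plain observed/expected ratio in place of whatever statistic the program actually uses. All names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def split_codons(seq: str) -> List[str]:
    """Split a coding sequence into complete codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def heptamer_ratios(cds_list: Iterable[str]) -> Dict[str, float]:
    """Observed/expected ratio for each in-frame heptamer that occurs.

    The expected frequency assumes independent codons:
    P(c1) * P(c2) * P(next codon starts with base b).
    Heptamers that never occur are not listed; their ratio would be 0.
    """
    codon_counts: Counter = Counter()
    hept_counts: Counter = Counter()
    for cds in cds_list:
        cs = split_codons(cds.upper())
        codon_counts.update(cs)
        for i in range(len(cs) - 2):
            hept_counts[cs[i] + cs[i + 1] + cs[i + 2][0]] += 1

    total_codons = sum(codon_counts.values())
    total_hept = sum(hept_counts.values())
    freq = {c: n / total_codons for c, n in codon_counts.items()}

    # Probability that a codon begins with a given base.
    first_base: Counter = Counter()
    for codon, f in freq.items():
        first_base[codon[0]] += f

    ratios: Dict[str, float] = {}
    for hept, n in hept_counts.items():
        expected = freq[hept[:3]] * freq[hept[3:6]] * first_base[hept[6]]
        ratios[hept] = (n / total_hept) / expected
    return ratios
```

Sorting the result by ratio in ascending order puts the most underrepresented heptamers first, which corresponds to the ranking step the abstract describes before laboratory testing.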

Subject(s)

Algorithms , Frameshift Mutation/genetics , Models, Statistical , Protein Biosynthesis/genetics , Sequence Analysis, DNA/methods , Software , Base Composition , Base Sequence , DNA/chemistry , DNA/genetics , Databases, Genetic , Gene Expression Regulation , Models, Genetic , Molecular Sequence Data , Reproducibility of Results , Saccharomyces/genetics , Sensitivity and Specificity , Sequence Analysis, DNA/statistics & numerical data
