Search | VHL Regional Portal

1.

Issues in incorporating semantic integrity in molecular biological object-oriented databases.

Schweigert, S; Herde, P V; Sibbald, P R.

Comput Appl Biosci ; 11(4): 339-47, 1995 Aug.

Article in English | MEDLINE | ID: mdl-8521043

ABSTRACT

Issues critical to ensuring semantic integrity in molecular biological data collections have been identified and include complexity, exceptions, missing data, changing models, holism and integration, delocalized data, interoperability and nomenclature. This combination is peculiar to biology and presents some interesting problems as a result. Little is known about semantic checking in object-oriented databases in general, but because such technology appears highly suitable for modeling biological data, it is appropriate to examine the ways in which object-oriented technology can support this functionality. It is concluded that object-oriented technology will support semantic checking even in a complex domain like biology. We propose 10 guidelines for future work including ways of treating exceptional cases and 'positioning' of constraints in a schema.

Subject(s)

Databases, Factual , Molecular Biology , Semantics , Biotechnology , Models, Biological , Terminology as Topic

2.

Deducing protein structures using logic programming: exploiting minimum data of diverse types.

Sibbald, P R.

J Theor Biol ; 173(4): 361-75, 1995 Apr 21.

Article in English | MEDLINE | ID: mdl-7783449

ABSTRACT

The extent to which a protein can be modeled from constraint data depends on the amount and quality of the data. This report quantifies a relationship between the amount of data and the achievable model resolution. In an information-theoretic framework the number of bits of information per residue needed to constrain a solution was calculated. The number of bits provided by different kinds of constraints was estimated from a tetrahedral lattice where all unique molecules of 6, 9, ..., 21 atoms were enumerated. Subsets of these molecules consistent with different constraint sets were then chosen, counted, and the root-mean-square distance between them calculated. This provided the desired relations. In a discrete system the number of possible models can be severely limited with relatively few constraints. An expert system that can model a protein from data of different types was built to illustrate the principle and was tested using known proteins as examples. C-alpha resolutions of 5 Å are obtainable from 5 bits of information per amino acid and, in principle, from data that could be rapidly collected using standard biophysical techniques.

Subject(s)

Computer Simulation , Models, Chemical , Protein Folding , Protein Structure, Secondary

3.

CDR3 length in antigen-specific immune receptors.

Rock, E P; Sibbald, P R; Davis, M M; Chien, Y H.

J Exp Med ; 179(1): 323-8, 1994 Jan 01.

Article in English | MEDLINE | ID: mdl-8270877

ABSTRACT

In both immunoglobulins (Ig) and T cell receptors (TCR), the rearrangement of V, D, and J region sequence elements during lymphocyte maturation creates an enormous degree of diversity in an area referred to as the complementarity determining region 3 (CDR3) loop. Variations in the particular V, D, and J elements used, precise points of recombination, and random nucleotide addition all lead to extensive length and sequence heterogeneity. CDR3 loops are often critical for antigen binding in Igs and appear to provide the principal peptide binding residues in TCRs. To better understand the physical and selective constraints on these sequences, we have compiled information on CDR3 size variation for Ig H, L (kappa and lambda) and TCR alpha, beta, gamma, and delta. Ig H and TCR delta CDR3s are the most variable in size and are significantly longer than L and gamma chains, respectively. In contrast, TCR alpha and beta chain distributions are highly constrained, with nearly identical average CDR3 lengths, and their length distributions are not altered by thymic selection. Perhaps most significantly, these CDR3 length profiles suggest that gamma/delta TCRs are more similar to Igs than to alpha/beta TCRs in their putative ligand binding region, and thus gamma/delta and alpha/beta T cells may have fundamentally different recognition properties.

Subject(s)

Receptors, Antigen, T-Cell, alpha-beta/immunology , Receptors, Antigen, T-Cell, gamma-delta/immunology , Amino Acid Sequence , Animals , Antigens/immunology , Binding Sites , Humans , Mice , Molecular Sequence Data , Receptors, Antigen, T-Cell, alpha-beta/chemistry , Receptors, Antigen, T-Cell, gamma-delta/chemistry , T-Lymphocytes/immunology

4.

Weighting in sequence space: a comparison of methods in terms of generalized sequences.

Vingron, M; Sibbald, P R.

Proc Natl Acad Sci U S A ; 90(19): 8777-81, 1993 Oct 01.

Article in English | MEDLINE | ID: mdl-8415606

ABSTRACT

Four methods for weighting aligned biological sequences have recently appeared that differ mathematically, philosophically, and in their results. Thus, while there is consensus about the need to weight sequences, the method to use is contentious. A geometric analysis based on a continuous sequence space is presented that provides a common framework in which to compare the methods. It is concluded that there are two "best" methods. When the sequences are known to be phylogenetically related and a tree can be generated without introducing excessive stress into the data, the method of Altschul et al. [Altschul, S. F., Carroll, R. J. & Lipman, D. J. (1989) J. Mol. Biol. 207, 647-653] is appropriate. When the sequences are not known to be phylogenetically related or a tree cannot be produced without unduly distorting the distances between the sequences, a modification of the method of Sibbald and Argos [Sibbald, P. R. & Argos, P. (1990) J. Mol. Biol. 216, 813-818] is preferable.

Subject(s)

Amino Acid Sequence , Base Sequence , Information Systems , Phylogeny , DNA/chemistry , Decision Trees , Models, Biological , Sequence Homology

5.

PALM--a pattern language for molecular biology.

Helgesen, C; Sibbald, P R.

Proc Int Conf Intell Syst Mol Biol ; 1: 172-80, 1993.

Article in English | MEDLINE | ID: mdl-7584333

ABSTRACT

This paper presents a new pattern language, PALM, for describing patterns in molecular biology sequences. The language is intended for representing knowledge about such patterns in a declarative, clear and concise way. It is also shown that its expressive power enables the definition of any regular or context free language, and also higher languages in the Chomsky hierarchy by parameter attachment, variables and procedural attachment. It is also possible to define approximate patterns. The language is rigorously defined, and several examples of its use and expressive power are given.

Subject(s)

Molecular Biology/methods , Pattern Recognition, Automated , Programming Languages , Sequence Analysis/methods , DNA-Directed DNA Polymerase , DNA-Directed RNA Polymerases , Databases, Factual , Glycosaminoglycans , Lipoproteins , Natural Language Processing , Nuclear Proteins , Sulfuric Acid Esters

6.

Overseer: a nucleotide sequence searching tool.

Sibbald, P R; Sommerfeldt, H; Argos, P.

Comput Appl Biosci ; 8(1): 45-8, 1992 Feb.

Article in English | MEDLINE | ID: mdl-1568124

ABSTRACT

Overseer is a computer program that searches databases of nucleic acid sequences for objects of interest to the user. Such objects may consist of any number of simpler building blocks such as repeats, palindromes or stem-loops, strings of particular bases with or without mismatches, etc. Written in standard Pascal, this program runs under Unix and VMS and should also run under other operating systems. A simple interface allows the user to generate interactively a file containing a description of the target to be found. The searching program runs non-interactively, processing the information from the file and searching the sequences. The results are output to a file. Search capabilities are quite flexible and the code is designed to be modified. Since the framework of the program is simple, adding new modules to search for new target types as the need arises is possible.

Subject(s)

Base Sequence , Nucleic Acids/genetics , Software , Algorithms , Databases, Factual , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data

7.

Consistent policy?

Sibbald, P R; Philipson, L; Cameron, G.

Nature ; 355(6356): 103, 1992 Jan 09.

Article in English | MEDLINE | ID: mdl-1729641

Subject(s)

DNA/chemistry , Databases, Factual , Base Sequence

8.

Identification of proteins in sequence databases from amino acid composition data.

Sibbald, P R; Sommerfeldt, H; Argos, P.

Anal Biochem ; 198(2): 330-3, 1991 Nov 01.

Article in English | MEDLINE | ID: mdl-1799218

ABSTRACT

Having obtained the amino acid composition of a protein, chemists and molecular biologists may wish to identify the protein from this data alone. In general such data will have errors associated with them and the length of the protein may be known only approximately or not at all. In this paper a method is described which enables searching of protein sequence databases for sequences or fragments of sequences which have a composition similar to the one being sought. Such searches are generally quite discriminating as shown by the examples provided. This method has been implemented as part of the computer program Scrutineer and is being freely distributed. It is simple to use.

Subject(s)

Amino Acids/chemistry , Databases, Factual , Proteins/chemistry , Algorithms , Amino Acid Sequence , Humans , Molecular Sequence Data , Software
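
The composition search described in the abstract above can be illustrated with a minimal sketch. This is not the Scrutineer implementation: the Euclidean distance measure, the fixed window length, and all function names (composition, distance, best_window, rank_database) are assumptions chosen for illustration only.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fractional amino acid composition of seq as a dict."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return {a: 0.0 for a in AMINO_ACIDS}
    return {a: counts[a] / total for a in AMINO_ACIDS}


def distance(comp_a, comp_b):
    """Euclidean distance between two compositions (an assumed measure)."""
    return sqrt(sum((comp_a[a] - comp_b[a]) ** 2 for a in AMINO_ACIDS))


def best_window(seq, target, window):
    """Slide a fixed-length window along seq and return (distance, start)
    for the fragment whose composition is closest to target."""
    window = min(window, len(seq))
    best = None
    for start in range(len(seq) - window + 1):
        d = distance(composition(seq[start:start + window]), target)
        if best is None or d < best[0]:
            best = (d, start)
    return best


def rank_database(database, target, window):
    """Rank every sequence in a {name: sequence} dict by its best fragment."""
    hits = []
    for name, seq in database.items():
        d, start = best_window(seq, target, window)
        hits.append((d, name, start))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy database, not real protein sequences.
    toy_db = {"a": "GGGGAAAA", "b": "KKKKKKKK"}
    for d, name, start in rank_database(toy_db, composition("AAGG"), 4):
        print(f"{name}\tstart={start}\tdistance={d:.3f}")
```

Sliding a window over each sequence is what lets a fragment match even when the full-length composition differs. A measured composition with experimental error would call for a tolerance threshold on the distance rather than taking only the top hit.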

9.

Automated protein sequence pattern handling and PROSITE searching.

Sibbald, P R; Sommerfeldt, H; Argos, P.

Comput Appl Biosci ; 7(4): 535-6, 1991 Oct.

Article in English | MEDLINE | ID: mdl-1747788

ABSTRACT

The protein sequence searching program Scrutineer has been modified to search for targets from a file. We are distributing a reformatted file of PROSITES which can be read by Scrutineer. In addition, Scrutineer still accepts targets typed in interactively but can now write them out in the format required as input. Since the input format is the same as the output format, target management and re-use is simple.

Subject(s)

Amino Acid Sequence , Proteins/genetics , Software

10.

Rop/helix-loop-helix similarity.

Gibson, T J; Sibbald, P R; Rice, P.

DNA Seq ; 1(3): 213-5, 1991.

Article in English | MEDLINE | ID: mdl-1773060

Subject(s)

Bacterial Proteins/chemistry , DNA-Binding Proteins/chemistry , RNA-Binding Proteins , Transcription Factors , Amino Acid Sequence , Animals , Humans , Molecular Sequence Data , Sequence Alignment , TCF Transcription Factors , Transcription Factor 7-Like 1 Protein

11.

Weighting aligned protein or nucleic acid sequences to correct for unequal representation.

Sibbald, P R; Argos, P.

J Mol Biol ; 216(4): 813-8, 1990 Dec 20.

Article in English | MEDLINE | ID: mdl-2176240

ABSTRACT

Aligned sequences from the same family (e.g. the haemoglobins) are seldom representative of the entire family. This is because (1) the sequence databases are heavily skewed toward a small number of organisms and (2) only a minute fraction of all the different family members have been sequenced. For many applications, such as using alignments or profiles to perform database searches for distantly related family members, such unequal representation requires correction. An algorithm to perform appropriate weighting of individual sequences is presented along with examples illustrating its efficacy.

Subject(s)

Amino Acid Sequence , Base Sequence , Algorithms , Animals , Globins , Molecular Sequence Data , Nucleoside-Phosphate Kinase/genetics , Thymidine Kinase/genetics , Viruses/genetics

12.

The P-loop--a common motif in ATP- and GTP-binding proteins.

Saraste, M; Sibbald, P R; Wittinghofer, A.

Trends Biochem Sci ; 15(11): 430-4, 1990 Nov.

Article in English | MEDLINE | ID: mdl-2126155

ABSTRACT

Many ATP- and GTP-binding proteins have a phosphate-binding loop (P-loop), the primary structure of which typically consists of a glycine-rich sequence followed by a conserved lysine and a serine or threonine. The three-dimensional structures of several ATP- and GTP-binding proteins containing P-loops have now been solved. In this review current knowledge of P-loops is discussed with the additional aim of illustrating the fascinating relationship between protein sequence, structure and function.

Subject(s)

Adenosine Triphosphate/metabolism , GTP-Binding Proteins/chemistry , Proteins/chemistry , Amino Acid Sequence , Animals , GTP-Binding Proteins/genetics , Humans , Molecular Sequence Data , Mutation , Protein Conformation , Proteins/genetics , Proteins/metabolism , Sequence Homology, Nucleic Acid
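
The P-loop description in the abstract above (a glycine-rich stretch followed by a conserved lysine and a serine or threonine) can be sketched as a sequence pattern. The pattern below uses the commonly cited consensus G-x(4)-G-K-[ST]; that exact form is not stated in the abstract and is an assumption here, as are the function name find_p_loops and the test string.

```python
import re

# Assumed consensus G-x(4)-G-K-[ST]; the lookahead lets overlapping
# candidates be reported instead of only the first non-overlapping one.
P_LOOP = re.compile(r"(?=(G.{4}GK[ST]))")


def find_p_loops(seq):
    """Return (start, matched_text) for every P-loop-like site in seq.

    Positions are 0-based offsets into the one-letter protein sequence.
    """
    return [(m.start(), m.group(1)) for m in P_LOOP.finditer(seq.upper())]


if __name__ == "__main__":
    # Illustrative test string only.
    print(find_p_loops("MTEYKLVVVGAGGVGKSALTIQ"))
```

A pattern this short will also match unrelated proteins by chance, so a hit is a candidate site to examine rather than evidence of nucleotide binding.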

13.

Trans-splicing of pre-mRNA is predicted to occur in a wide range of organisms including vertebrates.

Dandekar, T; Sibbald, P R.

Nucleic Acids Res ; 18(16): 4719-25, 1990 Aug 25.

Article in English | MEDLINE | ID: mdl-2395638

ABSTRACT

Several known trans-splicing RNA structures were used to define a canonical trans-splicing structure which was then used to perform a computer search of the EMBL nucleotide database. In addition to most known trans-splicing structures, many putative new trans-splicing sites were detected. These were found in a broad range of organisms including the vertebrates. Control experiments indicate that the search predicts known false positives at a rate of only 20%. Trans-splicing may therefore be a very wide-spread phenomenon.

Subject(s)

RNA Precursors/genetics , RNA Splicing , Animals , Base Sequence , Humans , Information Systems , Molecular Sequence Data , Nucleic Acid Conformation , Phylogeny , RNA Precursors/metabolism

14.

Scrutineer: a computer program that flexibly seeks and describes motifs and profiles in protein sequence databases.

Sibbald, P R; Argos, P.

Comput Appl Biosci ; 6(3): 279-88, 1990 Jul.

Article in English | MEDLINE | ID: mdl-2207752

ABSTRACT

Scrutineer is an interactive, user-friendly program designed to search for motifs, patterns and profiles in the Swissprot, Protein Identification Resource (PIR) or SeqDb protein sequence databases. Basic capabilities include (i) searches for strings of amino acids with multiple choices at a given position; (ii) searches for strings including variable-length segments and delocalized constraints; (iii) searches over subsets of a database or particular regions within each sequence (e.g. N-terminal one-third); (iv) searches involving secondary structure predictions, physicochemical characteristics, and the like; and (v) searches using aligned sequences as targets with various optional weighting schemes. The various search criteria and hits can be combined and complex targets located. Once the data are loaded into virtual memory, all occurrences in PIR release 22.0 (3.7 x 10^6 amino acids) of a given short string of amino acids (e.g. a hexamer) are found in approximately 36 s. Scrutineer can also describe the entire database, user-specified hits, user-defined regions of sequence and all hits. The source code and accompanying manual are being freely distributed.

Subject(s)

Amino Acid Sequence , Databases, Factual , Proteins/chemistry , Software , Algorithms , Molecular Sequence Data , Software Design , User-Computer Interface

15.

A completely conserved rat U6 snRNA pseudogene coding sequence is sandwiched between a cytochrome c retropseudogene and a LINE-like sequence.

Sibbald, P R; Blencowe, B J.

Nucleic Acids Res ; 18(4): 1063, 1990 Feb 25.

Article in English | MEDLINE | ID: mdl-2156224

Subject(s)

Cytochrome c Group/genetics , Pseudogenes , RNA, Small Nuclear/genetics , Animals , Genomic Library , Information Systems , Molecular Sequence Data , Promoter Regions, Genetic , Rats , Rats, Inbred Strains , Repetitive Sequences, Nucleic Acid , Sequence Homology, Nucleic Acid

16.

Calculating higher order DNA sequence information measures.

Sibbald, P R; Banerjee, S; Maze, J.

J Theor Biol ; 136(4): 475-83, 1989 Feb 22.

Article in English | MEDLINE | ID: mdl-2682009

Subject(s)

DNA/analysis , Base Sequence , Information Theory

17.

Photosystem I-Mediated Regulation of Water Splitting in the Red Alga, Porphyra sanjuanensis.

Sibbald, P R; Vidaver, W.

Plant Physiol ; 84(4): 1373-7, 1987 Aug.

Article in English | MEDLINE | ID: mdl-16665613

ABSTRACT

The marine red alga, Porphyra sanjuanensis, is found mainly in the high intertidal zone and at low tide is subject to frequent and extreme water stress, often accompanied by high temperatures and light intensities. Such exposures can lead to severe desiccation which is accompanied by the progressive loss of photosynthetic activity. Even following the loss of more than 90% of the thallus water content the alga recovers rapidly when returned to seawater. This stress-induced, reversible inactivation of photosynthesis is believed to be a protective adaptation which prevents photodamage to the exposed alga. Effects of light, inhibitors of water splitting, and electron donors to PSI on variable fluorescence and water splitting suggest that activity of the oxygen evolving complex is regulated by the PSI-driven reduction of a component of intersystem electron transport.

18.

How probable are antibody cross-reactions?

Sibbald, P R; White, M J.

J Theor Biol ; 127(2): 163-9, 1987 Jul 21.

Article in English | MEDLINE | ID: mdl-3695546

ABSTRACT

Antibodies to polypeptides are increasingly being used in research. Their specificity and tight but reversible binding make them ideal for applications such as identification of proteins, immunological quantification or purification, and peptide mapping. Antibodies are also used in medicine to deliver loads to specific sites in tissues, and in electron microscopy as heavy metal conjugates to locate antigens in thin sections. While these techniques depend on specificity of antibody binding, it is occasionally observed that cross-reactions occur. These cross-reactions are attributed to the existence of one or more antibody binding sites common to both polypeptides. It is important to know whether these cross-reactions are expected due to chance alone, or if they are improbable and likely due to some causative agent. Examples of such causative agents might include gene duplication events or convergence due to functional constraints. At the present time, good methods for predicting the probability and therefore the frequency of cross-reactions are unavailable. In this paper we apply some recently reported mathematical results to address the following questions: (1) What is the probability that polyclonal or monoclonal antibodies raised against a given polypeptide will cross-react with another polypeptide due to chance alone? (2) What is the probability that polyclonal or monoclonal antibodies raised against a given polypeptide will cross-react with one or more polypeptides in a pool of polypeptides? Approximate answers to these questions are presented for cases where amino acid compositions of linear polypeptides are known or unknown, but the amino acid sequence of one or more of the polypeptides is not known. Implications of the results for antibody use in protein research are discussed.

Subject(s)

Antibodies/immunology , Amino Acid Sequence , Antibodies, Monoclonal , Binding Sites, Antibody , Cross Reactions , Peptides/immunology , Probability

19.

Copper in photosystem II: association with LHC II.

Sibbald, P R; Green, B R.

Photosynth Res ; 14(3): 201-9, 1987 Jan.

Article in English | MEDLINE | ID: mdl-24430735

ABSTRACT

Photosystem II particles from spinach and barley contained 2.5 and 4.2 Cu per 300 chlorophylls respectively. This Cu was resistant to removal by EDTA. A large percentage of the PSII Cu in both plants is associated with the light-harvesting chlorophyll a/b protein, LHCII; 46% in barley and 76% in spinach. Several experiments have been performed to rule out the possibility that the Cu was introduced during the isolation procedures and to ensure that the Cu is associated with PSII. Since the PSII Cu is mainly associated with LHCII, it is unlikely that it is involved in O2 evolution.
