Search | VHL Regional Portal

Sequence assembly and finishing methods.

Staden, R; Judge, D P; Bonfield, J K.

Methods Biochem Anal ; 43: 303-22, 2001.

Article in English | MEDLINE | ID: mdl-11449730

Subject(s)

Sequence Analysis, DNA/methods , Sequence Analysis, DNA/statistics & numerical data , Software , Base Sequence , Computational Biology , Computer Graphics , DNA/genetics , Databases, Factual , Molecular Sequence Data

The Staden package, 1998.

Staden, R; Beal, K F; Bonfield, J K.

Methods Mol Biol ; 132: 115-30, 2000.

Article in English | MEDLINE | ID: mdl-10547834

Subject(s)

Database Management Systems , Sequence Analysis/methods , Amino Acid Sequence , Base Sequence , Computer Graphics , Contig Mapping , DNA , Molecular Sequence Data , Sequence Homology, Nucleic Acid

Automated detection of point mutations using fluorescent sequence trace subtraction.

Bonfield, J K; Rada, C; Staden, R.

Nucleic Acids Res ; 26(14): 3404-9, 1998 Jul 15.

Article in English | MEDLINE | ID: mdl-9649626

ABSTRACT

The final step in the detection of mutations is to determine the sequence of the suspected mutant and to compare it with that of the wild-type, and for this fluorescence-based sequencing instruments are widely used. We describe some simple algorithms for comparing sequence traces which, as part of our sequence assembly and analysis package, are proving useful for the discovery of mutations and which may also help to identify misplaced readings in sequence assembly projects. The mutations can be detected automatically by a new program called TRACE_DIFF and new types of trace display in our program GAP4 greatly simplify visual checking of the assigned changes. To assess the accuracy of the automatic mutation detection algorithm we analysed 214 sequence readings from hypermutating DNA comprising a total of 108 497 bases. After the readings were assembled there were 1232 base differences, including 392 Ns and 166 alignment characters. Visual inspection of the traces established that of the 1232 differences, 353 were real mutations while the rest were due to base calling errors. The TRACE_DIFF algorithm automatically identified all but 36, with 28 false positives. Further information about the software can be obtained from http://www.mrc-lmb.cam.ac.uk/pubseq/

Subject(s)

Point Mutation , Subtraction Technique , Algorithms , Automation , Base Sequence , DNA/genetics , Fluorescence , Molecular Sequence Data
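
The accuracy figures quoted in the TRACE_DIFF abstract above can be restated as detection rates. The short Python sketch below does only that arithmetic. It assumes that "all but 36" refers to the 353 real mutations and that the 28 false positives are additional calls; the variable names are illustrative and do not come from the paper.

```python
# Counts taken from the TRACE_DIFF abstract; the interpretation is an assumption:
# "all but 36" is read as 36 of the 353 real mutations being missed.
total_differences = 1232
real_mutations = 353
missed = 36
false_positives = 28

true_positives = real_mutations - missed                    # mutations found
base_calling_errors = total_differences - real_mutations    # non-mutation differences

# Fraction of real mutations that were detected.
sensitivity = true_positives / real_mutations
# Fraction of reported mutations that were real.
precision = true_positives / (true_positives + false_positives)

print(f"detected {true_positives} of {real_mutations} real mutations")
print(f"differences due to base calling errors: {base_calling_errors}")
print(f"sensitivity = {sensitivity:.3f}")
print(f"precision   = {precision:.3f}")
```

Under that reading, roughly nine in ten real mutations were found automatically, and roughly nine in ten automatic calls were real.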

Experiment files and their application during large-scale sequencing projects.

Bonfield, J K; Staden, R.

DNA Seq ; 6(2): 109-17, 1996.

Article in English | MEDLINE | ID: mdl-8907307

ABSTRACT

The data for large scale sequencing projects are passed through several processing steps prior to assembly, and post-assembly processing generally requires knowledge of more than just the sequence of each reading. We address here the problem of providing data to individual programs and of combining all the tasks into a single process. The solution comprises two components: a file format (experiment file format) that stores information about readings, and a script (PREGAP) that controls the creation and use of experiment files by the processing programs. PREGAP can take a batch of data from a variety of sequencing instruments, gather information about each reading, and then scan the reading to select the 3' end of the good quality data, mark sequencing vector, other cloning vector sequences, and Alu segments. The results of all these operations are added to the experiment file for each reading, ready for processing by the assembly program. Experiment files also provide a mechanism for using alternative assembly engines with our package.

Subject(s)

Sequence Analysis , Software
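
The idea in the abstract above, one record per reading that accumulates the results of each processing step, can be illustrated with a small Python sketch. This is not the experiment file format or the PREGAP script: the field names, the N-count clipping rule and the exact-match vector search are all assumptions chosen to keep the example short.

```python
def clip_right(seq, window=10, max_n=2):
    """Hypothetical quality clip: return the start of the first window
    that contains more than max_n 'N' calls, or len(seq) if none does."""
    for start in range(max(len(seq) - window + 1, 0)):
        if seq[start:start + window].count("N") > max_n:
            return start
    return len(seq)


def mark_vector(seq, vector):
    """Hypothetical vector marking: (start, end) of the first exact match
    of the vector sequence, or None if it does not occur."""
    pos = seq.find(vector)
    return None if pos < 0 else (pos, pos + len(vector))


def process_reading(name, seq, vector):
    """Build one record per reading and add each step's result to it,
    so that later programs can read everything from a single place."""
    record = {"name": name, "sequence": seq}
    record["right_clip"] = clip_right(seq)
    record["vector"] = mark_vector(seq, vector)
    return record


# Made-up reading: 21 called bases followed by 10 uncalled ones.
reading = "AAGCTTGGCACGTACGTTAGC" + "N" * 10
record = process_reading("example_reading", reading, vector="AAGCTT")
print(record["right_clip"], record["vector"])
```

Keeping every annotation in one record per reading is what lets each downstream step, including an alternative assembly engine, work from the same data without rerunning the earlier steps.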

A new DNA sequence assembly program.

Bonfield, J K; Smith, K F; Staden, R.

Nucleic Acids Res ; 23(24): 4992-9, 1995 Dec 25.

Article in English | MEDLINE | ID: mdl-8559656

ABSTRACT

We describe the Genome Assembly Program (GAP), a new program for DNA sequence assembly. The program is suitable for large and small projects, a variety of strategies and can handle data from a range of sequencing instruments. It retains the useful components of our previous work, but includes many novel ideas and methods. Many of these methods have been made possible by the program's completely new, and highly interactive, graphical user interface. The program provides many visual clues to the current state of a sequencing project and allows users to interact in intuitive and graphical ways with their data. The program has tools to display and manipulate the various types of data that help to solve and check difficult assemblies, particularly those in repetitive genomes. We have introduced the following new displays: the Contig Selector, the Contig Comparator, the Template Display, the Restriction Enzyme Map and the Stop Codon Map. We have also made it possible to have any number of Contig Editors and Contig Joining Editors running simultaneously even on the same contig. The program also includes a new 'Directed Assembly' algorithm and routines for automatically detecting unfinished segments of sequence, to which it suggests experimental solutions.

Subject(s)

Base Sequence , Software , Animals , Humans

The application of numerical estimates of base calling accuracy to DNA sequencing projects.

Bonfield, J K; Staden, R.

Nucleic Acids Res ; 23(8): 1406-10, 1995 Apr 25.

Article in English | MEDLINE | ID: mdl-7753633

ABSTRACT

During DNA sequencing projects one of the most labour intensive and highly skilled tasks is to view the original trace descriptions of gels and to adjudicate between conflicting readings. Given the current methods of calculating a consensus, the majority of the time employed in viewing traces and editing readings is actually devoted to making the poorer data fit the good data. We propose new consensus calculation algorithms that employ numerical estimates of base calling accuracy and which when used in conjunction with an automatic detector of contradictory data should greatly reduce the time spent checking and editing readings and hence improve DNA sequencing productivity.

Subject(s)

Algorithms , Consensus Sequence , Sequence Analysis, DNA/methods , Base Sequence , DNA , Decision Support Techniques , Molecular Sequence Data
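
To make the idea of a consensus that uses numerical accuracy estimates concrete, here is a minimal Python sketch of an accuracy-weighted vote for one alignment column. It is not the algorithm from the paper above: the function name, the use of a per-base accuracy between 0 and 1, the "-" placeholder and the 0.75 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_consensus(column, min_fraction=0.75):
    """Pick a consensus base for one column of aligned readings.

    column is a list of (base, accuracy) pairs, with accuracy in [0, 1].
    Returns (base, fraction, needs_checking), where fraction is the share
    of the total accuracy weight held by the winning base.
    """
    totals = defaultdict(float)
    for base, accuracy in column:
        totals[base] += accuracy
    total = sum(totals.values())
    if total == 0:
        # No usable evidence: "-" is a placeholder for "undetermined".
        return "-", 0.0, True
    best = max(totals, key=totals.get)
    fraction = totals[best] / total
    # Flag the column for a human only when the evidence is contradictory.
    return best, fraction, fraction < min_fraction


# Two confident calls outweigh one poor call, so no checking is requested.
clear = weighted_consensus([("A", 0.99), ("A", 0.95), ("G", 0.30)])
# Two mediocre, conflicting calls are flagged for inspection.
conflict = weighted_consensus([("A", 0.60), ("G", 0.55)])
print(clear, conflict)
```

Weighting by accuracy means a poor reading no longer has to be edited to agree with good ones; only columns where good data disagree are passed to a person.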
