GelStatus reads a GCG Fragment Assembly database, and produces a summary report of the quality of each contig.
GelStatus produces an output file which lists each contig and its member fragments for a project.
For each contig, GelStatus reports: the length of the contig; the number of bases sequenced in both directions, once only, or many times (in the same direction); and the number and percentage of bases that differ from one or more of the fragments.
For each fragment, GelStatus reports: the direction in which it was sequenced (relative to the consensus sequence), the length of the "working" version of the fragment (including any gaps), and the number and percentage of bases which differ from the consensus.
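The three coverage categories reported for each contig can be illustrated with a short Python sketch. This is not the GelStatus source: the (start, end, direction) fragment layout, the function name, and the rule that any position covered in both directions counts as "both" regardless of depth are all assumptions made for illustration.

```python
def classify_coverage(contig_length, fragments):
    """Count consensus positions covered in both directions, many times, or once.

    fragments is a list of (start, end, direction) tuples with a 0-based start,
    an exclusive end, and a direction of '+' or '-'. This layout is hypothetical.
    """
    both = many = once = 0
    for pos in range(contig_length):
        # Directions of every fragment that covers this consensus position.
        dirs = [d for (start, end, d) in fragments if start <= pos < end]
        if "+" in dirs and "-" in dirs:
            both += 1          # sequenced in both directions
        elif len(dirs) > 1:
            many += 1          # sequenced several times, same direction
        elif len(dirs) == 1:
            once += 1          # sequenced once only
    return both, many, once


# Lengths and directions are those of mu23 and mu2 in the sample report;
# the start offsets are hypothetical.
fragments = [(0, 259, "+"), (0, 330, "+")]
print(classify_coverage(330, fragments))  # (0, 259, 71)
```

With these assumed offsets the result agrees with the Both, Many, and Once figures shown for mu2.con in the example output.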
This program was written by Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
Here is an example session with GelStatus:

% gelstatus

 What should I call the output file (* geltest.dat *) ?

%
Here is some of the output file:

Gelstatus of project geltest, February 7, 1995 13:47

                                 No. Pct                             No. Pct
Contig     Length Both Many Once Dif Dif  Fragment        Dir Length Dif Dif
---------- ------ ---- ---- ---- --- ---  -----------     --- ------ --- ---
                                          mu23             +     259   6 2.3
                                          mu2              +     330   4 1.2
                                          ---------------     ------ --- ---
mu2.con       330    0  259   71   8 2.4  ( 2 fragments)         589  10 1.7

/////////////////////////////////////////////////////////

                                          mu26b            +     173   0 0.0
                                          mu6              -     301  14 4.7
                                          mu10             +     367  22 6.0
                                          mu5              -     234   5 2.1
                                          ---------------     ------ --- ---
mu5.con       418  367    0   51  29 6.9  ( 3 fragments)         902  41 4.5

           ------ ---- ---- ---- --- ---  --------------      ------ --- ---
(Total)      1215  613  259  343  48 4.0  (Total)               2328  66 2.8

geltest has 12 fragments in 4 Contigs
Longest contig is mu5.con, length 418
The GCG Fragment Assembly System programs are used to enter and manipulate raw sequence data.
GelPicture reads a contig from the Fragment Assembly database and displays a diagram of the gel alignments and a printout of the aligned gel sequences and consensus. GelPicture has been modified to include the sequence direction in both sections of the output, and to mark with '=======' any consensus sequence that is correct (agrees with every fragment) and has been sequenced in both directions.
GelAnalyze reads a GelStatus report from a shotgun project, and produces project statistics by the method of Lander and Waterman.
The GCG Fragment Assembly System must already be started (by running GelStart) before you run GelStatus.
Errors are considered to be any base in a fragment which differs from the consensus, or which is ambiguous (for example X in any fragment). A gap introduced into a fragment during a GelAssemble session is also counted as an error.
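The error rule above can be sketched in Python as follows. This is a minimal illustration, not the GelStatus implementation: it assumes the fragment has already been aligned to the consensus at equal length, that the unambiguous bases are upper case A, C, G, and T, and that a gap is written as "." (all three are assumptions).

```python
UNAMBIGUOUS = set("ACGT")  # assumption: every other symbol is ambiguous
GAP = "."                  # assumption: character used for a gap

def count_errors(fragment, consensus):
    """Count positions where an aligned fragment is in error."""
    if len(fragment) != len(consensus):
        raise ValueError("fragment and consensus must be aligned to equal length")
    errors = 0
    for base, cons in zip(fragment, consensus):
        is_gap = base == GAP                                  # gap in the fragment
        is_ambiguous = not is_gap and base not in UNAMBIGUOUS  # for example X
        is_mismatch = base != cons                             # differs from consensus
        if is_gap or is_ambiguous or is_mismatch:
            errors += 1
    return errors

# One ambiguous base (X) and one gap against an eight-base consensus.
print(count_errors("ACGTXC.T", "ACGTACGT"))  # 2
```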
The contig percentage figures are calculated relative to the length of the consensus sequence. The fragment percentage figures are calculated relative to the working length of the fragment.
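As a worked check of the two percentage rules, the following Python sketch recomputes figures from the example output. The helper name is hypothetical, and rounding to one decimal place is inferred from the sample report rather than stated in this document.

```python
def pct(differences, length):
    """Percentage of differing bases, rounded to one decimal place."""
    return round(100.0 * differences / length, 1)

# Contig mu2.con: 8 differences relative to a consensus of length 330.
print(pct(8, 330))   # 2.4

# Fragment mu23: 6 differences relative to a working length of 259.
print(pct(6, 259))   # 2.3
```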
GelStatus has a limited space for writing the contig and fragment names. If a contig or fragment name is longer than 10 characters, only the last 10 characters will be shown.
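The effect of the 10-character limit can be shown with a one-line Python sketch. The function name and the long example name are hypothetical; only the "last 10 characters" rule comes from the text above.

```python
def display_name(name, width=10):
    """Return the last `width` characters of a contig or fragment name."""
    return name[-width:]

print(display_name("mu2.con"))               # mu2.con
print(display_name("project42_clone7.con"))  # clone7.con
```

Names that differ only in their leading characters will therefore look identical in the report.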
GelStatus should be run frequently to show the progress in assembling contigs. The output report indicates errors in assembly as very high error percentages for misaligned fragments. The report also indicates how much of each contig has been sequenced in both directions. GelPicture can be run to examine individual contigs in detail.
The output report from GelStatus is 80 characters wide, and can be printed on any printer.
The input to GelStatus is a project from the GCG Fragment Assembly System.
All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
Minimum syntax: % gelstatus -Default

Optional parameters:

[-OUTfile=]myproj.dat  Output filename
-UPper                 Make all sequences upper case
                       (do not treat lower case as ambiguous)
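If GelStatus is driven from a script, the command line can be assembled from the qualifiers in the summary above. The Python helper below is a hypothetical sketch: it only builds the argument list, and it assumes the qualifiers are accepted exactly as spelled in the summary.

```python
def gelstatus_command(outfile=None, upper=False):
    """Assemble a gelstatus argument list from the documented qualifiers."""
    args = ["gelstatus"]
    if outfile is not None:
        args.append(f"-OUTfile={outfile}")   # output filename
    if upper:
        args.append("-UPper")                # convert sequences to upper case
    args.append("-Default")                  # as in the minimum syntax
    return args

print(" ".join(gelstatus_command(outfile="geltest.dat", upper=True)))
# gelstatus -OUTfile=geltest.dat -UPper -Default
```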
The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
GelStatus normally treats lower case sequence as ambiguous. If you use a convention of lower case sequence characters to mean a base is "probably correct", GelStatus will report these bases as ambiguities to be corrected. If you have simply entered your sequences in lower case, you can use -UPper to tell GelStatus to convert them all to upper case before checking for ambiguities.
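The difference this makes can be sketched in Python. The function and the set of unambiguous bases are assumptions for illustration; the `upper` argument stands in for the -UPper qualifier.

```python
UNAMBIGUOUS = set("ACGT")  # assumption: upper case A, C, G, T only

def count_ambiguous(sequence, upper=False):
    """Count bases that would be reported as ambiguous."""
    if upper:
        sequence = sequence.upper()  # mimic the effect of -UPper
    return sum(1 for base in sequence if base not in UNAMBIGUOUS)

print(count_ambiguous("ACgtAN"))              # 3: g, t and N
print(count_ambiguous("ACgtAN", upper=True))  # 1: only N remains
```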
Printed: April 22, 1996 15:53 (1162)