Gelstatus

GELSTATUS


FUNCTION

GelStatus reads a GCG Fragment Assembly database, and produces a summary report of the quality of each contig.


DESCRIPTION

GelStatus produces an output file which lists each contig and its member fragments for a project.

For each contig, GelStatus reports the length of the contig; the number of bases sequenced in both directions, once only, or many times (in the same direction); and the number and percentage of consensus bases that differ from one or more of the fragments.

For each fragment, GelStatus reports the direction in which it was sequenced (relative to the consensus sequence), the length of the "working" version of the fragment (including any gaps), and the number and percentage of bases that differ from the consensus.
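
The Both, Many and Once columns of the report can be read as a classification of each consensus position by the fragments covering it. The following is a minimal Python sketch of that reading, not GelStatus's own code. The `Fragment` tuple, the `coverage_summary` function and the zero-based `start` offsets are hypothetical; the report itself does not show where each fragment starts.

```python
from collections import namedtuple

# Hypothetical record: 'start' is an assumed zero-based offset of the
# fragment within the contig; the GelStatus report does not list it.
Fragment = namedtuple("Fragment", "name start length direction")


def coverage_summary(contig_length, fragments):
    """Classify each consensus position as Both, Many or Once."""
    plus = [0] * contig_length   # fragments covering each position, '+' strand
    minus = [0] * contig_length  # fragments covering each position, '-' strand
    for frag in fragments:
        counts = plus if frag.direction == "+" else minus
        for pos in range(frag.start, frag.start + frag.length):
            counts[pos] += 1

    both = many = once = 0
    for p, m in zip(plus, minus):
        if p and m:
            both += 1   # sequenced in both directions
        elif p + m > 1:
            many += 1   # sequenced several times, same direction
        elif p + m == 1:
            once += 1   # sequenced once only
    return {"both": both, "many": many, "once": once}


# Lengths and directions are taken from the mu2.con example; the start
# offset of mu23 is an assumption chosen for illustration.
mu2_con = [
    Fragment("mu2", 0, 330, "+"),
    Fragment("mu23", 71, 259, "+"),
]
summary = coverage_summary(330, mu2_con)
```

With these assumed offsets the sketch gives Both 0, Many 259 and Once 71, which matches the mu2.con line in the example output; the three counts add up to the contig length.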


AUTHOR

This program was written by Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).


EXAMPLE

Here is an example session with GelStatus:

  
  
  % gelstatus
  
    What should I call the output file (* geltest.dat *) ?
  
  %
  


OUTPUT

Here is some of the output file.

  
  
  
  Gelstatus of project geltest,    February  7, 1995 13:47
  
                                  No.  Pct                          No.  Pct
  Contig     Length  Both  Many  Once  Dif  Dif  Fragment    Dir Length  Dif  Dif
  ---------- ------  ----  ----  ----  ---  ---  ----------- --- ------  ---  ---
  
                                            mu23         +     259    6  2.3
                                            mu2          +     330    4  1.2
                                            --------------- ------  ---  ---
  mu2.con       330     0   259    71    8  2.4  (  2 fragments)    589   10  1.7
  
   /////////////////////////////////////////////////////////
                                            mu26b        +     173    0  0.0
                                            mu6          -     301   14  4.7
                                            mu10         +     367   22  6.0
                                            mu5          -     234    5  2.1
                                            --------------- ------  ---  ---
  mu5.con       418   367     0    51   29  6.9  (  3 fragments)    902   41  4.5
  
        ------  ----  ----  ----  ---  ---  --------------  ------  ---  ---
  (Total)      1215   613   259   343   48  4.0  (Total)           2328   66  2.8
  
  
   geltest has 12 fragments in 4 Contigs
  
   Longest contig is mu5.con, length 418
  


RELATED PROGRAMS

The GCG Fragment Assembly System programs are used to enter and manipulate raw sequence data. GelPicture reads a contig from the Fragment Assembly database and displays a diagram of the gel alignments and a printout of the aligned gel sequences and consensus. GelPicture has been modified to include the sequence direction in both sections of the output, and to mark with '=======' any consensus sequence that is correct (agrees with every fragment) and has been sequenced in both directions. GelAnalyze reads a GelStatus report from a shotgun project, and produces project statistics by the method of Lander and Waterman.


RESTRICTIONS

The GCG Fragment Assembly System must already be started (by running GelStart) before you run GelStatus.


ALGORITHM

An error is any base in a fragment that differs from the consensus or that is ambiguous (for example, X in any fragment). A gap introduced into a fragment during a GelAssemble session is also counted as an error.

The contig percentage figures are calculated relative to the length of the consensus sequence. The fragment percentage figures are calculated relative to the working length of the fragment.
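
The counting and percentage rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's implementation: the function names are hypothetical, the set of unambiguous bases is assumed to be A, C, G and T, and any other character (including whatever gap symbol the project uses) is treated as an error.

```python
# Assumption: only these four characters count as unambiguous bases.
UNAMBIGUOUS = set("ACGT")


def count_differences(fragment, consensus):
    """Count fragment positions that are ambiguous, gapped or mismatched.

    Both strings are assumed to be already aligned and of equal length.
    """
    if len(fragment) != len(consensus):
        raise ValueError("fragment and consensus must be aligned")
    return sum(
        1
        for f, c in zip(fragment, consensus)
        if f not in UNAMBIGUOUS or f != c
    )


def percent(count, length):
    """Percentage of 'length', rounded to one decimal as in the report."""
    return round(100.0 * count / length, 1)


# One mismatch (G vs C), one gap ('.') and one ambiguity ('X').
example_errors = count_differences("ACGT.X", "ACCTAA")

# Figures from the mu2.con example: contig percentages use the consensus
# length, fragment percentages use the working length.
contig_pct = percent(8, 330)
fragment_total_pct = percent(10, 589)
```

Here `contig_pct` is 2.4 and `fragment_total_pct` is 1.7, the values printed on the mu2.con line of the example output.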


CONSIDERATIONS

GelStatus has a limited space for writing the contig and fragment names. If a contig or fragment name is longer than 10 characters, only the last 10 characters will be shown.
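
As a small illustration of this truncation rule, the Python sketch below keeps the last 10 characters of a name. The function name `shorten_name` and the long fragment name are hypothetical.

```python
NAME_WIDTH = 10  # width stated in the text above


def shorten_name(name, width=NAME_WIDTH):
    """Return the name unchanged if it fits, else its last 'width' characters."""
    return name[-width:] if len(name) > width else name


short = shorten_name("mu5.con")            # fits, so it is unchanged
long = shorten_name("fragment_number_12")  # hypothetical long name
```

Because the end of the name is kept, names that differ only in their first characters become indistinguishable in the report.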


SUGGESTIONS

GelStatus should be run frequently to show the progress in assembling contigs. The output report indicates errors in assembly as very high error percentages for misaligned fragments. The report also indicates how much of each contig has been sequenced in both directions. GelPicture can be run to examine individual contigs in detail.


DEVICES REQUIRED

The output report from GelStatus is 80 characters wide, and can be printed on any printer.


INPUT FILE

The input to GelStatus is a project from the GCG Fragment Assembly System.


COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

  
  
  Minimum syntax: % gelstatus -Default
  
  Optional parameters:
  
  [-OUTfile=]myproj.dat    Output filename
  -UPper                   Make all sequences upper case
                      (do not treat lower case as ambiguous)
  


OPTIONAL PARAMETERS

The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

-UPper

GelStatus normally treats lower case sequence as ambiguous. If you use a convention of lower case sequence characters to mean a base is "probably correct", GelStatus will report these bases as ambiguities to be corrected. If you have simply entered your sequences in lower case, you can tell GelStatus to convert them all to upper case before checking for ambiguities.
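
The effect of -UPper can be pictured with the Python sketch below. It is a hedged illustration only: the function `count_ambiguous` and its `upper` argument are hypothetical, and the set of unambiguous characters is assumed to be upper case A, C, G and T.

```python
# Assumption: only upper case A, C, G and T are unambiguous.
UNAMBIGUOUS = set("ACGT")


def count_ambiguous(sequence, upper=False):
    """Count ambiguous characters; upper=True mimics the -UPper switch."""
    if upper:
        # Convert first, so lower case bases are no longer flagged.
        sequence = sequence.upper()
    return sum(1 for base in sequence if base not in UNAMBIGUOUS)


default_count = count_ambiguous("ACgtX")            # g, t and X are flagged
upper_count = count_ambiguous("ACgtX", upper=True)  # only X is flagged
```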

Printed: April 22, 1996 15:53 (1162)