Estatplot

ESTATPLOT(+)


FUNCTION

EStatPlot is a version of StatPlot with command line control. StatPlot plots a set of parallel curves from a table of numbers like the table written by the Window program. The statistics in each column of the table are associated with a position in the analyzed sequence.


DESCRIPTION

EStatPlot is a display program for programs like Window that make sliding window measurements on a sequence. The statistics in each column of the table are associated with some position in a sequence. EStatPlot figures out a scale for each column and then plots all of the statistics in parallel. You can choose the density in bases/cm along the horizontal axis so that different runs of EStatPlot may be compared.
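The relation between density, plot length, and page count can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not EStatPlot's own code: the function names and the page_width_cm parameter (the usable plot width of your graphics device) are assumptions introduced here.

```python
import math


def plot_length_cm(n_bases: int, density: float) -> float:
    """Horizontal length of the plot in cm at a density given in bases/cm."""
    if density <= 0:
        raise ValueError("density must be positive")
    return n_bases / density


def pages_needed(n_bases: int, density: float, page_width_cm: float) -> int:
    """Number of pages, assuming each page offers page_width_cm of plot width.

    page_width_cm is a hypothetical parameter; the real value depends on
    the graphics device and is not stated in this document.
    """
    return math.ceil(plot_length_cm(n_bases, density) / page_width_cm)


def minimum_one_page_density(n_bases: int, page_width_cm: float) -> float:
    """Smallest density (bases/cm) at which n_bases still fit on one page."""
    return n_bases / page_width_cm
```

Choosing the same density for two runs gives plots with the same horizontal scale, which is what makes them directly comparable.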


AUTHOR

This GCG program was modified by Jaakko Hattula (Tampere University of Technology, Finland) and Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).


EXAMPLE

Here is a session using EStatPlot to plot the functions from the example session with EWindow:

  
  
  % estatplot
  
    ESTATPLOT of what file ?  gamma.wdw
  
    gamma.wdw contains 6 columns of 134 statistics for:
  
          Name   Check   Begin     End       Dir
     gamma.seq    6474       1     500   forward
  
    The minimum density for a one-page plot is 23.15 bases/cm.
  
    What density would you like (* 23.1 *) ?
  
    STATPLOT will take 1 pages.
    Would you like to:
  
    P)lot the statistics
    D)ifferent density
    G)et another stat file to plot
  
    Q)uit
  
    Please select one (* P *):
  
    When your LaserWriter attached to tty07 is ready, press <Return>.
  
  %
  


OUTPUT

This is the plot from the example session


RELATED PROGRAMS

Window makes a table of the frequencies of different sequence patterns within a window as it is moved along a sequence. A pattern is any short sequence like GC or R or ATG. You can plot the output with the program StatPlot.


RESTRICTIONS

No more than six columns of measurements are allowed. No more than 300,000 measurements may appear in each column. There are a number of input file format restrictions discussed below.

On Hewlett Packard plotters, density in bases per centimeter is only defined for paper that is 11 x 17 inches.


GRAPHICS

The Wisconsin Package must be configured for graphics before you run any program with graphics output! If the % setplot command is available in your installation, this is the easiest way to establish your graphics configuration, but you can also use commands like % postscript that correspond to the graphics languages the Wisconsin Package supports. See Chapter 5, Using Graphics in the User's Guide for more information about configuring your process for graphics.


CTRL-C

If you need to stop this program, use <Ctrl>C to reset your terminal and session as gracefully as possible. Searches and comparisons write out the results from the part of the search that is complete when you use <Ctrl>C. The graphics device should stop plotting the current page and start plotting the next page. If the current page is the last page, plotters should put the pen away and graphic terminals should return to interactive mode.


INPUT FILE

Unlike most GCG programs, EStatPlot has a number of format requirements for its input file.

The first line of the file must identify the sequence, checksum, and range after the words of:, check:, from:, and to:. The word reverse identifies reversed sequence ranges. Reversed ranges are numbered backwards on GCG plots.
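As an illustration of the first-line rule, the following Python sketch pulls the four keyed values out of a header line. It is not EStatPlot's parser: the function name parse_header and the returned dictionary keys are assumptions, and the exact placement of the word reverse on the line is not specified here, so the sketch simply looks for it anywhere.

```python
import re


def parse_header(line: str) -> dict:
    """Extract name, checksum, and range from a line such as
    ' WINDOW of: gamma.seq  check: 6474  from: 1  to: 500'."""
    fields = {}
    for key in ("of", "check", "from", "to"):
        match = re.search(rf"\b{key}:\s*(\S+)", line)
        if match is None:
            raise ValueError(f"missing '{key}:' in header line")
        fields[key] = match.group(1)
    return {
        "name": fields["of"],
        "check": int(fields["check"]),
        "begin": int(fields["from"]),
        "end": int(fields["to"]),
        # Assumption: the bare word 'reverse' anywhere on the line marks a
        # reversed range (a file named 'reverse.seq' would also match).
        "reverse": re.search(r"\breverse\b", line, re.IGNORECASE) is not None,
    }
```

For the gamma.wdw header shown in this section, the sketch yields the name gamma.seq, checksum 6474, and the range 1 to 500.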

The second non-blank line is printed on the plot without interpretation.

The line containing the '..' is read and the words from the second column onwards are taken to be the column headings for labeling each part of the plot. The number of words in this dividing line between the first word (in this example 'position') and the '..' is taken to be the number of columns of statistics to be plotted. There must be a space between the last column heading and the '..'.

The data start two lines below the line with the '..'. The numbers are in the format I8, 6F12.3. This means that the position numbers are integers right justified in the first eight character columns. Each statistic has three figures to the right of the decimal and is right justified in a field 12 character-columns wide.
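The I8, 6F12.3 layout can be written and read back with fixed-width slicing, as in this Python sketch. The function names format_row and parse_row are introduced here for illustration, and the reader assumes the file keeps the exact column widths described above.

```python
def format_row(position: int, stats: list) -> str:
    """Write one data line: I8 for the position, F12.3 for each statistic."""
    return f"{position:8d}" + "".join(f"{value:12.3f}" for value in stats)


def parse_row(line: str, n_columns: int) -> tuple:
    """Read one data line by fixed-width slicing.

    Characters 1-8 hold the right-justified integer position; each
    following 12-character field holds one statistic.
    """
    position = int(line[0:8])
    stats = []
    for i in range(n_columns):
        start = 8 + 12 * i
        stats.append(float(line[start:start + 12]))
    return position, stats
```

A six-column line written this way is 8 + 6 x 12 = 80 characters long. If a file has been reformatted so that the widths are no longer exact, splitting on whitespace may be a more forgiving way to inspect it, although EStatPlot itself expects the fixed format.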

Here is some of the input file gamma.wdw, which you can Fetch from the database for further inspection:

  
  
   WINDOW of: gamma.seq  check: 6474  from: 1  to: 500
   Window: 100  Shift: 3  MatchType: Subset  MisMatch: 0
  
  Human fetal beta globins G and A gamma
  from Shen, Slightom and Smithies,  Cell 26; 191-203.
  Analyzed by Smithies et al. Cell 26; 345-353.
  
  February 14, 1989
  
  Position C(obsrv)  G(obsrv)  CG(obsrv) CG ob-ex(l)  GC(obsrv) GC ob-ex(l)  ..
  
     50      17.000    30.000      1.000      -4.100      4.000      -1.100
     53      19.000    29.000      1.000      -4.510      5.000      -0.510
     56      17.000    30.000      1.000      -4.100      5.000      -0.100
  
    ///////////////////////////////////////////////////////////////////////
  
    443      31.000    14.000      0.000      -4.340      2.000      -2.340
    446      32.000    14.000      0.000      -4.480      2.000      -2.480
    449      32.000    13.000      0.000      -4.160      2.000      -2.160
  
  


COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
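The capitalization convention can be expressed as a small Python check. This is a sketch under stated assumptions, not the package's own command-line parser: it assumes that any case-insensitive spelling from the capitalized prefix up to the full name is accepted, and the function names are introduced here.

```python
def required_prefix(qualifier: str) -> str:
    """Return the leading capitalized letters of a qualifier name,
    e.g. 'CHE' for '-CHEck'."""
    name = qualifier.lstrip("-")
    prefix = ""
    for ch in name:
        if not ch.isupper():
            break
        prefix += ch
    return prefix


def matches(qualifier: str, typed: str) -> bool:
    """True if `typed` is an acceptable spelling of `qualifier`.

    Assumption: the typed word must contain at least the capitalized
    letters and must be a case-insensitive prefix of the full name.
    Any '=value' part of the typed text is ignored.
    """
    name = qualifier.lstrip("-")
    word = typed.lstrip("-").split("=", 1)[0]
    return (len(word) >= len(required_prefix(qualifier))
            and name.upper().startswith(word.upper()))
```

Under these assumptions -che, -chec, and -check would all select -CHEck, while -ch would be too short.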

  
  
  Minimum syntax: % estatplot [-INfile=]gamma.wdw -Default
  
  Prompted Parameters:
  
  -DENsity=23.15            Density (bases/cm)
  -MENU=P                   Menu option
  
  Local Data Files: gamma.mrk   marks the plot with known features
  
  Optional Parameters:
  
  -LABel                    makes vertical axis labels on every page
  -POInt                    makes points instead of a line
  -CONsistent               scales every field the same
  -SCAling                  prompts for scale limits interactively
  -BOTtom1=15.0 -TOP1=30.0  Scaling ranges (if -SCAling is used)
  
  Most EGCG graphics programs accept these and other switches. See the Using
  Graphics chapter of the EGCG USERS GUIDE for descriptions.
  
  -DENSity=150.0        plot density in bases per 100 platen units
  -LEFTMARgin=10.0      sets the left plot margin position
  -RIGHTMARgin=140.0    sets the right plot margin position
  -BOTTOMMARgin=10.0    sets the bottom plot margin position
  -TOPMARgin=90.0       sets the top plot margin position
  -BORDer               puts a line border around the plot
  -NOBORDer             suppresses a line border
  -PAGENUMber           forces page numbering
  -NOPAGENUMber         suppresses page numbering
  -TITletext="text"     overrides the default plot title
  -NOTITletext          suppresses the plot title
  -SUBTITletext="text"  overrides the default plot subtitle
  -NOSUBTITletext       suppresses the plot subtitle
  -CHEIGHT=1.5          default plot character height
  -LINESTyle1=1         plot line style 1 (set for each line)
  -LINEPERiod1=1        plot line period 1 (set for each line)
  -LINECOLor1=0         plot line colour 1 (set for each line)
  
  All GCG graphics programs accept these and other switches. See the Using
  Graphics chapter of the USERS GUIDE for descriptions.
  
  -FIGure[=FileName]  stores plot in a file for later input to FIGURE
  -FONT=3             draws all text on the plot using font 3
  -COLor=1            draws entire plot with pen in stall 1
  -SCAle=1.2          enlarges the plot by 20 percent (zoom in)
  -XPAN=10.0          moves plot to the right 10 platen units (pan right)
  -YPAN=10.0          moves plot up 10 platen units (pan up)
  -PORtrait           rotates plot 90 degrees
  


LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.

If you are studying a sequence with known features, this program marks the plot with small boxes showing the positions of these features. The presence of a file in your directory with the same name as your sequence and the file name extension .mrk causes the program to mark each range specified in the file. You can provide a marking file on the command line with an expression like -MARk=gamma.mrk. The file gamma.mrk contains information about the format of marking files. The figure for the example session shows marked regions.


OPTIONAL PARAMETERS

The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

-LABel

labels both vertical axes on every page of a multipage plot.

-POInt

makes a point at each measurement instead of a continuous curve.

-CONsistent

Because EStatPlot scales each field to use the whole physical height of the vertical axis, it may exaggerate vertical differences when you want to compare similar measurements. You can use the -CONsistent switch to make EStatPlot plot all of the measures with the same scaling. This scaling may give odd-looking results if the measures are of different kinds, as in the plot in the example. The -SCAling switch allows you to choose the absolute scaling for each field.
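The difference between per-field and consistent scaling can be illustrated with a short Python sketch. It uses simple minimum and maximum limits as a stand-in; how EStatPlot actually chooses its scale for each column is not described here, and the function names are introduced for illustration only.

```python
def per_column_limits(columns: list) -> list:
    """One (bottom, top) pair per column, each spanning that column's range."""
    return [(min(col), max(col)) for col in columns]


def consistent_limits(columns: list) -> list:
    """The same (bottom, top) pair for every column, spanning all the data."""
    bottom = min(min(col) for col in columns)
    top = max(max(col) for col in columns)
    return [(bottom, top)] * len(columns)


def to_fraction(value: float, limits: tuple) -> float:
    """Map a value to a 0..1 position on the vertical axis of its panel."""
    bottom, top = limits
    if top == bottom:
        return 0.5  # flat column: draw it mid-panel
    return (value - bottom) / (top - bottom)
```

With the first two columns of the sample data (17, 19, 17 and 30, 29, 30), per-field limits stretch each curve over the full panel height, while consistent limits of 17 to 30 keep the small differences within each column small on the page.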

-SCAling

allows you to set the scaling on the vertical axis. If you use this switch you are asked for the bottom and top of each panel in the plot. The query shows the defaults calculated for each panel.

-BOTtom1

allows you to specify the bottom of the scale for panel 1 on the command line. This switch is only used if -SCAling is also on the command line.

-TOP1

allows you to specify the top of the scale for panel 1 on the command line. This switch is only used if -SCAling is also on the command line.

-MARk=gamma.mrk

If you are studying a sequence with known features, this program marks the plot with small boxes showing the positions of these features. The presence of a file in your directory with the same name as your sequence and the file name extension .mrk causes the program to mark each range specified in the file. The file gamma.mrk contains information about the format of marking files.

These options apply to all GCG graphics programs. These and many others are described in detail in Chapter 5, Using Graphics of the User's Guide.

-FIGure=programname.figure

writes the plot as a text file of plotting instructions suitable for input to the Figure program instead of drawing the plot on your plotter.

-FONT=3

draws all text characters on the plot using Font 3 (see Appendix I).

-COLor=1

draws the entire plot with the pen in stall 1.

These options let you expand or reduce the plot (zoom), move it in either direction (pan), or rotate it 90 degrees (rotate).

-SCAle=1.2

expands the plot by 20 percent by resetting the scaling factor (normally 1.0) to 1.2 (zoom in). You can expand the axes independently with -XSCAle and -YSCAle. Numbers less than 1.0 contract the plot (zoom out).

-XPAN=30.0

moves the plot to the right by 30 platen units (pan right).

-YPAN=30.0

moves the plot up by 30 platen units (pan up).

-PORtrait

rotates the plot 90 degrees. Usually, plots are displayed with the horizontal axis longer than the vertical (landscape). Note that plots are reduced or enlarged, depending on the platen size, to fill the page.

Printed: April 22, 1996 15:53 (1162)