Newfetch

NEWFETCH


FUNCTION

Newfetch copies GCG sequences, sequence fragments, or data files from the GCG database into your directory, or displays them on your terminal screen. You can specify the range of the sequence to copy.


DESCRIPTION

The expression % newfetch *bov* will retrieve every GCG data file or sequence entry whose name contains the string bov. Sequence specification is described in detail in Chapter 2, Using Sequences of the User's Guide.

When copying a sequence from a database, Newfetch creates a file in GCG format whose name is the entry name and whose extension is the database logical name. For example, % newfetch EMBL:Hsrep2 copies the requested sequence into a file called hsrep2.em_pr. In this example, the extension .em_pr indicates that the sequence was copied from the Primate division of the EMBL nucleotide sequence database. (See "Using Database Sequences" in Chapter 2, Using Sequences of the User's Guide for a complete listing of logical names for all GCG databases.) If the file being copied is not from a sequence database, for example enzyme.dat, then its name is not changed.
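The naming rule above can be illustrated with a minimal Python sketch. The function output_filename is hypothetical and not part of the Wisconsin Package. It assumes the logical name (here em_pr) is already known, and it lowercases the entry name only because the hsrep2.em_pr example suggests this.

```python
def output_filename(entry_name: str, logical_name: str) -> str:
    """Build the local filename for an entry copied from a sequence database.

    Hypothetical helper: the entry name becomes the file name and the
    database logical name becomes the extension.
    """
    # Lowercasing is an assumption drawn from the hsrep2.em_pr example.
    return f"{entry_name.lower()}.{logical_name.lower()}"


print(output_filename("Hsrep2", "em_pr"))
```

Mapping a database name such as EMBL to the logical name of the right division is left out, because that lookup depends on the local installation.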

If your sequence specification contains no logical name, Newfetch looks in all the databases and in all the GCG data directories to find all possible entry names. For example, % newfetch hum* would do almost the same thing as % newfetch GenBank:hum*, except that if any sequences beginning with hum were present in databases other than GenBank or in any GCG data directories, they would also be retrieved.

Special Considerations for Searching

Keep in mind that filenames are case sensitive and database entry names are case insensitive. Because this program searches for both filenames and database entry names, you must take care when you enter the character pattern that makes up your specification.

For example, if you entered Gamma* as a file specification, this program would find all entries in the databases whose names begin with Gamma but no GCG-supplied files would be found. This is because all the files in the Wisconsin Package are named using lowercase letters. Conversely, if you entered gamma*, this program would find all of the entries in the databases and all the GCG-supplied files whose names begin with gamma.
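The difference between the two kinds of matching can be sketched in Python. This is only an illustration of the behavior described above, not Newfetch's actual search code. The entry names and filename in the lists are invented for the example.

```python
from fnmatch import fnmatchcase


def matches_entry_name(pattern: str, entry_name: str) -> bool:
    """Database entry names are compared without regard to case."""
    return fnmatchcase(entry_name.lower(), pattern.lower())


def matches_filename(pattern: str, filename: str) -> bool:
    """Filenames are compared exactly as typed (case sensitive)."""
    return fnmatchcase(filename, pattern)


def find(pattern, entry_names, filenames):
    """Collect every entry name and filename that the pattern matches."""
    hits = [name for name in entry_names if matches_entry_name(pattern, name)]
    hits += [name for name in filenames if matches_filename(pattern, name)]
    return hits


# Hypothetical names, used only to show the effect of case.
ENTRY_NAMES = ["GAMMA1", "Gamma2"]
FILENAMES = ["gamma.dat"]

print(find("Gamma*", ENTRY_NAMES, FILENAMES))  # entries only
print(find("gamma*", ENTRY_NAMES, FILENAMES))  # entries and the data file
```

Typing the pattern in lowercase is therefore the safer choice when you want both database entries and GCG-supplied files.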

(Note that it is often convenient to add -OUTfile=Term to the command line so the data are displayed on your terminal screen.)


AUTHOR

This GCG program was modified by Jack Leunissen (E-mail: jackl@caos.kun.nl Post: CAOS/CAMM Center, University of Nijmegen, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).


EXAMPLE

Here is a session using Newfetch to retrieve a local copy of the first 500 bases of the EMBL sequence hsldlr:

  
  
  % newfetch
  
   NEWFETCH uses any sequences
  
   NEWFETCH what sequence(s) ?  EMBL:hsldlr
  
  
   hsldlr.em_pr
  
              Start (* 1 *) ?
             End (*  5095 *) ?  500
  
  
  %
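
The effect of the Start and End answers in this session can be sketched in Python. The function extract_range is hypothetical, and the sketch assumes that positions are counted from 1 and that the End position is included in the fragment.

```python
def extract_range(sequence: str, begin: int, end: int) -> str:
    """Return the fragment from position begin through position end.

    Assumes 1-based, inclusive positions, as the Start/End prompts suggest.
    """
    if not 1 <= begin <= end <= len(sequence):
        raise ValueError(f"range {begin}..{end} is outside 1..{len(sequence)}")
    # Python slices are 0-based and exclude the stop index.
    return sequence[begin - 1:end]


# A placeholder sequence with the same length as the entry in the session.
placeholder = "A" * 5095
fragment = extract_range(placeholder, 1, 500)
print(len(fragment))
```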
  


COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

  
  
  Minimum Syntax: % newfetch [-INfile=]GenEMBL:Humhb*
  
  Prompted Parameters:
  
  -BEGin=1 -END=506        the range of interest
  
  Local Data Files: None
  
  Optional Parameters:
  
  -OUTfile=FileName   copy file(s)-sequence(s) into one file
  -DOCLines=6         copies only the first 6 non-blank lines of documentation
  -NOMONitor          suppresses the screen monitor
  -REFerence          copies only the documentation
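
The abbreviation rule for qualifier names (the capitalized letters are the ones you must type) can be sketched in Python. The helpers required_prefix and accepts are hypothetical. The sketch assumes that the capital letters always form the start of the name and that what you type is compared without regard to case.

```python
def required_prefix(qualifier: str) -> str:
    """Return the leading run of capital letters, e.g. 'OUT' for '-OUTfile'."""
    name = qualifier.lstrip("-")
    count = 0
    while count < len(name) and name[count].isupper():
        count += 1
    return name[:count]


def accepts(qualifier: str, typed: str) -> bool:
    """Check whether the typed text is a valid abbreviation of the qualifier."""
    name = qualifier.lstrip("-").lower()
    word = typed.lstrip("-").lower()
    minimum = required_prefix(qualifier).lower()
    # The typed word must cover the required letters and not stray from the name.
    return word.startswith(minimum) and name.startswith(word)


print(accepts("-OUTfile", "-out"))    # the required letters only
print(accepts("-OUTfile", "-outf"))   # more than the required letters
print(accepts("-DOCLines", "-doc"))   # too short: DOCL is required
```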
  


LOCAL DATA FILES

None.


OPTIONAL PARAMETERS

The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

-DOCLines=6

sets GCG programs to copy only six non-blank lines of documentation from input data files into the output files. Use the % doclines command to set this parameter for your whole session. By default, Newfetch copies all of the documentation from each sequence entry into your new files exactly as it appeared in the original entry. It also records in the output file the region from which the fragment was extracted.
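
The effect of limiting the documentation can be sketched in Python. The function first_doc_lines is hypothetical. The sketch assumes that a blank line is one containing only whitespace and that blank lines are dropped rather than copied, which the description above does not state.

```python
def first_doc_lines(doc_lines, limit=6):
    """Return the first `limit` non-blank documentation lines."""
    kept = []
    for line in doc_lines:
        if len(kept) == limit:
            break
        if line.strip():  # skip lines that are empty or whitespace only
            kept.append(line)
    return kept


# Invented documentation lines, used only for illustration.
documentation = ["ID   example", "", "DE   first line", "   ", "DE   second line"]
print(first_doc_lines(documentation, limit=2))
```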

-OUTfile=filename

copies the sequence(s) and/or data file(s) into one file which you can name. If you leave out the name of the file, Newfetch prompts you for one. (Wisconsin Sequence Analysis Package(TM) programs will not read files containing more than one sequence unless they are in an MSF (multiple sequence format) file.)

It is often useful to use Term for the filename so that the data are displayed on your terminal screen.

-REFerence

copies only the documentation for the sequence or data file. Unless specified, the name of the output file is the entry name concatenated with _ref, followed by the database logical name as the extension.

-MONitor

This program normally monitors its progress on your screen. However, when you use the -Default option to suppress all program interaction, you also suppress the monitor. You can turn it back on with this option. If your program is running in batch, the monitor will appear in the log file. If the monitor is slowing the program down, suppress it with -NOMONitor.

Printed: April 22, 1996 15:54 (1162)