ELineUp is a screen editor for editing multiple sequence alignments. You can edit up to 500 sequences simultaneously. New sequences can be typed in by hand or added from existing sequence files. A consensus sequence identifies places where the sequences are in conflict.
ELineUp lets you edit several overlapping or aligned sequences simultaneously. ELineUp allows you to edit sequences in the context of an alignment to help you see the effect of your changes on the alignment.
As in SeqEd, you can move the cursor with the arrow keys and insert or delete symbols or gaps in the sequences. In ELineUp the cursor can travel from one sequence to another. You can add new sequences by hand or from existing sequence files, and you can move sequences from one position to another.
ELineUp provides a surface on which you can arrange and edit many sequences. This surface resembles a piece of graph paper with 31 rows and as many columns as you need. The screen acts as a window behind which the ELineUp surface is scrolled.
Sequences can be placed anywhere on the surface as long as two sequences in the same row do not collide. Several sequences can be placed on the same row.
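The placement rule above can be illustrated with a small sketch. It is not ELineUp's code: the function name collides and the representation of a row as a list of (start column, length) pairs are assumptions made for this example only.

```python
def collides(row_contents, start, length):
    """Return True if a new sequence would overlap a sequence already on the row.

    row_contents is a hypothetical list of (start_column, length) pairs
    describing the sequences already placed on one surface row.
    """
    end = start + length  # first column past the new sequence
    return any(start < s + n and s < end for s, n in row_contents)


# A row holding one sequence that occupies columns 1 through 10.
row = [(1, 10)]
print(collides(row, 11, 5))  # new sequence starts just past the old one
print(collides(row, 10, 5))  # new sequence would share column 10
```

Two sequences can share a row whenever this check is false for every sequence already on that row.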
Sequences placed on the ELineUp surface become part of a sequence group. A new sequence group is formed by running ELineUp with a new sequence group name. Sequences already stored in files can be placed anywhere on the surface with the Get command. New sequences (not already in sequence files) can be typed in anywhere on the surface.
When you end a session with ELineUp it writes out each sequence in a file and then writes a list file with the name and position of each sequence in the group. (See Chapter 2, Using Sequences in the User's Guide for more information about list files.) When you edit the group again, the sequences reappear on the ELineUp surface where you left them.
You can have a consensus sequence display the dominant character at each column where sequences overlap. The consensus uses uppercase where overlapping sequences are in agreement; it uses lowercase to show disagreement and periods to show where there is no consensus at all.
This GCG program was modified by Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
Here is a session using ELineUp to edit the same sequence group displayed in the example session for GCG's Pretty program. First use Fetch to copy the files *.frg and picorna.fil to your default directory.
% elineup picorna.fil
 R2            Column: 1     Row: 1     No AutoCons     FOSN: PICORNA     Protein

 11: ................................ttttgesad.pvtttve....n..yggdt.q....vq
 10: ................................ttatgesad.pvtttve....n..ygget.q....vq
  9: ................................ttsagesad.pvtttve....n..ygget.q....iq
  8: gvenae.kgvtentna.tadfvaqpvylpe.nqt......kv.affynrs...spi.gaftvks.....
  7: glgqmlesmi.dntvretvgaatsrdalpnteasgpthskeipaltavetgatnplvpsdtvqtrhvvq
  6: glgqmlesmi.dntvretvgaatsrdalpnteasgpahskeipaltavetgatnplvpsdtvqtrhvvq
  5: gigdmiegav.egitknalvpptstnslpghkpsgpahskeipaltavetgatnplvpsdtvqtrhviq
  4: giedliseva.qgal..tlslpkqqdslpdtkasgpahskevpaltavetgatnplapsdtvqtrhvvq
  3: ...gpvedai.......t..aaigr..vadtvgtgptnseaipaltaaetghtsqvvpgdtmqtrhvkn
  2: glgdeleevivekt.kqtv.asi.........ssgpkhtqkvpiltanetgatmpvlpsdsietrttym
  1: ...npvenyidevlnevlv........vpninssnpttsnsapaldaaetghtssvqpedvietryvqt
     ..|.........|.........|.........|.........|.........|.........|.........
       0        10        20        30        40        50        60

 "picorna.fil" successfully loaded.
ELineUp does not insist that all your sequences start in the same column, but this is a requirement of GCG's Pretty.
To create a new sequence group, use the ELineUp command with a new group name such as myseqs. If you use % elineup myseqs, ELineUp looks in your current directory for the file myseqs.fil. If you use the command % elineup -MSF myseqs, then ELineUp looks for the file myseqs.msf. If it doesn't find a file with this name, ELineUp starts a new group with one sequence, the consensus, having the same name as the group. (If you do not want to have a consensus sequence in your group, run ELineUp with the command-line option -NOCONsensus.) To construct the group, use the Get command to add sequences from existing sequence files or use the New command so ELineUp lets you type in a new sequence. ELineUp prompts you for a unique name of up to ten characters for each new sequence.
You can start ELineUp with the name of an existing group. If you have a file of sequence names called myseqs.fil, which was created in a previous session with ELineUp, use % elineup myseqs. If you have a multiple sequence format (MSF) file called myseqs.msf, which was created in a previous session with ELineUp, use % elineup -MSF myseqs. You may specify a file name extension if the default extension ELineUp adds is not appropriate. You can also use any single or multiple sequence specification as input to ELineUp. Multiple sequences can be specified as a list file or as a sequence specification using a wildcard. (See Chapter 2, Using Sequences in the User's Guide for help in specifying sequences.) ELineUp loads the sequences into the multiple sequence editor and starts with its window at the left end of the group. You can add more sequences, modify existing ones, delete sequences, rename sequences, and move any sequence to a new position.
In Screen Mode, commands are typically single keystrokes. Except for the search command, Screen Mode commands do not require a <Return>.
In Screen Mode, the cursor shows your position in one of the sequences in the group. You can insert any valid GCG sequence symbol (see Appendix III) into the sequence by typing the symbol. It is inserted at the cursor.
To move the cursor to the right one symbol, use the right arrow key.
You can type a number followed by a
You can use the angle brackets to skip 50 characters to the left or right. If you precede the angle bracket by a number, the cursor skips that many characters and continues to do so until you change the number.
In contrast with the horizontal arrows,
if you precede either the
When the cursor is at the left end of a sequence, you can move the sequence to the right with the space bar and to the left with the delete key.
To search for a pattern, type a / (slash) in Screen Mode. You are prompted for the sequence pattern you wish to find. ELineUp only searches the current sequence. You can repeat the last search by simply using / without typing a new pattern.
The command-line option -NUCleotide or the NUCleotide command in Command Mode makes ELineUp treat all nucleic acid sequences as circular and find your pattern even if it wraps from the end of the sequence into the beginning. ELineUp uses the same rules for pattern definition and recognition as the FindPatterns, MapPlot, Map, and MapSort programs.
The command-line option -PROtein or the PROtein command in Command Mode makes ELineUp searches linear and disables the nucleic acid ambiguity meanings of the GCG sequence symbols; it also changes the way the consensus sequence is defined (see the topic THE CONSENSUS SEQUENCE below).
Even if ELineUp thinks your sequence is a nucleotide sequence you can request a perfect match by typing an = right after the /. So if you type /=RTC only RTC is matched, whether you have a protein or a nucleotide sequence.
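The search behavior described above can be sketched roughly as follows. This is an illustration, not ELineUp's implementation: the function find_pattern, its 0-based return value, and the simplification that an ambiguity symbol in the pattern matches only the plain bases it stands for are all assumptions. The ambiguity table holds the standard IUB nucleotide codes.

```python
import re

# Standard IUB nucleotide ambiguity codes.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_pattern(seq, pattern, nucleotide=True):
    """Return the 0-based index of the first match, or -1 if there is none.

    A leading '=' requests a perfect symbol match. In the nucleotide state
    the sequence is searched as a circle; in the protein state it is linear.
    """
    exact = pattern.startswith("=")
    if exact:
        pattern = pattern[1:]
    if not pattern:
        return -1
    seq_u, pat_u = seq.upper(), pattern.upper()
    if nucleotide and not exact:
        # Expand each ambiguity symbol into a character class.
        regex = "".join(
            "[" + IUB[c] + "]" if c in IUB else re.escape(c) for c in pat_u
        )
    else:
        regex = re.escape(pat_u)
    # Circular search: append the start of the sequence to its end.
    text = seq_u + seq_u[: len(pat_u) - 1] if nucleotide else seq_u
    match = re.search(regex, text)
    return match.start() if match else -1


print(find_pattern("ttgagctt", "GARC"))   # ambiguity match on gagc
print(find_pattern("ttgagctt", "=GARC"))  # perfect match requested, none found
print(find_pattern("gcttga", "GAGC"))     # match wraps around the end
```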
Here is the summary of Screen Mode commands you would see in the on-line help:
Screen Mode

[n] is an optional numeric parameter.

G, A, T, C ....  - inserts a sequence character
                 - deletes a sequence character, "drags" a sequence to the
                   left if cursor is at its start
                 - "pushes" a sequence to the right if cursor is at its start
/TAACG           - finds the next occurrence of "TAACG", last pattern is the
                   default when none is specified
[n]              - move ahead [n characters]
[n]              - move back [n characters]
[n]              - move up to next sequence [or to row specified]
[n]              - move down to next sequence [or to row specified]
[n]              - move to column n
H                - move to start of current sequence
E                - move to end of current sequence
R                - redraw the screen
D                - enter Command Mode
I                - push over all seqs starting past current column
P                - pull over all seqs starting past current column
[n]<             - move 50 [or n] positions to left
[n]>             - move 50 [or n] positions to right
In Command Mode, you enter commands followed by a <Return>. ELineUp command editing is modeled on VMS DCL command line editing. ELineUp lets you modify and execute previous commands. If you simply press <Return> without typing a command, ELineUp returns to Screen Mode.
Only the capitalized portion of the commands described in the documentation below must be typed.
Some commands may be preceded or followed by optional numeric parameters or a file name. The square brackets ([ and ]) in the documentation below show optional command arguments: s and f refer to starting and finishing rows or offsets on the surface; x and y refer to offset and row coordinates. When an optional parameter is omitted, some commands prompt you for the value. Other commands make default assumptions that are explained in each command description.
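The abbreviation rule (only the capitalized portion must be typed) can be sketched as below. The command spellings come from the Command Mode summary; the matching logic itself is an assumption, including the ideas that input is case-insensitive and that any longer prefix of the full name is also accepted.

```python
# A subset of command names as spelled in the Command Mode summary.
COMMANDS = ["Get", "New", "MOve", "REMove", "REName", "REDraw", "HEAding",
            "SPacewalk", "SLide", "SUMmary", "FOSN", "MSF", "Write", "Quit"]


def required_length(template):
    """Count the leading capital letters, the part that must be typed."""
    n = 0
    for ch in template:
        if not ch.isupper():
            break
        n += 1
    return n


def resolve(typed):
    """Return every command that the typed abbreviation could stand for."""
    t = typed.upper()
    return [c for c in COMMANDS
            if len(t) >= required_length(c) and c.upper().startswith(t)]


print(resolve("rem"))  # enough letters to select REMove
print(resolve("re"))   # too short for REMove, REName, or REDraw
print(resolve("w"))    # a single letter is enough for Write
```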
Missing Position Parameters: Spacewalk
Several commands need position parameters to know where to put a sequence. If these parameters are omitted, ELineUp enters a mode, called Spacewalk, that allows you to move the cursor anywhere on the surface to select a position for the new sequence.
In Spacewalk Mode,
the arrow keys and
If a required name is omitted or illegal, ELineUp prompts you for a name. If you respond with a blank name, ELineUp cancels the command.
You must have write privileges in your current working directory to use ELineUp; otherwise, ELineUp will not accept any name you try to give a sequence.
Often when it prompts for sequence or file names, ELineUp presents a default value in a manner different from other GCG programs; when the prompt appears, it looks like you have already typed in the default value.
You can just press
Here is the summary of Command Mode commands you would see in the on-line help:
Command Mode

x and y represent numbers for column and row.
Only the capitalized part of the command is necessary.

[x,y] Get [filename]     - add sequence [at position x,y] [from filename]
[x,y] New [seqname]      - add empty sequence [at position x,y] [named seqname]
[x,y] MOve [seqname]     - move current or specified sequence [to x,y]
REMove [seqname]         - delete current or specified sequence entirely
REName [old] [new]       - change sequence name (changing consensus name
                           changes the group)
REDraw                   - redraw the screen
HEAding [seqname]        - edit documentary heading of current or specified
                           sequence
screen                   - enter screen mode (pressing <Return> is sufficient)
NUCleotide               - use nucleotide ambiguity codes in find and consensus
PROtein                  - do not use nucleotide ambiguity codes
SPacewalk                - use spacewalk to position sequences
NOSPacewalk              - DO NOT use spacewalk to position sequences
FOSN                     - use list file format when writing
MSF                      - use multiple sequence format files when writing
[n] SLide                - add n to all sequence columns
[s,f] ROWMove [n]        - move a set of rows (s to f) up or down [n rows]
[s,f] PRint [filename]   - write the sequence group to a Pretty format file
SUMmary [filename]       - write the sequence names and positions in a file
                           or on the terminal screen
GOto [seqname]           - put cursor on start of named sequence
[s,f] CONSensus          - calculate consensus [from s to f]
AUtoconsensus            - automatically calculate consensus (slow)
NOAUtoconsensus          - turn off automatic consensus
FLip                     - reverse complement the current group
ZIp [filename]           - align and gap a sequence to the current group
Write [filename]         - write the current sequence group to a file
EXit [filename]          - write the current group to a file and stop
Quit                     - quit the editor without writing out the group
:[x,y] Get [filename]
adds the sequence in the specified file to the group at column x in row y. The screen is erased and you are prompted to enter the range and strand. Unlike the Write and EXit commands, Get does not assume any file extension. You must type the file name plus any extension it requires.
:[x,y] New [SeqName]
adds an empty sequence at column x in row y.
:[x,y] MOve [SeqName]
moves the sequence to start at column x in row y. If the SeqName parameter is omitted, the sequence at the current cursor position is moved.
:REMove [SeqName]
deletes the entire sequence from the group. If the SeqName parameter is omitted, the sequence at the current cursor position is removed.
:REName [OldName] [NewName]
changes the name of the sequence. If no names are provided in the command, the sequence at the current cursor position is renamed and you are prompted for the new name. If only one name is provided, it is assumed to be the old name and you are prompted for the new name.
:REDraw
redraws your terminal screen. This is useful if noise in the line between your terminal and the computer has changed the screen in some unreasonable way or if a system message appears on your screen.
:[s] HEAding [SeqName]
enters Heading Mode to let you view and edit the documentary heading.
You can modify any part of the heading.
Heading Mode is terminated with
:screen
returns your session to Screen Mode. Just pressing <Return> is sufficient.
:NUCleotide
sets the sequence type for each sequence in the sequence group to be nucleotide. This enables the nucleic acid ambiguity meanings of the GCG sequence symbols in pattern searches (with /) and consensus definition (see the topic THE CONSENSUS SEQUENCE below). Also, ELineUp treats nucleic acid sequences as circular when searching for a pattern. When the sequences are saved to files with either the Write or EXit command, they are written as nucleotide sequences if their sequence type is nucleotide.
:PROtein
sets the sequence type for each sequence in the sequence group to be protein. This forces ELineUp to treat all sequences as linear in pattern searches and not to interpret any sequence characters as nucleotide ambiguity symbols in pattern searches and consensus definition (see the topic THE CONSENSUS SEQUENCE below). When the sequences are saved to files with either the Write or EXit command, they are written as protein sequences if their sequence type is protein.
:SPacewalk
enters Spacewalk Mode, which allows you to move the cursor anywhere on the surface to select a position for a new sequence.
:NOSPacewalk
tells ELineUp not to use Spacewalk Mode but to prompt for numerical surface coordinates.
:FOSN
tells ELineUp to use the list file format when loading or storing the sequence group.
:MSF
tells ELineUp to use the multiple sequence format (MSF) file when loading or storing the sequence group.
:[s,f] PRint [filename]
writes a file of the formatted sequence group from position s to f. The format resembles that of Pretty.
:SUMmary [filename]
writes a list of the names and beginning positions of the sequences loaded into the ELineUp editor. This list can go either to a file or to your screen (by typing Term for filename).
:[n] SLide
shifts all the sequence starting positions by n. The coordinate ruler appears to slide under the sequences. n can be either a positive or negative number to shift the sequences to the right or left, respectively.
:[s,f] ROWMove [n]
moves a clump of rows up or down. The sequences on rows numbered from s to f are moved up n rows. Negative values of n move the sequences down n rows. This command can be used to open a row in the middle of the surface for another sequence. ELineUp will not let you move sequences onto rows containing other sequences not simultaneously being moved.
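The ROWMove rule can be sketched as follows. This is a hypothetical model, not ELineUp's code: the surface is represented as a dictionary from row number to the sequence names on that row, rows are assumed to be numbered 0 to 30, and "up" is assumed to mean a larger row number.

```python
def row_move(rows, s, f, n, max_row=30):
    """Move the sequences on rows s..f up by n rows (down if n is negative).

    rows maps a row number to the list of sequence names on that row.
    Raises ValueError if a moved row would leave the surface or land on a
    row holding sequences that are not being moved at the same time.
    """
    moving = {r: names for r, names in rows.items() if s <= r <= f and names}
    staying = {r: names for r, names in rows.items()
               if not (s <= r <= f) and names}
    result = dict(staying)
    for r, names in moving.items():
        target = r + n
        if not 0 <= target <= max_row:
            raise ValueError("row %d would leave the surface" % r)
        if target in staying:
            raise ValueError("row %d is occupied" % target)
        result[target] = names
    return result


surface = {1: ["r1"], 2: ["r2"], 3: ["r3"]}
# Open row 2 for another sequence by moving rows 2 and 3 up one row.
print(row_move(surface, 2, 3, 1))
```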
:GOto [SeqName]
moves the cursor to the beginning of the named sequence.
:[s,f] CONsensus
calculates the consensus sequence between positions s and f. If the optional positions are omitted, the entire consensus is calculated. This command only works when ELineUp is not in the Auto Consensus state. (See the topic THE CONSENSUS SEQUENCE below for further details.)
:AUtoconsensus
makes ELineUp recalculate the consensus sequence each time there is a change in any of the other sequences. When ELineUp is in the Auto Consensus state, the consensus is strictly a function of the other sequences and cannot be changed directly. Unfortunately, when the sequence group is large, recomputing the consensus uses a lot of machine time and makes ELineUp appear sluggish.
:NOAUtoconsensus
turns off the Auto Consensus state. This allows you to change the consensus directly.
:ZIp [filename]
aligns a new sequence to the current consensus.
:Write [filename]
records the current surface configuration in a list file and saves the current version of each sequence in a file if the program is in FOSN mode (see the FILE NAME CONVENTIONS topic below). If the program is in MSF mode, a multiple sequence format (MSF) file is written. If the filename parameter is omitted, ELineUp uses the sequence group name specified when the program is initially run. If you specify a file in another directory, all files are created there.
:EXit [filename]
works like the Write command but stops the session after writing out the sequences. The filename parameter behaves as in the Write command.
:Quit
terminates a session with ELineUp without saving any changes you've made since the last time you used the Write command.
shows the commands available in Screen and Command Modes of ELineUp.
Heading Mode allows you to view and edit the documentation that precedes the sequence in the sequence file. All headings are lost if you write the sequences into a multiple sequence format (MSF) file.
To enter Heading Mode, use the HEAding command.
You can move around using the arrow keys and make insertions and deletions as you wish. Although the editing window is only twenty lines long, it scrolls over the heading vertically to let you see and modify any part.
Like many text editors,
typing inserts text at the cursor and the
Unlike many text editors, before letting you edit the heading, ELineUp asks you if you need more storage. You must enter the maximum number of lines that you expect to have to add. If you are in Heading Mode and find you do not have enough storage for your changes and additions, you can exit Heading Mode and enter it again, specifying some larger number of lines for increased storage.
ELineUp behaves differently depending on whether you are working with a protein or nucleotide sequence group.
If you are working with a nucleotide sequence group, then pattern searches (see "Finding Patterns" under the SCREEN MODE topic) and the consensus definition (see the topic THE CONSENSUS SEQUENCE) assume the IUB nucleotide ambiguity meanings for the GCG sequence symbols. Also, ELineUp treats nucleic acid sequences as circular when searching for patterns. When the sequences are saved to files with either the Write or EXit command, they are written as nucleotide sequences if their sequence type is nucleotide.
If you are working with a protein sequence group, then ELineUp treats all sequences as linear in pattern searches and does not interpret any sequence characters as nucleotide ambiguity symbols in pattern searches and the consensus definition. When the sequences are saved to files with either the Write or EXit command, they are written as protein sequences if their sequence type is protein.
By default, if the first sequence entered into the ELineUp editor screen is from an existing sequence file, then the type of that sequence determines the type for the entire group. If the first sequence in a sequence group is entered interactively from the keyboard, then ELineUp sets the sequence type for the entire sequence group to be protein, by default. ELineUp indicates the type of the sequence group (protein or nucleotide) in the upper-right corner of the editor screen.
You can specify the sequence type for the entire group from the command line with the -PROtein and -NUCleotide command-line options. Once you are viewing the editor screen, you can change the sequence type for the entire sequence group with the PROtein and NUCleotide command mode commands.
An optional consensus sequence can be generated as a function of the rest of the sequences in a sequence group, or, like any other sequence, typed in by you.
By default, new sequence groups contain an empty consensus at row 0 unless ELineUp is run with the -NOCONsensus command-line option.
If the sequence group has no consensus, you can create one using the New command and giving the new sequence the same name as the sequence group. The CONsensus command and the AUtoconsensus commands now work on the row you have designated for the consensus with the New command.
When a consensus sequence is generated by ELineUp, either by issuing the CONsensus command or the AUtoconsensus command, each character in the consensus sequence is replaced with a character that is a function of the other characters in its column. If all the characters in the column are the same letter and at least one of them is uppercase, the consensus character is the uppercase equivalent of that letter. If there is more than one letter in the column, but one occurs more frequently than any other, or if all letters are the same, but none are uppercase, then the consensus character is the lowercase of that letter. Otherwise, the consensus character is a dot (.).
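The column rule just described can be written out as a short sketch. It follows the rule as stated for a protein sequence group only. The function names are made up for this example, and two details are assumptions: letters are compared without regard to case, and non-letter symbols such as gaps are ignored when counting.

```python
from collections import Counter


def consensus_char(column):
    """Apply the consensus rule to the characters of one column."""
    letters = [c for c in column if c.isalpha()]
    if not letters:
        return "."
    ranked = Counter(c.lower() for c in letters).most_common()
    if len(ranked) == 1:
        # Only one letter present: uppercase if any occurrence is uppercase.
        letter = ranked[0][0]
        return letter.upper() if any(c.isupper() for c in letters) else letter
    if ranked[0][1] > ranked[1][1]:
        # One letter is strictly more frequent than every other.
        return ranked[0][0]
    return "."


def consensus(sequences):
    """Build a consensus for equal-length sequences that start in one column."""
    return "".join(consensus_char(col) for col in zip(*sequences))


print(consensus(["GATT", "GAtA", "gaCA"]))
print(consensus(["ga", "gt"]))
```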
The consensus definition also depends on whether ELineUp is working with a nucleotide or a protein sequence group. If a protein sequence group is loaded into the multiple sequence editor, the above description is complete. If a nucleotide sequence group is loaded into the editor, the ambiguity codes are ignored for the purpose of consensus definition. This treats all ambiguity codes as though they were the code 'N.' ELineUp indicates the type of the sequence group (protein or nucleotide) in the upper-right corner of the editor screen.
The consensus sequence is distinguished by having the same name as the sequence group. If you rename the consensus sequence with the REName command, the name of the sequence group changes as well. (You can rename the group even if you have no consensus sequence.) Conversely, if you specify a file name in a Write or EXit command, this changes the name for the sequence group being saved and also changes the consensus sequence name.
The consensus sequence is unique in that, because it will likely extend to all columns determined by the other sequences, no other sequence may share its row. You can delete the consensus sequence from your group, and you can later create a new consensus sequence. However, an existing sequence cannot become the consensus sequence, either through the REName or Get commands.
If you use ELineUp with sequence groups that have different sequences starting at several different columns, a problem will arise when you make deletions or insertions to whole columns. For example, suppose you are assembling a group of sequences with ELineUp. When a new sequence is added, you may decide a previous sequence reading was incorrect. You may decide to delete a base from an old sequence near the left end of the assembly. The tail of that sequence slides left one column, destroying its alignment with any sequence starting to the right of the deletion site.
The general problem is that insertions or deletions cause shifts of register between sequences. Those sequences that overlap with the changed column appear to need adjustment. But those sequences that start farther to the right do not appear misaligned.
ELineUp warns you of potential alignment problems by producing a warning sign at the top of the screen; the sign differs depending on whether there was an insertion or deletion. The warning is only displayed if there are other sequences that start to the right of your change.
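The register shift and its repair can be illustrated with a toy model. Nothing here is ELineUp's code: the group is represented as a dictionary from sequence name to a (start column, text) pair, and the function names are invented for this sketch.

```python
def delete_symbol(group, name, column):
    """Delete the symbol of one sequence at a surface column.

    Returns True when a warning is due, that is, when some other sequence
    starts to the right of the change and is now out of register.
    """
    start, text = group[name]
    i = column - start
    group[name] = (start, text[:i] + text[i + 1:])
    return any(s > column for n, (s, _) in group.items() if n != name)


def pull_over(group, column):
    """Shift every sequence starting past the column one position left."""
    for name, (start, text) in list(group.items()):
        if start > column:
            group[name] = (start - 1, text)


# "b" is aligned under the last three symbols of "a" (columns 5 to 7).
group = {"a": (1, "GATTACA"), "b": (5, "ACA")}
if delete_symbol(group, "a", 2):   # the tail of "a" slides left one column
    pull_over(group, 2)            # "b" follows, restoring the alignment
print(group)
```

Sequences that overlap the changed column are left alone by this sketch, just as the text above says they need individual attention.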
To make it easy for you to correct the alignment problem,
ELineUp provides you with screen mode commands to PULLOVER the sequences starting to the right of the cursor (
By default, ELineUp reads and writes individual sequence files, grouped in a list file (FOSN format). Using the command-line option -MSF causes ELineUp to expect a multiple sequence format (MSF) file when reading a sequence group, and to write out an MSF file when storing a sequence group (MSF format). For instance, the command % elineup -MSF hsp70 reads the sequences in the file hsp70.msf into the ELineUp editor and names the sequence group hsp70. (See Chapter 2, Using Sequences in the User's Guide for a complete description of MSF files.) When ELineUp writes an MSF file, leading gap characters (.) are added to those sequences that do not start at the beginning of the alignment so that all sequences are left-justified in the output file.
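The leading-gap padding applied when an MSF file is written can be sketched as below. The function name and the (start column, text) representation are assumptions for this example, as is the choice to treat the leftmost starting column in the group as the beginning of the alignment.

```python
def left_justify(group, gap="."):
    """Pad each sequence with leading gaps so that all start in one column.

    group maps a sequence name to a (start_column, text) pair.
    """
    first = min(start for start, _ in group.values())
    return {name: gap * (start - first) + text
            for name, (start, text) in group.items()}


print(left_justify({"r1": (1, "GATT"), "r2": (3, "TT")}))
```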
The current sequence group format is indicated as either FOSN: or MSF: on the top line of the screen editor. You can toggle between these two formats using the FOSN and MSF commands in command mode.
There is no harm in using SeqEd to change a sequence file that has been written by ELineUp. Provided the name is the same, the new version is accepted by ELineUp. The only restriction on replacing members of a sequence group is that the new members must not overlap with other sequences on the same row. The information about where the sequence starts is stored in the list file, so changing the sequence file can only change the length of the sequence. You can change where a sequence starts on the surface by modifying the Offset and Row columns of its entry in the list file using a text editor. If you overlap two sequences on the same row, ELineUp refuses to load one of the overlapping sequences.
This version of ELineUp does not handle embedded comments. ELineUp can read files containing embedded comments, but the comments are lost and will not appear in any file written by ELineUp.
If your sequences all start at the same column, you can use GCG's Pretty to generate a consensus sequence for a sequence group created by ELineUp. Pretty uses a more sophisticated algorithm than ELineUp to generate a consensus sequence and you have more control over the consensus calculation. However, Pretty can only handle sequence groups whose left ends are aligned.
Pretty and ELineUp both know how to read the other's files of sequence names, so you can use Pretty to get a consensus sequence in Pretty format. Then, % pretty -UGLy makes a file of sequence names that ELineUp can read. However, the consensus sequence defined by Pretty will not be recognized as the consensus sequence of ELineUp. It is named "Consensus" by Pretty, whereas ELineUp names its consensus sequence with the sequence group name. This is reasonable, since ELineUp will not define the consensus in the same way, so the names should be different.
If you alternate between using Pretty and ELineUp on a sequence group having an ELineUp consensus sequence, you have to preserve the old sequence group name when doing Pretty -UGLy in order to make ELineUp recognize the consensus sequence. If you give a new name to the group, the consensus sequence is no longer recognizable, as such, by ELineUp.
Several indicators for ELineUp are displayed on the top row of the screen. The left-most word indicates the name of the sequence on which the cursor currently rests. Next, the cursor's position on the surface is displayed. Then the display shows whether ELineUp calculates the consensus automatically every time you add or delete a character. The sequence group name is next, preceded by either FOSN or MSF, indicating the file format to be used for reading and writing the sequence group. Finally, ELineUp indicates whether the type of the sequence group is nucleotide or protein.
ELineUp frequently displays the PULL-OVER WARNING sign (see the PULL-OVER AND PUSH-OVER topic above).
The screen provides a window onto the sequence surface. Through this window, 16 of the 31 surface rows can be viewed at one time. For example, as you move your cursor near the top row of the window, if there are occupied surface rows past the top of the window, the surface is scrolled down, letting you see more lines at the top of the window and fewer at the bottom.
When there are more rows in use than can be displayed at once, some rows are hidden above or below the window. When this happens a '+' is displayed next to the top or bottom row number indicating hidden rows in that direction.
Although the window also scrolls horizontally, there is no analogous sign indicating that you cannot see the whole length of the surface.
When you save a sequence group using FOSN format, the name given to the FOSN is made up of the sequence group name followed by the extension .fil. The sequence file names are the sequence names used in ELineUp and the file extension .frg. When you save a sequence group using MSF format, the name given to the MSF file is made up of the sequence group name followed by the extension .msf.
These file name extensions are the defaults for ELineUp, but you can specify your own by using the command-line options -FOSNEXtension, -FRAGEXtension, and -MSFEXtension (see below). You can override these choices when you specify an output file name; if you include a file extension, it is used in lieu of that given on the command line or the default.
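The naming conventions can be summarized in a small sketch. The function and its parameter names are hypothetical; the default extensions are the ones given above. The consensus file is left out because its naming is not described in this topic, and overriding the extension through the output file name is not modeled.

```python
def output_files(group, seq_names, msf=False,
                 fosn_ext="fil", frag_ext="frg", msf_ext="msf"):
    """List the files that saving a sequence group would be expected to write."""
    if msf:
        # MSF format: one file holds the whole group.
        return ["%s.%s" % (group, msf_ext)]
    # FOSN format: one list file plus one file per sequence.
    files = ["%s.%s" % (group, fosn_ext)]
    files += ["%s.%s" % (name, frag_ext) for name in seq_names]
    return files


print(output_files("hsp70", ["r1", "r2"]))
print(output_files("hsp70", ["r1", "r2"], msf=True))
```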
The current version of ELineUp cannot recover from a system crash. If you are disconnected from ELineUp, you lose everything you have done since the last time you saved the group using the Write or EXit commands. Therefore, we recommend that you save your work frequently using the Write command so that little is lost in the event of a crash.
ELineUp has a total of one million bytes of storage for sequences and their headings. While this is large, it is finite. If your sequence group exceeds this limit, the parameter MEMSIZE can be changed in the file EGenInclude:emem.inc and ELineUp can be recompiled and linked. If you make MEMSIZE larger, you may notice that the computer is doing a huge number of page swaps, making the program needlessly expensive and inefficient to run. If you suspect this is happening, you should discuss the possibility of raising the size of your working set with your system manager. This will reduce the number of page swaps required.
ELineUp was designed and implemented by Dr. William Winsborough. We are very grateful for the collaboration of Drs. William Boorstein and Lynn Manseau of the UW Department of Physiological Chemistry.
All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
Minimal Syntax: % elineup [-INfile1=]Picorna

Prompted Parameters: None

Local Data Files:

set.keys             (must be in your current working directory to be used)
-DATa=swgappep.cmp   scoring matrix for Zipping peptides
-DATa=swgapdna.cmp   scoring matrix for Zipping nucleic acids

Optional Parameters:

-MSF                 reads and writes sequence groups in MSF format
-SINGlecommand       automatically returns to screen mode after each command
-PROtein             sets sequence type to protein, and sets find to search
                     for perfect symbol matches
-NUCleotide          sets sequence type to nucleotide, and sets find to allow
                     nucleotide ambiguity code matches
-CONSROW=0           sets the consensus row for a new sequence group
-NOCONsensus         starts new sequence groups without a consensus row
-LINesize=50         sets line length for output with the PRint command
-BLOcksize=10        sets block length for output with the PRint command
-FRAGEXtension=frg   sets the file extension for each sequence when using
                     FOSN format
-CONSEXtension=con   sets the file extension for the consensus when using
                     FOSN format
-FOSNEXtension=fil   sets the file extension for the file of sequence names
                     when using FOSN format
-MSFEXtension=msf    sets the file extension for the multiple sequence file
                     when using MSF format
You can use the program SetKeys to create a set.keys file that tells the editors SeqEd, GelEnter, LineUp, and GelAssemble how to interpret the letters you type at the terminal. When entering gel readings, it is useful to have the symbols for G, A, T, and C under the fingers of one hand in the same positions as the lanes in your gel. SeqEd, GelEnter, LineUp, and GelAssemble automatically read the file set.keys if it is present in your local directory. If set.keys is absent, or if the sequence type is set to Protein (in SeqEd and LineUp, only) the terminal keys retain their conventional meanings.
If you have a set.keys file in your directory, SeqEd, GelEnter, LineUp, and GelAssemble only respond to the sequence characters that it redefines. You can edit the file set.keys with a text editor if some of the keys you want to use are not in it. Any keys not mentioned in set.keys appear to be dead.
Several keys are vital for the control of SeqEd, GelEnter, LineUp, and GelAssemble; this means you are not allowed to redefine the keys for /,
The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
-MSF
sets ELineUp to use MSF format. ELineUp reads a sequence group from an MSF (multiple sequence format) file and writes an MSF file when storing a sequence group. The default FOSN format reads and writes individual sequence files, grouped in a file of sequence names. (See Chapter 2, Using Sequences in the User's Guide for a complete description of MSF files.)
-SINGlecommand
sets ELineUp to return automatically to Screen Mode after every command in Command Mode. -NOSINGlecommand is the default.
-PROtein and -NUCleotide
set the sequence type for each sequence in the sequence group to be either protein or nucleotide. By default, if the first sequence in a sequence group is read from an existing sequence file, then the type of that sequence determines the type for the entire group. Also by default, if the first sequence in a sequence group is entered interactively from the keyboard, then ELineUp sets the sequence type for the entire sequence group to be protein.
You can change the sequence type for the entire group when you are in ELineUp with the PROtein and NUCleotide commands. PROtein tells ELineUp to make pattern searches using perfect symbol matches; NUCleotide tells it to allow nucleotide ambiguity code matches. For example, when ELineUp is in the nucleotide state and you type /GARC in Screen Mode, either of the patterns GAAC or GAGC is found. In the protein state, ELineUp treats sequences as linear and will not find patterns that start at the end and continue into the beginning. In the nucleotide state, sequences are searched as though they are circular.
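The difference between the two search states can be sketched in Python. This is only an illustration, not ELineUp's code: the function name is hypothetical, the table holds the standard IUPAC nucleotide ambiguity codes, and the sketch assumes the sequence being searched contains only unambiguous bases.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_all(pattern, sequence, nucleotide=True):
    """Return the 0-based start positions of every match of pattern.

    Nucleotide state: ambiguity codes in the pattern are expanded and the
    sequence is treated as circular. Protein state: symbols must match
    exactly and the sequence is treated as linear.
    """
    pattern, sequence = pattern.upper(), sequence.upper()
    n, m = len(sequence), len(pattern)
    if m == 0 or m > n:
        return []
    # A circular search may start at any position; a linear one must fit.
    last_start = n if nucleotide else n - m + 1
    hits = []
    for start in range(last_start):
        for k in range(m):
            symbol = sequence[(start + k) % n]  # wraps only in circular mode
            allowed = IUPAC.get(pattern[k], pattern[k]) if nucleotide else pattern[k]
            if symbol not in allowed:
                break
        else:
            hits.append(start)
    return hits
```

With this sketch, find_all("GARC", "TTGAACTTGAGCTT") reports both the GAAC and the GAGC site, and find_all("GAC", "ACTTTTG") finds a match that wraps around the end of the sequence only when nucleotide is true.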
The automatic consensus definition is also different in the protein state than in the nucleotide state. In the nucleotide state, ambiguity codes make no contribution to the consensus. They are treated as if they were all Ns and are ignored. In the protein state, all characters have the same status.
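A consensus rule of this kind can be sketched in Python. The exact rule ELineUp applies is not given here, so the sketch rests on stated assumptions: a column where every counted symbol agrees yields an uppercase letter, a column with a single most frequent symbol yields that symbol in lowercase, and a tie or an empty column yields a period. The function names are hypothetical.

```python
from collections import Counter


def consensus_symbol(column, nucleotide=True):
    """Return the consensus character for one column of an alignment.

    `column` holds the symbols of all sequences that cover the column.
    Non-letters (blanks, gap characters) are skipped, which is an assumption.
    """
    symbols = [c.upper() for c in column if c.isalpha()]
    if nucleotide:
        # Ambiguity codes make no contribution in the nucleotide state.
        symbols = [c for c in symbols if c in "ACGT"]
    if not symbols:
        return "."
    counts = Counter(symbols).most_common()
    if len(counts) == 1:
        return counts[0][0]          # full agreement: uppercase
    if counts[0][1] == counts[1][1]:
        return "."                   # tie: no consensus (assumed rule)
    return counts[0][0].lower()      # disagreement: lowercase


def consensus(rows, nucleotide=True):
    """Build a consensus for equal-length rows (pad short rows with blanks)."""
    return "".join(consensus_symbol(col, nucleotide) for col in zip(*rows))
```

Under these assumptions, the rows GATC, GATC, and GAGC give the consensus GAtC, and a column holding A, R, and N gives A in the nucleotide state but a period in the protein state.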
-NOCONsensus
tells ELineUp whether new sequence groups should start with a consensus sequence. (Remember that you can create or remove the consensus sequence at any time, so this is only a matter of convenience.) The default is -CONsensus.
-CONSROW=0
tells ELineUp on which row to put the consensus in a new sequence group. This parameter has an effect only if -NOCONsensus is not on the command line. The default is row 0.
-LINesize=50
sets the line length for pretty-style output created by the PRint command. The value must be in the range from 10 to 110. The default value is 50.
-BLOcksize=10
sets the block length (number of symbols between spaces) for pretty-style output created by the PRint command. The value must be in the range from 1 to the line size. The default value is 10.
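How line size and block size interact can be sketched in Python. This is an illustration only: the function name is hypothetical, and the real PRint output may carry names or position numbers that are not reproduced here. The range checks follow the limits stated above.

```python
def pretty_lines(sequence, line_size=50, block_size=10):
    """Split a sequence into lines of line_size symbols, with a space
    after every block_size symbols within each line."""
    if not 10 <= line_size <= 110:
        raise ValueError("line size must be in the range 10 to 110")
    if not 1 <= block_size <= line_size:
        raise ValueError("block size must be in the range 1 to line size")
    lines = []
    for start in range(0, len(sequence), line_size):
        line = sequence[start:start + line_size]
        blocks = [line[i:i + block_size] for i in range(0, len(line), block_size)]
        lines.append(" ".join(blocks))
    return lines
```

With the default values, a 120-symbol sequence would be written as two full lines of five blocks each and a final line of two blocks.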
-FRAGEXtension=frg
sets the file extension that ELineUp uses when reading and writing observation sequence files while in the FOSN format state. Do not include the dot separating the file from the extension. The default value is 'frg'.
-CONSEXtension=con
sets the file extension that ELineUp uses when reading and writing consensus sequence files while in the FOSN format state. Do not include the dot separating the file from the extension. The default value is 'con'.
-FOSNEXtension=fil
sets the file extension that ELineUp uses when reading and writing files of sequence names while in the FOSN format state. Do not include the dot separating the file from the extension. The default value is 'fil'.
-MSFEXtension=msf
sets the file extension that ELineUp uses when reading and writing multiple sequence format files while in the MSF format state. Do not include the dot separating the file from the extension. The default value is 'msf'.
Printed: April 22, 1996 15:52 (1162)