SeqLab - Editing Multiple Sequence Alignments Interactively

The basic concepts of Lists in SeqLab have been explained in the basic description of the interface. This subsection of the BioCompanion deals exclusively with the multiple sequence alignment features of SeqLab.

NOTE:

The original GCG documentation contains an extensive introduction to SeqLab. Users who intend to work with the GCG package frequently via the SeqLab interface are advised to study the "SeqLab Tutorial", which is also available as part of the GCG documentation.

Starting the Editor Mode

After the set-up of the GCG package is properly completed, the user may start SeqLab with

% seqlab

as described before. Next, the editing mode is launched by selecting it in the "Main Window". The section on sequence editing describes how to enter a sequence into SeqLab.

Coloring of Sequences

In edit mode, the coloring of the sequences can be used to display various properties. Among these, the sequence similarities generated by a plotsimilarity run and the feature definitions (from database sequences) are the most useful. Colleagues who work with amino acid alignments might find the residue coloring useful, but it is important to keep in mind that any manual sequence alignment might be biased and therefore inappropriate.

A small guided tour using SeqLab: Creation of a profile

As described in the profile section below, profile generation depends on creating a reasonable alignment first and then extending the profile as needed. The same is true for pattern searching, but in contrast to profile generation the GCG programs do not offer automatic pattern generation.
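To make the contrast concrete: a pattern is a fixed, hand-written description of the residues allowed at each position, whereas a profile carries position-specific scores. The following Python sketch is not part of GCG; it only illustrates how such a pattern might be derived by hand from alignment columns. The function name `pattern_from_alignment` and the aligned fragments are hypothetical and serve purely as illustration.

```python
import re


def pattern_from_alignment(aligned):
    """Build a regular expression with one element per alignment column.

    Columns that contain a gap character ('-' or '.') become a wildcard.
    """
    if len(set(len(s) for s in aligned)) != 1:
        raise ValueError("all aligned sequences must have the same length")
    parts = []
    for column in zip(*aligned):
        residues = sorted(set(column))
        if "-" in residues or "." in residues:
            parts.append(".")              # gap present: accept any residue
        elif len(residues) == 1:
            parts.append(residues[0])      # fully conserved position
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


# Artificial toy alignment (an assumption, not taken from any database entry).
toy_alignment = ["ACDGTW", "ASDGSW", "ACDGTW", "ATDGSW"]
pattern = pattern_from_alignment(toy_alignment)
print(pattern)
print(bool(re.search(pattern, "MMATDGTWKK")))
```

A pattern of this kind accepts or rejects a candidate outright, which is why a biased input alignment translates directly into missed or spurious hits.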

The following example shows one way to generate a profile for internal repeats in a protein. Readers who would like to experiment with these options should keep in mind that experimental evidence for such an internal repeat might be essential in order to build a profile.

The example uses human calmodulin as a starting point. It has four calcium binding sites which all share the same pattern, a so-called EF hand motif (derived from crystallographic data: an E helix and an F helix are interconnected by a loop that embraces the calcium ion). We will use SeqLab's feature table identification to take the four sites and align them for profile generation.
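The feature-based extraction that SeqLab performs interactively can be pictured as slicing a sequence at annotated coordinates. The following Python sketch assumes 1-based, inclusive feature positions, as used in SWISS-PROT feature tables. The sequence, the feature coordinates and the helper `extract_features` are hypothetical placeholders and do not reproduce the actual CALM_HUMAN entry.

```python
def extract_features(sequence, features):
    """Return {label: fragment} for 1-based, inclusive (start, end) ranges."""
    fragments = {}
    for label, (start, end) in features.items():
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"feature {label!r} lies outside the sequence")
        # Convert the 1-based inclusive range to a Python slice.
        fragments[label] = sequence[start - 1:end]
    return fragments


# Placeholder data for illustration only (not the real database entry).
toy_sequence = "MKTAYIAKQRDEDGSGTIELHHHHNNDKDGNGYIEW"
toy_features = {"site1": (11, 20), "site2": (27, 36)}

for label, fragment in extract_features(toy_sequence, toy_features).items():
    print(label, fragment)
```

Each extracted fragment corresponds to one of the new sequence entries created by copy and paste in the guided tour.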

Start with a single sequence
Launch SeqLab as usual, and load the sequence SWISSPROT:CALM_HUMAN using the GCG database browser as described.
Identify regions of interest
Change the Display mode to Graphic Features. The sequence will show four rectangular elements which represent the regions of interest. (Double-click on the boxes to see the feature explanation.)
Creating new sequence fragments
This step needs to be repeated four times, once for each region. After selecting the region of interest with a single mouse click, use the Copy button to put the sequence into the "copy" buffer. Add a new sequence to SeqLab by choosing New from the File pulldown. Paste your sequence fragment there.
Review the generated fragments
Change to residue coloring mode and review the four sequence fragments. You might want to align them by hand, but we will do this automatically in the next step. The sequences should all have approximately the same length.
Create the Alignment
Select the four sequences (click the first fragment, press SHIFT while clicking the last fragment), and select the pileup method from the Multiple Sequence Analysis section in the Functions pulldown. Proceed with all default options.
Map the Alignment output back to the Editor
In the Output Manager select the output and Add the alignment to the editor.
Identify Regions of Similarity
Run the plotsimilarity program from the Multiple Sequence Analysis section in the Functions pulldown. Proceed with the Options button and tick the option to generate a new colormap file. Run the program. Add the resulting color code to the editor by using the Add Color Table option in the Edit pulldown.
Select profile region
Use only the most prominent regions (i.e. the darkest ones): click with the mouse on the upper-left corner of the region and drag the rubber band over the region of interest. Next, use the profilemake program from the Multiple Sequence Analysis section in the Functions pulldown. Proceed with all default options after specifying that you want to use the selected fragments (in contrast to the entire regions).
Refine the profile
1) Run the profile search program with your newly generated profile. In the interest of speed, use Swissprot:*yeast as a subset of the protein database.
2) Load the resulting set *.pfs into the editor, and display the sequences in Graphic Features mode. Identify and isolate calcium-binding regions as desired and repeat the profile generation step as described above.
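
The similarity-guided selection and the profile construction in the steps above can be pictured with a small Python sketch. It is a deliberately simplified stand-in and does not reproduce the algorithms of plotsimilarity or profilemake: conservation is measured here as the fraction of the most common residue per column, and the "profile" is a plain table of residue frequencies. The aligned fragments, the threshold and all function names are hypothetical.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of the most common residue in each alignment column."""
    return [Counter(col).most_common(1)[0][1] / len(col)
            for col in zip(*aligned)]


def best_region(scores, threshold):
    """Longest run of consecutive columns scoring at least `threshold`.

    Returns a half-open (start, end) index pair; (0, 0) if nothing qualifies.
    """
    best = (0, 0)
    start = None
    # The sentinel value closes a run that extends to the last column.
    for i, score in enumerate(scores + [float("-inf")]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def frequency_profile(aligned, start, end):
    """Per-column residue frequencies for columns start..end-1."""
    profile = []
    for col in list(zip(*aligned))[start:end]:
        counts = Counter(col)
        profile.append({res: n / len(col) for res, n in counts.items()})
    return profile


def score_window(profile, window):
    """Sum of the profile frequencies of the residues in `window`."""
    return sum(col.get(res, 0.0) for col, res in zip(profile, window))


# Artificial aligned fragments (an assumption, not real database sequences).
toy_alignment = ["TLDGDGKV", "AMDGNGRL", "SLDGDGQI", "QVDGNGEF"]

scores = column_conservation(toy_alignment)
start, end = best_region(scores, threshold=0.5)
profile = frequency_profile(toy_alignment, start, end)
print(scores)
print(start, end)
print(score_window(profile, "LDGDG"))
```

Sliding `score_window` along a database sequence would give a crude analogue of the profile search in the refinement step; real profiles additionally use substitution scores and gap penalties, which this sketch omits.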
