EReverse reverses and/or complements a sequence. EReverse is a version of GCG's Reverse with command line control.
EReverse complements and/or reverses the symbols in a sequence. The complements of all of the supported IUB nucleic acid symbols are listed in Appendix III. The output is written into a new sequence file.
This GCG program was modified by Jaakko Hattula (Tampere University of Technology, Finland) and Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
The file test.seq contains all of the legitimate GCG sequence characters. The session below runs EReverse on gamma.seq to show how a sequence would look on the opposite strand:
% ereverse
EREVERSE uses nucleotide sequence data
EREVERSE of what sequence ? gamma.seq
Start (* 1 *) ?
End (* 11375 *) ?
Do you want to:
1) reverse only
2) complement only
3) reverse and complement
Please choose one (* 3 *):
What should I call the output file (* gamma.rev *)
%
Here is the output file gamma.rev:
REVERSE-COMPLEMENT of: gamma.seq check: 6474 from: 1 to: 11375
Human fetal beta globins G and A gamma
from Shen, Slightom and Smithies, Cell 26; 191-203.
Analyzed by Smithies et al. Cell 26; 345-353.
gamma.rev Length: 11375 March 19, 1996 14:12 Type: N Check: 3374 ..
1 GAATTCGGCA GTTACTGCAA CTTCCACTTT TCTCTCACCC GCTCCAGGAA
51 AAGTGACCTG CAGTCACTTT CCTGGTAGTA TTGATTCTTT CTTGTTTGTG
101 GCTGTTCCCC ATTTCCAATT GTTTTCCATG ATTATTGCTT CTACTGTGAT
/////////////////////////////////////
11251 CCTCAGGTGA TCTGCCTGCC TTGGCCTCCC AAAATTCTGG GATTACAGGC
11301 GTGAGCCACC ACTCCCAGCC TCTAAACAAG TGAATCTTAA TTGCTCCTCC
11351 TCAGACTAAG GAATATCTAG GATCC
A reversed sequence is renumbered so that its first base corresponds to the last base of the range you chose. Complementing only makes sense for nucleotide sequences. If you reverse without complementing, or complement without reversing, you risk ending up with the sequence in 3' to 5' orientation. All GCG programs and all databases assume that nucleotide sequences are in 5' to 3' order, so be careful. Peptide sequences are generally kept in amino-to-carboxyl orientation. Many legitimate sequence symbols are not IUB-supported nucleic acid symbols, so they have no sensible complement (see Appendix III).
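The three operations described above can be sketched in a few lines of Python. This is only an illustration, not EReverse's actual code: the function names are hypothetical, the complement table is the standard IUB/IUPAC one and is assumed (not verified) to agree with Appendix III, and passing unsupported symbols through unchanged is an assumption about how such symbols might be handled.

```python
# Standard IUB/IUPAC nucleotide complements (assumed to match Appendix III).
_PAIRS = {
    "A": "T", "C": "G", "G": "C", "T": "A", "U": "A",
    "M": "K", "K": "M", "R": "Y", "Y": "R",
    "W": "W", "S": "S",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "N": "N",
}
# Extend the table so lowercase symbols keep their case.
COMPLEMENT = dict(_PAIRS)
COMPLEMENT.update({k.lower(): v.lower() for k, v in _PAIRS.items()})


def select_range(seq, begin=1, end=None):
    """Return the 1-based, inclusive range begin..end of seq."""
    if end is None:
        end = len(seq)
    return seq[begin - 1:end]


def reverse(seq):
    """Reverse only: the last base of the range becomes base 1."""
    return seq[::-1]


def complement(seq):
    """Complement only. Symbols with no known complement are passed
    through unchanged (an assumption made for this sketch)."""
    return "".join(COMPLEMENT.get(ch, ch) for ch in seq)


def reverse_complement(seq, begin=1, end=None):
    """Reverse and complement: gives the opposite strand in 5' to 3' order."""
    return reverse(complement(select_range(seq, begin, end)))
```

For example, `reverse_complement("AAGC")` gives `"GCTT"`, while a palindromic restriction site such as `"GGATCC"` is returned unchanged.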
Embedded comments are lost.
Here is the file test.seq, which contains every legitimate GCG sequence character:
This sequence contains every symbol in the alphabet of
legitimate GCG sequence characters (Appendix III).
Test.Seq Length: 389 July 19, 1994 15:05 Type: N Check: 8468 ..
1
>starts with the codons from appendix iii>
GCTGCCGCAG CGGCXGATGA CAATAACRAY TGTTGCTGYG ATGACGAYGA
51 AGAGGARTTT TTCTTYGGTG GCGGAGGGGG XCATCACCAY ATTATCATAA
101 THAAAAAGAA RTTGTTACTT CTCCTACTGT TRCTXYTAYT GYTRYTXATG
151 AATAACAAYC CTCCCCCACC GCCXCAACAG CARCGTCGCC GACGGCGGAG
201 AAGGCGXAGR MGAMGGMGRM GXTCTTCCTC ATCGAGTAGC TCXAGYWSXA
251 CTACCACAAC GACXGTTGTC GTAGTGGTXT GGXXXTATTA CTAYGAAGAG
301 CAACAGSART AATAGTGATA RTRATRR
>continues with all
uppercase sequence characters>
ABC DEFGHIJKLM NOPQRSTUVW
351 XYZ.+@&*ab cdefghijkl mnopqrstuv wxyz*@&+.
COMMAND-LINE SUMMARY
All parameters for this program may be put on the command line.
Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes.
In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
Minimal Syntax: % ereverse [-INfile=]ggamma.seq -Default
Prompted Parameters:
-BEGin=1 -END=1700       range of interest
-REVerse                 reverse the strand
-COMPlement              complement the strand
[-OUTfile=]ggamma.rev    output file name
Local Data Files: None
Optional Parameters: None
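As an illustration of how the qualifiers in the summary fit together, the following Python sketch assembles an argument list for a non-interactive run. The helper name `build_ereverse_command` is hypothetical, and the sketch assumes that giving both -REVerse and -COMPlement corresponds to menu choice 3 (reverse and complement); the summary above does not state this explicitly. The function only builds the argument list and does not run the program.

```python
def build_ereverse_command(infile, outfile, begin=None, end=None,
                           reverse=True, complement=True, use_defaults=True):
    """Build an argument list for ereverse from the qualifiers listed in
    the command-line summary. Hypothetical helper, for illustration only."""
    args = ["ereverse", f"-INfile={infile}"]
    if begin is not None:
        args.append(f"-BEGin={begin}")      # start of the range of interest
    if end is not None:
        args.append(f"-END={end}")          # end of the range of interest
    if reverse:
        args.append("-REVerse")             # reverse the strand
    if complement:
        args.append("-COMPlement")          # complement the strand
    args.append(f"-OUTfile={outfile}")
    if use_defaults:
        args.append("-Default")             # as in the minimal syntax line
    return args


# Example: the range and file names shown in the summary.
command = build_ereverse_command("ggamma.seq", "ggamma.rev", begin=1, end=1700)
print(" ".join(command))
```

The resulting list can be handed to a process launcher on a system where EReverse is installed; the names of the input and output files here are simply those used in the summary.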
LOCAL DATA FILES
None.
OPTIONAL PARAMETERS
None.
Printed: April 22, 1996 15:53 (1162)