EReverse reverses and/or complements a sequence. EReverse is a version of GCG's Reverse with command line control.
EReverse complements and/or reverses the symbols in a sequence. The complements of all of the supported IUB nucleic acid symbols are listed in Appendix III. The output is written into a new sequence file.
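The two operations described above can be sketched in a few lines of Python. This is only an illustration, not EReverse's own code: the complement table is the standard IUPAC/IUB one rather than a copy of Appendix III, the function names are made up for the example, and passing unsupported symbols through unchanged is an assumption.

```python
# Standard IUPAC/IUB nucleotide complements (assumed here; see Appendix III
# for the table that the program itself uses).
_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "U": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "N": "N",
}

# Translation table covering upper- and lower-case symbols.
_SOURCE = "".join(_COMPLEMENT.keys())
_TARGET = "".join(_COMPLEMENT.values())
_TABLE = str.maketrans(_SOURCE + _SOURCE.lower(), _TARGET + _TARGET.lower())


def complement(seq: str) -> str:
    """Complement each symbol; symbols outside the table are left unchanged
    (an assumption made for this sketch)."""
    return seq.translate(_TABLE)


def reverse(seq: str) -> str:
    """Reverse the order of the symbols."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Give the opposite strand, written in the same 5' to 3' direction."""
    return reverse(complement(seq))
```

For example, `reverse_complement("GAATTCGGCA")` gives `"TGCCGAATTC"`, while a palindromic site such as `"GGATCC"` is returned unchanged.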
This GCG program was modified by Jaakko Hattula (Tampere University of Technology, Finland) and Peter Rice (E-mail: pmr@sanger.ac.uk; Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
The file test.seq contains all of the legitimate GCG sequence characters. Here is a session using EReverse to show how they would look on the opposite strand:
  % ereverse

   EREVERSE uses nucleotide sequence data

   EREVERSE of what sequence ?  gamma.seq

                    Start (* 1 *) ?
                  End (* 11375 *) ?

   Do you want to:

     1) reverse only
     2) complement only
     3) reverse and complement

   Please choose one (* 3 *):

   What should I call the output file (* gamma.rev *)

  %
Here is the output file gamma.rev:
  REVERSE-COMPLEMENT of: gamma.seq check: 6474 from: 1 to: 11375

  Human fetal beta globins G and A gamma
  from Shen, Slightom and Smithies, Cell 26; 191-203.
  Analyzed by Smithies et al. Cell 26; 345-353.

  gamma.rev Length: 11375 March 19, 1996 14:12 Type: N Check: 3374 ..

       1  GAATTCGGCA GTTACTGCAA CTTCCACTTT TCTCTCACCC GCTCCAGGAA
      51  AAGTGACCTG CAGTCACTTT CCTGGTAGTA TTGATTCTTT CTTGTTTGTG
     101  GCTGTTCCCC ATTTCCAATT GTTTTCCATG ATTATTGCTT CTACTGTGAT

  /////////////////////////////////////

   11251  CCTCAGGTGA TCTGCCTGCC TTGGCCTCCC AAAATTCTGG GATTACAGGC
   11301  GTGAGCCACC ACTCCCAGCC TCTAAACAAG TGAATCTTAA TTGCTCCTCC
   11351  TCAGACTAAG GAATATCTAG GATCC
A reversed sequence is renumbered so that its first base corresponds to the last base of the range you chose.
Complementing only makes sense for nucleotide sequences. If you reverse without complementing, or complement without reversing, you risk ending up with the sequence in 3' to 5' orientation. All GCG programs and all databases assume that nucleotide sequences are in 5' to 3' order, so be careful. Peptide sequences are generally kept in amino-to-carboxyl orientation.
Many legitimate sequence symbols are not IUB-supported nucleic acid symbols, so they have no sensible complement (see Appendix III).
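The range selection, the three menu choices, and the renumbering rule can be sketched as follows. This is a hedged illustration only: `transform_range` and `original_position` are hypothetical names, the complement table is cut down to A, C, G and T for brevity, and the 1-based inclusive range is assumed to match the Start and End prompts.

```python
# Simplified complement table (A, C, G, T only) to keep the sketch short.
_PAIRS = str.maketrans("ACGTacgt", "TGCAtgca")


def transform_range(seq, begin, end, reverse=True, complement=True):
    """Apply reverse and/or complement to seq[begin..end] (1-based, inclusive)."""
    if not 1 <= begin <= end <= len(seq):
        raise ValueError("range must satisfy 1 <= begin <= end <= len(seq)")
    piece = seq[begin - 1:end]
    if complement:
        piece = piece.translate(_PAIRS)
    if reverse:
        piece = piece[::-1]
    return piece


def original_position(new_pos, begin, end, reversed_=True):
    """Map a 1-based position in the output back to the input sequence.

    After reversal, output base 1 is the last base of the chosen range.
    """
    length = end - begin + 1
    if not 1 <= new_pos <= length:
        raise ValueError("position lies outside the output sequence")
    return end - new_pos + 1 if reversed_ else begin + new_pos - 1


seq = "ATGCCGTA"
both = transform_range(seq, 2, 5)                          # choice 3
reverse_only = transform_range(seq, 2, 5, complement=False)  # choice 1
complement_only = transform_range(seq, 2, 5, reverse=False)  # choice 2
```

Only `both` is the opposite strand written 5' to 3'; `reverse_only` and `complement_only` are the two results that end up in 3' to 5' orientation when the input was 5' to 3'.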
Embedded comments are lost.
Here is the input file test.seq used in the example above:
  This sequence contains every symbol in the alphabet of legitimate
  GCG sequence characters (Appendix III).

  Test.Seq Length: 389 July 19, 1994 15:05 Type: N Check: 8468 ..

       1  >starts with the codons from appendix iii>
          GCTGCCGCAG CGGCXGATGA CAATAACRAY TGTTGCTGYG ATGACGAYGA
      51  AGAGGARTTT TTCTTYGGTG GCGGAGGGGG XCATCACCAY ATTATCATAA
     101  THAAAAAGAA RTTGTTACTT CTCCTACTGT TRCTXYTAYT GYTRYTXATG
     151  AATAACAAYC CTCCCCCACC GCCXCAACAG CARCGTCGCC GACGGCGGAG
     201  AAGGCGXAGR MGAMGGMGRM GXTCTTCCTC ATCGAGTAGC TCXAGYWSXA
     251  CTACCACAAC GACXGTTGTC GTAGTGGTXT GGXXXTATTA CTAYGAAGAG
     301  CAACAGSART AATAGTGATA RTRATRR
          >continues with all uppercase sequence characters>
          ABC DEFGHIJKLM NOPQRSTUVW
     351  XYZ.+@&*ab cdefghijkl mnopqrstuv wxyz*@&+.
COMMAND-LINE SUMMARY
All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
  Minimal Syntax: % ereverse [-INfile=]ggamma.seq -Default

  Prompted Parameters:

  -BEGin=1 -END=1700       range of interest
  -REVerse                 reverse the strand
  -COMPlement              complement the strand
  [-OUTfile=]ggamma.rev    output file name

  Local Data Files: None

  Optional Parameters: None
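The capitalization convention in the summary above can be illustrated with a small matcher. This is a sketch under stated assumptions, not the GCG command-line parser: the function names are hypothetical, and the rule that extra typed letters must continue the full qualifier name is an assumption. The documentation states only that the capitalized letters are the ones you must type.

```python
# Qualifier names exactly as they are written in this document.
QUALIFIERS = [
    "-INfile", "-BEGin", "-END", "-REVerse",
    "-COMPlement", "-OUTfile", "-Default", "-CHEck",
]


def required_prefix(name):
    """Return the leading run of capital letters, e.g. 'BEG' for '-BEGin'."""
    body = name.lstrip("-")
    count = 0
    while count < len(body) and body[count].isupper():
        count += 1
    return body[:count]


def match_qualifier(typed, names=QUALIFIERS):
    """Return the qualifier that `typed` abbreviates, or None if none fits.

    Any '=value' part is ignored and matching is case-insensitive.
    """
    word = typed.lstrip("-").split("=", 1)[0].upper()
    for name in names:
        full = name.lstrip("-").upper()
        # Must contain the capitalized letters and continue the full name
        # (the second condition is an assumption of this sketch).
        if word.startswith(required_prefix(name)) and full.startswith(word):
            return name
    return None
```

Under these assumptions `match_qualifier("-beg=1")` returns `"-BEGin"` and `match_qualifier("-comp")` returns `"-COMPlement"`, while `match_qualifier("-co")` returns `None` because it is shorter than the capitalized part of any qualifier.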
LOCAL DATA FILES
None.
OPTIONAL PARAMETERS
None.
Printed: April 22, 1996 15:53 (1162)