ETranslate is a version of GCG's old Translate program with command line control added.
ETranslate creates a peptide sequence by translating nucleic acid sequences that you specify. Using ETranslate means assembling a nucleic acid sequence suitable for translation and then letting ETranslate translate that assembly.
If you are translating from a multi-exon gene, you should assemble the nucleic acid fragments together before translating them. Every time you define a sequence fragment, ETranslate asks if you want to add another exon or translate the assembly you have. ETranslate keeps concatenating the fragments until you let it translate the assembly.
If you want to translate several different genes, ETranslate lets you make additional assemblies. Each time another nucleic acid assembly is translated, the resulting peptide sequence is added to the end of the output sequence.
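The assemble-then-translate cycle described above can be sketched in a few lines of Python. This is only an illustration, not ETranslate's source: the names `build_codon_table`, `translate`, and `translate_genes` are hypothetical, the standard genetic code is hard-coded instead of being read from translate.txt, and trailing bases that do not fill a codon are simply dropped.

```python
from itertools import product


def build_codon_table():
    """Standard genetic code, with codons enumerated in TCAG order."""
    bases = "TCAG"
    amino_acids = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                   "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
    codons = ("".join(triplet) for triplet in product(bases, repeat=3))
    return dict(zip(codons, amino_acids))


CODON_TABLE = build_codon_table()


def translate(assembly):
    """Translate a nucleotide assembly codon by codon ('*' marks a stop)."""
    assembly = assembly.upper()
    usable = len(assembly) - len(assembly) % 3  # ignore an incomplete last codon
    return "".join(CODON_TABLE.get(assembly[i:i + 3], "X")
                   for i in range(0, usable, 3))


def translate_genes(genes):
    """Each gene is a list of exon fragments.

    Fragments are concatenated into one assembly per gene, and each
    translated peptide is appended to the end of the output.
    """
    output = ""
    for exons in genes:
        assembly = "".join(exons)      # concatenate the exons first
        output += translate(assembly)  # then translate the whole assembly
    return output


gene1 = ["ATGGC", "TTAA"]   # assembles to ATG GCT TAA
gene2 = ["ATGAAA", "TGA"]   # assembles to ATG AAA TGA
print(translate_genes([gene1, gene2]))  # MA*MK*
```

Note that the first exon of `gene1` is five bases long: its last two bases only form a codon once the second exon has been joined on, which is why the fragments are concatenated before translation.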
ETranslate supports the translation of fragments from circular molecules by letting you define a range from the input sequence that crosses the origin of the molecule. The terminal bell rings when a circular range is chosen.
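A range that crosses the origin can be pictured as the tail of the sequence followed by its head. The sketch below assumes that such a range is written with a begin position greater than its end position; this section does not say how ETranslate itself recognizes a circular range, and the function name `extract_range` is hypothetical.

```python
def extract_range(sequence, begin, end):
    """Return bases begin..end (1-based, inclusive) from sequence.

    Assumption: a range whose begin is greater than its end is treated
    as crossing the origin of a circular molecule.
    """
    if not (1 <= begin <= len(sequence) and 1 <= end <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    if begin <= end:
        return sequence[begin - 1:end]                 # ordinary linear range
    return sequence[begin - 1:] + sequence[:end]       # wraps past the origin


plasmid = "ACGTACGTAA"
print(extract_range(plasmid, 3, 6))   # GTAC
print(extract_range(plasmid, 9, 2))   # AAAC
```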
After ETranslate translates your gene, you may specify another gene from the current sequence file or get another sequence file.
ETranslate supports the IUB-IUPAC character set for the representation of nucleotide ambiguity. See Appendix III for a list of the IUB codes and their meanings.
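One way to picture translation with ambiguity codes is to expand each ambiguous codon into every codon it could stand for and keep the amino acid only when all of them agree. The Python sketch below does this with the standard genetic code. It is an illustration under assumptions: this section does not describe how ETranslate reports a codon whose ambiguity changes the amino acid, so the "X" returned here is a placeholder choice, and the name `translate_codon` is hypothetical.

```python
from itertools import product

# IUB-IUPAC nucleotide codes and the bases each one stands for.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

# Standard genetic code, codons enumerated in TCAG order.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(("".join(t) for t in product("TCAG", repeat=3)),
                       AMINO_ACIDS))


def translate_codon(codon):
    """Translate one codon that may contain IUB ambiguity codes.

    Returns the amino acid if every possible expansion agrees,
    otherwise 'X' (an assumed placeholder for "cannot be decided").
    """
    if len(codon) != 3:
        raise ValueError("a codon has exactly three bases")
    try:
        choices = [IUB_CODES[base] for base in codon.upper()]
    except KeyError:
        return "X"  # symbol outside the IUB set
    residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
    return residues.pop() if len(residues) == 1 else "X"


for codon in ("GCN", "AAR", "ATN", "TGR"):
    print(codon, translate_codon(codon))
# GCN A   (all four GCx codons encode alanine)
# AAR K   (AAA and AAG both encode lysine)
# ATN X   (ATG is methionine, the others isoleucine)
# TGR X   (TGA is a stop, TGG is tryptophan)
```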
This GCG program was modified by Jaakko Hattula (Tampere University of Technology, Finland) and Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
Here is a session using ETranslate to translate the G-gamma gene in gamma.seq into the peptide sequence for the human fetal beta globin G gamma:
% etranslate

 ETRANSLATE uses nucleotide sequence data

 ETRANSLATE of what sequence ?  gamma.seq

 What should I call the output file (* gamma.pep *) ?

                 Start (* 1 *) ?  2179
                 End (* 11375 *) ?  2270
                 Reverse (* No *) ?

 Range begins ATGGG and ends GGAAG. Is this correct (* Yes *) ?

 That is done, now would you like to:

    A) Add another exon from this sequence
    B) Add another exon from a new sequence
    C) Translate and then add more genes from this sequence
    D) Translate and then add more genes from a new sequence
    W) Translate and write everything into a file

 Please choose one (* W *):  a

                 Start (* 1 *) ?  2393
                 End (* 11375 *) ?  2615
                 Reverse (* No *) ?

 Range begins GCTCC and ends TCAAG. Is this correct (* Yes *) ?

 That is done, now would you like to:

    A) Add another exon from this sequence
    B) Add another exon from a new sequence
    C) Translate and then add more genes from this sequence
    D) Translate and then add more genes from a new sequence
    W) Translate and write everything into a file

 Please choose one (* W *):  a

                 Start (* 1 *) ?  3502
                 End (* 11375 *) ?  3630
                 Reverse (* No *) ?

 Range begins CTCCT and ends ACTGA. Is this correct (* Yes *) ?

 That is done, now would you like to:

    A) Add another exon from this sequence
    B) Add another exon from a new sequence
    C) Translate and then add more genes from this sequence
    D) Translate and then add more genes from a new sequence
    W) Translate and write everything into a file

 Please choose one (* W *):

%
Here is the output file ggamma.pep:
ETRANSLATE of: gamma.seq check: 6474 from: 2179 to: 2270
ETRANSLATE of: gamma.seq check: 6474 from: 2393 to: 2615
ETRANSLATE of: gamma.seq check: 6474 from: 3502 to: 3630
generated symbols 1 to: 148.

gamma.seq  Length: 148  September 26, 1995 17:28  Type: P  Check: 6924  ..

       1  MGHFTEEDKA TITSLWGKVN VEDAGGETLG RLLVVYPWTQ RFFDSFGNLS
      51  SASAIMGNPK VKAHGKKVLT SLGDAIKHLD DLKGTFAQLS ELHCDKLHVD
     101  PENFKLLGNV LVTVLAIHFG KEFTPEVQAS WQKMVTGVAS ALSSRYH*
EExtractPeptide is a version of ExtractPeptide with command line control. ExtractPeptide writes a peptide sequence from one or more of the translation frames displayed in the output from Map. Translate supersedes ExtractPeptide for most applications. AllTrans translates a set of aligned nucleotide sequences into protein. MyTrans is a simple EGCG application that translates part of a nucleotide sequence into protein.
The translation from the output of Map can be filed by using the ExtractPeptide program to extract the translation frames. The PepData program translates sequences in all six frames.
Unknown.
ETranslate allows you to translate sequences where the reading frame is interrupted. Such interruptions commonly occur across intervening sequences, as in the example above, where a single codon is divided by the first intervening sequence. To accommodate frame interruption, ETranslate allows you to specify ranges (exons) whose lengths are not exact multiples of three. ETranslate concatenates the nucleotide ranges (exons) that you define and translates the concatenated nucleotide sequence assembly (gene) only at the moment you choose a menu item that starts with the word Translate.
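The arithmetic behind the example session can be checked directly from the three ranges it uses. The short Python sketch below (the function name `frame_report` is hypothetical) computes each exon's length and how many bases of an unfinished codon are carried into the next exon.

```python
def frame_report(ranges):
    """For each (begin, end) range, return (begin, end, length, carried).

    'carried' is the number of bases left over from an incomplete codon
    after this range has been appended to the assembly.
    """
    report = []
    carried = 0
    for begin, end in ranges:
        length = end - begin + 1            # ranges are inclusive
        carried = (carried + length) % 3
        report.append((begin, end, length, carried))
    return report


# The three exon ranges from the example session.
exons = [(2179, 2270), (2393, 2615), (3502, 3630)]

for begin, end, length, carried in frame_report(exons):
    print(f"{begin}-{end}: {length} bases, {carried} carried over")

total = sum(length for _, _, length, _ in frame_report(exons))
print(total, "bases =", total // 3, "codons, remainder", total % 3)
# 2179-2270: 92 bases, 2 carried over
# 2393-2615: 223 bases, 0 carried over
# 3502-3630: 129 bases, 0 carried over
# 444 bases = 148 codons, remainder 0
```

The first exon ends two bases into a codon, and the first base of the second exon completes it. The 444 assembled bases give 148 codons, which matches the Length: 148 reported in the output file (147 residues plus the terminating *).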
If you continue after translation, you are in effect building a new assembly (gene) and concatenating the peptide sequence from the new gene onto the peptide sequence you have already created.
All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
Minimal Syntax: % etranslate [-INfile=]gamma.seq -Default

Prompted Parameters:

[-OUTfile=]gamma.pep         output file name
-MENU1=w                     menu option

Local Data Files:

-TRANSlate=translate.txt     contains the genetic code

Optional Parameters:

-BEGin1=1 -END1=100          limits for each range
-REVerse1                    direction for each range
[-INfile2=]gamma.seq         second input file
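The capitalization rule for qualifier names can be illustrated with a small matcher. The Python sketch below is not part of GCG: the function names are hypothetical, and it assumes (beyond what this section states) that case is ignored, that any longer prefix of the full name is also accepted, and that a trailing range number must match exactly. Values after an equals sign are left out for simplicity.

```python
import re


def split_qualifier(text):
    """Split '-BEGin2' into its letters ('BEGin') and its index ('2')."""
    match = re.fullmatch(r"-?([A-Za-z]+)(\d*)", text)
    if match is None:
        raise ValueError(f"not a qualifier name: {text!r}")
    return match.group(1), match.group(2)


def matches(typed, qualifier):
    """True if 'typed' is an acceptable abbreviation of 'qualifier'.

    The capitalized letters of the qualifier are the required minimum;
    anything typed beyond them must follow the full name.
    """
    typed_stem, typed_index = split_qualifier(typed)
    full_stem, full_index = split_qualifier(qualifier)
    required = "".join(ch for ch in full_stem if ch.isupper()).lower()
    typed_stem = typed_stem.lower()
    return (typed_index == full_index
            and typed_stem.startswith(required)
            and full_stem.lower().startswith(typed_stem))


print(matches("-out", "-OUTfile"))     # True:  the capitalized letters are present
print(matches("-ou", "-OUTfile"))      # False: too short
print(matches("-beg2", "-BEGin2"))     # True:  same range number
print(matches("-beg1", "-BEGin2"))     # False: different range number
```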
The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.
The translation of codons to amino acids, the identification of potential start codons and stop codons, and the mappings of one-letter to three-letter amino acid codes are all defined in a translation table in the file translate.txt. If the standard genetic code does not apply to your sequence, you can provide a modified version of this file in your working directory or name an alternative file on the command line with an expression like -TRANSlate=mycode.txt. Translation tables are discussed in more detail in the Data Files manual.
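The lookup order for a data file such as translate.txt can be sketched as follows. This is an illustration only: the function name `resolve_data_file` and its parameters are hypothetical, the location of the public data directory is whatever your installation uses, and giving the command-line file priority over a copy in the working directory is an assumption that this section does not spell out.

```python
from pathlib import Path


def resolve_data_file(name, public_dir, working_dir=".", override=None):
    """Pick the data file a program would read.

    Order assumed here:
      1. a file named on the command line (e.g. -TRANSlate=mycode.txt),
      2. a file with exactly the same name in the working directory,
      3. the copy in the public data directory.
    """
    if override is not None:
        return Path(override)
    local = Path(working_dir) / name
    if local.is_file():
        return local
    return Path(public_dir) / name


# Example call; "/path/to/public/data" is a placeholder, not a real GCG path.
print(resolve_data_file("translate.txt", "/path/to/public/data",
                        override="mycode.txt"))  # mycode.txt
```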
The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
-BEGin1=1 specifies the start of a fragment; it defaults to the start of the sequence. -BEGin2=1 refers to the second input range, and so on.
-END1=100 specifies the end of a fragment; it defaults to the end of the sequence. -END2=1201 refers to the second input range, and so on.
-REVerse1 specifies the direction of assembly; it defaults to forward. -REVerse2 refers to the second input range, and so on.
-OUTfile=gamma.pep names the output file for the assembly.
-MENU1=w gives the answer to the menu prompt.
-INfile2=gamma.seq names a second input sequence for the assembly.
Printed: April 22, 1996 15:53 (1162)