ECorrespond looks for similar patterns of codon usage by comparing codon frequency tables.
The frequencies compared are the number of occurrences of the codon in question divided by the total number of codons specifying that amino acid or terminator in each table. The statistic, D squared, gets smaller as the patterns of codon usage become more similar (see Grantham, Nucl. Acids Res. 9(1); r43-r74 (1981)). ECorrespond requires codon frequency tables generated by CodonFrequency as the object of the comparison. If an amino acid is not used at all in one of the tables, its codons contribute nothing to the sum of squares. These ignored codons are counted and reported. You can file the results of a session with ECorrespond or display the results only on the screen. You may use ambiguous file names or indirect file specifications (files of file names) for the input file(s), and ECorrespond makes all of the implied comparisons.
This GCG program was modified by Jaakko Hattula (Tampere University of Technology, Finland) and Peter Rice (E-mail: pmr@sanger.ac.uk Post: Informatics Division, The Sanger Centre, Hinxton Hall, Cambridge, CB10 1RQ, UK).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
Here is a session using ECorrespond to find the correspondence among all of the files ending in .cod that are provided in the Wisconsin Package(TM):
% ecorrespond

 ECORRESPOND of what frequency file(s) ?  *.cod

 to what other frequency file(s) (* *.cod *) ?

 Do you want to file the results (* No *) ?  Y

 What should I call the output file (* drosophilahigh.corr *) ?

 /////////////////////////////////////////////

%
ECorrespond always writes output on your screen. You can also choose to file the results. Here is part of the output file from the example session:
 ECORRESPOND  December 12, 1995 16:21

 Between               and                   D-Squared          D   Not-Counted ..

 drosophila_high.cod   drosophila_high.cod    0.000000   0.000000             0
 drosophila_high.cod   ecohigh.cod            4.678955   2.163089             3
 drosophila_high.cod   ecolow.cod             4.938438   2.222260             3
 ecohigh.cod           drosophila_high.cod    4.678955   2.163089             3
 ecohigh.cod           ecohigh.cod            0.000000   0.000000             3
 ecohigh.cod           ecolow.cod             3.389803   1.841142             3
 ecolow.cod            drosophila_high.cod    4.938438   2.222260             3
 ecolow.cod            ecohigh.cod            3.389803   1.841142             3
 ecolow.cod            ecolow.cod             0.000000   0.000000             3
CodonFrequency generates codon frequency tables. CodonPreference finds regions of sequences that show a preference for a pattern of codon choices in a codon frequency table.
If you use ambiguous file names, all of the files in the set of files implied by your file name or file of file names must be real codon frequency tables like the ones written by ECodonFrequency. If either file specification does not contain any files, ECorrespond simply does nothing.
ECorrespond reads the normalized (/1000) data from the fourth column of the codon frequency table. It then totals these figures for each synonymous family. If the total for a family in either table is 0.0, then none of the codons from that family contribute anything to the value of D squared.
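$$\mathrm{Frequency}(\text{codon}) = \frac{\mathrm{Number}(\text{column 4})}{\mathrm{Total}(\text{family})}$$

$$D^2 = \sum_{\text{all 64 codons}} \left( \mathrm{Frequency}(\text{codon}, \text{table 1}) - \mathrm{Frequency}(\text{codon}, \text{table 2}) \right)^2$$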
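The calculation described above can be sketched in Python. This is an illustration only, not ECorrespond's own code: the function names `family_fractions` and `d_squared` are hypothetical, the synonymous families are assumed to follow the standard genetic code (with the three terminators treated as one family), and each table is assumed to be a dictionary mapping a codon to its per-thousand value. The sketch also returns the square root of D squared, which is consistent with the D column in the example output.

```python
from itertools import product
from math import sqrt

# Assumption: families follow the standard genetic code; "*" marks terminators.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TO_FAMILY = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def family_fractions(per_thousand):
    """Divide each codon's value by the total for its synonymous family.

    Codons in a family whose total is 0.0 get None instead of a fraction.
    """
    totals = {}
    for codon, family in CODON_TO_FAMILY.items():
        totals[family] = totals.get(family, 0.0) + per_thousand.get(codon, 0.0)
    fractions = {}
    for codon, family in CODON_TO_FAMILY.items():
        if totals[family] > 0.0:
            fractions[codon] = per_thousand.get(codon, 0.0) / totals[family]
        else:
            fractions[codon] = None
    return fractions


def d_squared(table1, table2):
    """Return (D squared, D, number of codons not counted) for two tables."""
    f1 = family_fractions(table1)
    f2 = family_fractions(table2)
    total = 0.0
    not_counted = 0
    for codon in CODON_TO_FAMILY:
        if f1[codon] is None or f2[codon] is None:
            # Family unused in at least one table: contributes nothing.
            not_counted += 1
            continue
        total += (f1[codon] - f2[codon]) ** 2
    return total, sqrt(total), not_counted
```

Comparing a table with itself gives a D squared of zero, and a table in which no terminators were recorded leaves the three terminator codons uncounted.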
If you plan to compare many codon frequency tables, naming your tables with the extension .cod simplifies your task. This allows you to specify the files ambiguously with *.cod.
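To illustrate what "all of the implied comparisons" means for two ambiguous file specifications, here is a minimal Python sketch. The function name `implied_comparisons` is hypothetical, and the sketch assumes that every file matching the first specification is paired with every file matching the second, as the nine rows in the example output for three tables suggest.

```python
from glob import glob
from itertools import product


def implied_comparisons(spec1, spec2):
    """Pair every file matching spec1 with every file matching spec2.

    Returns an empty list when either specification matches no files.
    """
    files1 = sorted(glob(spec1))
    files2 = sorted(glob(spec2))
    return list(product(files1, files2))
```

For example, `implied_comparisons("*.cod", "*.cod")` in a directory holding three .cod files would yield nine ordered pairs, including each table paired with itself.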
The codon frequency tables that ECorrespond compares should be in the same format as the tables from the CodonFrequency program. ECorrespond only reads the fourth column of information for calculating frequencies.
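As a rough illustration of reading the fourth column, the following Python sketch parses the text of a codon frequency table. It assumes the usual GCG layout, which this document does not spell out: free-form heading text ending in a line that finishes with "..", followed by data lines whose whitespace-separated columns are amino acid, codon, number, /1000 and fraction. The function name `read_codon_table` is hypothetical.

```python
def read_codon_table(text):
    """Return {codon: per-thousand value} from the text of a codon table.

    Assumes data lines start after the first line ending in ".." and that
    the codon is in column 2 and the /1000 value in column 4.
    """
    per_thousand = {}
    in_data = False
    for line in text.splitlines():
        if not in_data:
            # Everything up to and including the ".." line is heading text.
            if line.rstrip().endswith(".."):
                in_data = True
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or incomplete line
        codon = fields[1].upper().replace("U", "T")
        per_thousand[codon] = float(fields[3])
    return per_thousand
```

The resulting dictionary holds only the fourth-column values; the other columns are ignored, as they are by ECorrespond.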
All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
Minimal Syntax: % ecorrespond [-INfile1=]*.cod -Default

Prompted Parameters:

[-INfile2=]*.cod       tables to compare to
-FILE                  should output file be used?
[-OUTfile=]xxx.corr    output file name

Optional Parameters:

-CONtinue1=1           continue after each set
None.
The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the User's Guide.
-CONtinue1=1 makes this program loop back to the beginning and prompt for more input files after the comparison is done.
Grantham R., Gautier C., Gouy M., Jacobzone M. and Mercier R. (1981). "Codon catalog usage is a genome strategy modulated for gene expressivity." Nucl. Acids Res. 9, r43-r74.
Printed: April 22, 1996 15:52 (1162)