PepStats gives a short statistical summary of the composition of a protein sequence, together with its molecular weight and isoelectric point.
PepStats is a smaller and less powerful version of PeptideSort. Its main function is to provide details of protein properties that are only obtainable from PeptideSort by carefully specifying "no enzyme". It can also provide statistics for the longest open reading frame(s) from a file written by Extract.
This program was written by Rodrigo Lopez S. (E-mail: rodrigol@biotek.uio.no; Post: Biotechnology Centre of Oslo, PO Box 1125 Blindern, N-0317 Oslo 3, Norway).
All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).
Here is a session with PepStats.
    % pepstats

    PEPSTATS uses protein (with stop codons) sequence data

    PEPSTATS of what sequence ?  Sw:Gshr_Human

                    Start (* 1 *) ?
                    End (* 478 *) ?

    Length = 478
    Peptide Sw:Gshr_Human has 1 peptide(s) and the last stop is at position: 479

    What should I call the output file (* gshr_human.stats *) ?

    %
Here is the output file:
    PEPSTATS of: Gshr_Human  check: 4050  from: 1  to: 478

    ID   GSHR_HUMAN  STANDARD;  PRT;  478 AA.
    AC   P00390;
    DT   21-JUL-1986 (REL. 01, CREATED)
    DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
    DT   01-NOV-1990 (REL. 16, LAST ANNOTATION UPDATE) . . .

    Continuous From: 1 To: 478 Length: 478

    Summary for whole sequence:

    Molecular weight = 51569.02      Residues = 478
    Average Residue Weight = 107.885 Charge = 1
    Isoelectric point = 7.67

    Residue                   Number    Mole Percent

    A = Ala                       42           8.787
    B = Asx                        0           0.000
    C = Cys                       10           2.092
    D = Asp                       21           4.393
    E = Glu                       29           6.067
    F = Phe                       14           2.929
    G = Gly                       43           8.996
    H = His                       16           3.347
    I = Ile                       29           6.067
    K = Lys                       34           7.113
    L = Leu                       34           7.113
    M = Met                       15           3.138
    N = Asn                       17           3.556
    P = Pro                       24           5.021
    Q = Gln                       11           2.301
    R = Arg                       17           3.556
    S = Ser                       31           6.485
    T = Thr                       31           6.485
    V = Val                       44           9.205
    W = Trp                        3           0.628
    Y = Tyr                       13           2.720
    Z = Glx                        0           0.000

    Small (A+G)                   85          17.782
    Hydroxyl (S+T)                62          12.971
    Acidic (D+E)                  50          10.460
    Acid/Amide (D+E+N+Q)          78          16.318
    Basic (H+K+R)                 67          14.017
    Charged (D+E+H+K+R)          117          24.477
    Small hphob (I+L+M+V)        122          25.523
    Aromatic (F+W+Y)              30           6.276
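The percentage and group rows in this output can be checked from the residue counts alone. The Python sketch below is not part of PepStats. It re-derives a few of the printed figures from the counts in the output file above, assuming that Mole Percent is 100 x count / total residues and that each group row is the sum of its member residues. The observation that the reported Charge matches (K+R) - (D+E) fits this particular output, but it is an assumption about how the program computes the value. All function and variable names are illustrative.

```python
# Residue counts copied from the PepStats output for Gshr_Human.
counts = {
    "A": 42, "B": 0, "C": 10, "D": 21, "E": 29, "F": 14, "G": 43,
    "H": 16, "I": 29, "K": 34, "L": 34, "M": 15, "N": 17, "P": 24,
    "Q": 11, "R": 17, "S": 31, "T": 31, "V": 44, "W": 3, "Y": 13, "Z": 0,
}

# Group definitions exactly as labelled in the output file.
groups = {
    "Small": "AG",
    "Hydroxyl": "ST",
    "Acidic": "DE",
    "Acid/Amide": "DENQ",
    "Basic": "HKR",
    "Charged": "DEHKR",
    "Small hphob": "ILMV",
    "Aromatic": "FWY",
}

total = sum(counts.values())  # number of residues in the sequence


def mole_percent(n, total_residues):
    """Percentage of residues, rounded to three decimals like the output."""
    return round(100.0 * n / total_residues, 3)


def group_count(members):
    """Sum the counts of every residue code listed in `members`."""
    return sum(counts[code] for code in members)


# Molecular weight is taken from the output; it is not recomputed here
# because the per-residue weights live in aminoacid.dat.
molecular_weight = 51569.02
average_residue_weight = round(molecular_weight / total, 3)

# Assumption: simple net charge, counting K and R as +1 and D and E as -1.
net_charge = group_count("KR") - group_count("DE")

if __name__ == "__main__":
    print("Residues =", total)
    print("Average Residue Weight =", average_residue_weight)
    print("Charge =", net_charge)
    for name, members in groups.items():
        n = group_count(members)
        print(f"{name:12s} {n:4d} {mole_percent(n, total):7.3f}")
```

Comparing the printed rows with the output file is a quick consistency check when AminoAcid.Dat has been edited locally.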
PeptideSort shows the peptide fragments from a digest of an amino acid sequence. It sorts the peptides by weight, position, and HPLC retention at pH 2.1, and shows the composition of each peptide. It also prints a summary of the composition of the whole protein. PepStats is based on PeptideSort.
None known
The input file of PepStats is a protein sequence file.
All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
    Minimum syntax: % pepstats [-INfile=]Sw:Gshr_Human -Default

    Prompted parameters:

    -BEGin=1 -END=478            range of interest
    [-OUTfile=]gshr_human.stats  output file name

    Local Data Files:

    [-DATa1=]aminoacid.dat       contains amino acid data

    Optional parameters:

    -MINLen=0                    minimum peptide length
    -NONTERM                     first residue is not the N-terminus
    -NOCTERM                     last residue is not the C-terminus
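The capitalization rule for qualifier names can be illustrated with a small Python sketch. This is not the GCG command-line parser. It only models the stated convention that the capitalized letters are the minimum you must type, and it assumes that matching is case-insensitive and that any longer prefix of the full name is also accepted. The function names are hypothetical.

```python
def required_prefix(template):
    """Return the part of a qualifier that must be typed.

    For a template such as '-BEGin' this is '-BEG': the dash plus the
    leading run of capitalized letters.
    """
    body = template.lstrip("-")
    required = ""
    for ch in body:
        if not ch.isupper():
            break
        required += ch
    return "-" + required


def matches(template, typed):
    """True if `typed` is an acceptable abbreviation of `template`.

    Assumption: comparison ignores case, and `typed` must be a prefix of
    the full name that is at least as long as the required part.
    """
    full = template.lower()
    typed = typed.lower()
    return typed.startswith(required_prefix(template).lower()) and full.startswith(typed)


if __name__ == "__main__":
    for typed in ("-beg", "-begin", "-be", "-begx"):
        print(typed, matches("-BEGin", typed))
```

Under these assumptions, -beg and -begin both select -BEGin, while -be is too short and -begx is not a prefix of the name. A qualifier written entirely in capitals, such as -NONTERM, has to be typed in full.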
The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.
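The lookup order described above can be sketched in Python as follows. This illustrates the rule only and is not how the program is implemented. The function name, its arguments, and the directory paths in the example are hypothetical. The source does not say which case wins when a file is named on the command line and a file of the same name also exists in the working directory; the sketch assumes the command-line file takes precedence.

```python
from pathlib import Path


def resolve_data_file(default_name, public_dir, work_dir, cli_override=None):
    """Pick the data file a program would read.

    default_name -- standard file name, e.g. 'aminoacid.dat'
    public_dir   -- public data directory (hypothetical path)
    work_dir     -- current working directory
    cli_override -- file named with an expression like -DATa1=myfile.dat
    """
    # Assumption: a file named on the command line wins over everything.
    if cli_override is not None:
        return Path(cli_override)

    # A file with exactly the same name in the working directory comes next.
    local_copy = Path(work_dir) / default_name
    if local_copy.is_file():
        return local_copy

    # Otherwise fall back to the public data directory.
    return Path(public_dir) / default_name


if __name__ == "__main__":
    print(resolve_data_file("aminoacid.dat", "/gcg/data", "."))
    print(resolve_data_file("aminoacid.dat", "/gcg/data", ".", "myfile.dat"))
```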
File AminoAcid.Dat contains values for calculation of properties. You can Fetch this file and edit the values. The file is supplied by GCG for use by the program PeptideSort.
The parameters and switches listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.
-MINLen=0 specifies a minimum length for peptides to be considered, for example when using the output of EExtractPeptide as an input file.
-NONTERM specifies that the first residue in the input file is not the N-terminus of the protein.
-NOCTERM specifies that the last residue in the input file is not the C-terminus of the protein.
Printed: April 22, 1996 15:54 (1162)