Palindrome

PALINDROME


FUNCTION

Palindrome searches for perfect inverted repeats in a nucleic acid sequence.


DESCRIPTION

Palindrome searches for inverted repeats in a DNA or RNA sequence by using a running window that is fixed at the 3' end of the sequence and shifts towards the 5' end one base at a time.


AUTHOR

This program was written by Rodrigo Lopez S. (E-mail: rodrigol@biotek.uio.no; Post: Biotechnology Centre of Oslo, PO Box 1125 Blindern, N-0317 Oslo 3, Norway).

All EGCG programs are supported by the EGCG Support Team, who can be contacted by E-mail (egcg@embnet.org).


EXAMPLE

Here is a session with Palindrome that was used to find inverted repeats of minimum length 5 in the human low density lipoprotein receptor gene (exon 2) between positions 1 and 100.

  
  
  % palindrome
  
  PALINDROME searches for perfect inverted repeats in a nucleic acid
  sequence.
  
   PALINDROME uses nucleotide sequence data
  
   PALINDROME of what sequence ?  GenEMBL:hsldlr02
  
              Start (* 1 *) ?
            End (* 144 *) ? 100
  
    What minimum palindrome length (* 4 *)? 5
  
    What should I call the output file (* hsldlr02.pal *) ?
  
  %
  


OUTPUT

Here is some of the output file:

  
  
  Palindrome of: em_new:hsldlr02 check: 8167  from: 1  to 100
  
  ID   HSLDLR02   standard; DNA; PRI; 144 BP.
  AC   L00336; K02573;
  DT   23-APR-1990 (Rel. 23, Created)
  DT   12-DEC-1994 (Rel. 42, Last updated, Version 9)
  DE   Human low density lipoprotein receptor gene, exon 2.
  KW   low density lipoprotein receptor-1; repeat region. . . .
  
  Minimum 5 base long palindromes found:
  
  
 1 TTTCC     5
   |||||
62 AAAGG    58
  
24 AGATG    28
   |||||
69 TCTAC    65
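
The pairings in this output can be checked by hand or with a few lines of code. The following is a minimal Python sketch, not part of Palindrome itself; it assumes the lower strand of each pair is printed in descending position order (62 down to 58), so that each base sits directly beneath its partner. The function names are hypothetical.

```python
# Watson-Crick complements for uppercase DNA (assumption: no ambiguity codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def pairs_as_displayed(top: str, bottom: str) -> bool:
    """True if every base of `top` is complementary to the base printed beneath it."""
    return top.translate(COMPLEMENT) == bottom


# The two pairs shown in the output file above.
print(pairs_as_displayed("TTTCC", "AAAGG"))  # True
print(pairs_as_displayed("AGATG", "TCTAC"))  # True

# Read 5' to 3', positions 58..62 must therefore spell the reverse complement.
print(reverse_complement("TTTCC"))  # GGAAA
```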
  


RELATED PROGRAMS

Repeats is a GCG program that finds direct repeats in nucleic acid sequences.


RESTRICTIONS

Palindrome does not allow mismatches in its present form.


ALGORITHM

Palindrome uses a shifting, ever-decreasing window within which the inverted repeat search takes place. The sequence is anchored only at the 3' end. From the 5' end, the window moves by a shift of one.
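
To illustrate what a perfect inverted repeat search reports, here is a minimal Python sketch. It is not the program's source and does not reproduce its window scheme; it is a plain brute-force scan under stated assumptions: uppercase DNA only, no mismatches, arms that do not overlap, and only maximal repeats reported. The function name and the tuple layout (start and end of the first arm, then start and end of the second arm in descending order, all 1-based) are hypothetical choices that mirror the output file layout.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def find_inverted_repeats(seq, min_len=4):
    """Return maximal perfect inverted repeats as 1-based position tuples.

    Each hit is (arm1_start, arm1_end, arm2_start, arm2_end), where the
    second arm is given in descending order, as in the output file.
    """
    seq = seq.upper()
    n = len(seq)

    def pairs(i, j):
        # True when the bases at 0-based positions i and j are complementary.
        return COMPLEMENT.get(seq[i]) == seq[j]

    hits = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip pairs that could be extended outward: they belong to a
            # longer repeat that is reported from its outermost pair.
            if i > 0 and j < n - 1 and pairs(i - 1, j + 1):
                continue
            # Extend inward while the bases pair and the arms stay apart.
            k = 0
            while i + k < j - k and pairs(i + k, j - k):
                k += 1
            if k >= min_len:
                hits.append((i + 1, i + k, j + 1, j - k + 2))
    return hits


print(find_inverted_repeats("GAATTC", min_len=3))         # [(1, 3, 6, 4)]
print(find_inverted_repeats("TTTCCCCCGGAAA", min_len=5))  # [(1, 5, 13, 9)]
```

This scan is cubic in the worst case, which is acceptable for short ranges such as the 100 bases in the example session but would be slow on long sequences.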


CONSIDERATIONS

None


SUGGESTIONS

Careful scrutiny of the Palindrome output file may reveal where mismatches occur, especially when the palindromes are very close together. To detect these more easily, the user can set the minimum palindrome length to a small value, say 3. Note that the output file can then be quite long!


INPUT FILE

The input file of Palindrome is a nucleic acid sequence in GCG format.


COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the option -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the qualifier names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose qualifiers or parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Basic Concepts: Using Programs in the GCG User's Guide.

  
  
  Minimum Syntax: % palindrome [-INfile=]em_pr:hsldlr02 -Default
  
  Prompted Parameters:
  
  -BEGin=1 -END=3778    the range of interest
  -PALLength=4          minimum length of palindrome to search
  -OUTfile=hsldlr02.pal the output file name
  
  Local Data Files:
  
  (None)
  
  Optional Parameters:
  
  (None)
  


LOCAL DATA FILES

None.


OPTIONAL PARAMETERS

None.

Printed: April 22, 1996 15:54 (1162)