Table of Contents


1. Elementary Requirements and Usage of Standard Packages

1.1 Needed Equipment

1.1.1. Desktop
1.1.2. On-Site or Remote Central Computing Facilities
1.1.3. Prices

1.2 In order to connect to a remote WWW server...

1.2.1. Prerequisites
1.2.2. Needed Applications
1.2.3. Getting Launched
1.2.4. The Home Page
1.2.5. Working Effectively
1.2.6. Performance Issues
1.2.7. Panic recovery
1.2.8. Sophisticated use of the Web
1.2.9. WWW security
1.2.10. Difference from Interactive Sessions

1.3 In order to Connect for an Interactive session remotely...

1.3.1. Cabling
1.3.2. Desktop Equipment - Workstation Type
1.3.3. Hardcopy Devices
1.3.4. Local Site Information for Access

1.4 Once You Have Made the Connection for an Interactive Session...

1.4.1. User Name
1.4.2. Password

1.5 If You Successfully Logged In

1.6 If You Need to Change Your Password

1.7 Disconnect from the Computer

1.7.1. Regular operation
1.7.2. Emergency Break: Serial Line
1.7.3. Emergency Break: 'rlogin'
1.7.4. Emergency Break: 'telnet'
1.7.5. Emergency Break: 'set host'
1.7.6. Emergency Break: PC / Macintosh

1.8 Problem of No Response: No Window or Dark Screen

1.8.1. Technical Problems
1.8.2. Configuration Problems

1.9 Problem of No or Wrong Response (Setup Worked Before)

1.9.1. Problems on Serial Lines (Modems, etc.)
1.9.2. Problems on Ethernet
1.9.3. Communication Problems on Various Configurations

1.10 No Connection to Remote Host

1.10.1. Communication Problems on Various Configurations
1.10.2. Communication Problems on TCP/IP
1.10.3. Connection Problems on X-Windows

1.11 No Successful Login

1.11.1. Local Problems on LAT
1.11.2. Local Problems on PCs
1.11.3. Remote Problems on VMS
1.11.4. Remote Problems on UNIX
1.11.5. Remote Account Problems

1.12 Problems During Session: No or Strange Events

1.12.1. Unknown Terminal
1.12.2. Screen shows lots of unreadable characters
1.12.3. Screen weird after graphics
1.12.4. Screen Accidentally Locked
1.12.5. Screen Occupied by another Program - no Reaction
1.12.6. Screen Occupied by another Program - Takes all Input
1.12.7. Keys Give Wrong Response
1.12.8. National Character Set
1.12.9. File Sharing problems
1.12.10. Need to Delete 'lock'-File

1.13 Problems Caused by User

1.13.1. Quota Exceeded
1.13.2. Need to Stop a Previous Session
1.13.3. Need to Stop a Print Session
1.13.4. File accidentally deleted

2. Getting Started

2.1 Standard Environment

2.1.1. Material and Methods
2.1.2. Setup of the Text Screen
2.1.3. X-Windows principle
2.1.4. Usage of X Windows

2.2 Standard Programs

2.2.1. Input
2.2.2. Required and Optional Parameters
2.2.3. Output

2.3 GCG environment

2.4 SeqLab (formerly Wisconsin Package Interface)

2.4.1. Purpose of SeqLab
2.4.2. The SeqLab Multiple Sequence Editor (SeqLab "edit mode")
2.4.3. SeqLab Details: The Concept of "Lists"
2.4.4. More SeqLab Details: The Concept of an "Output Manager"
2.4.5. Even more SeqLab Details: The Concept of a "Job Manager"
2.4.6. Interaction of SeqLab Windows
2.4.7. Starting SeqLab
2.4.8. SeqLab and the User

2.5 Setup of the GCG Plotting Environment

2.5.1. ... Using SeqLab
2.5.2. Plotting Setup from the Command Line Using the 'setplot' Utility
2.5.3. Plotting Setup from the Command Line Using Generic Commands
2.5.4. Verification of the Plotting Environment

2.6 Computer On-Line Documentation

2.7 GCG On-Line Documentation

2.8 GCG On-Line Documentation (SeqLab Version)

2.9 Network Help

2.9.1. USENET NEWS
2.9.2. BIOSCI BULLETIN BOARDS

2.10 Printed Documentation

2.11 Human Help


3. Data Transfer, Import, Handling, and Formatting

3.1 Transfer of Data between Computers

3.1.1. 'ftp'
3.1.2. DECnet Copy
3.1.3. Remote UNIX Copy
3.1.4. 'Kermit'
3.1.5. ZMODEM

3.2 File Handling Commands on Various Operating Systems

3.2.1. PC/Mac File Sharing
3.2.2. Navigation on VMS and UNIX systems
3.2.3. Manipulation of VMS and UNIX files
3.2.4. Output on VMS and UNIX systems
3.2.5. Local Site Information for Printing

3.3 Local Site Information for Editing

3.3.1. Start the 'vi' Editor
3.3.2. Typing Text
3.3.3. Help for Sophisticated Functions
3.3.4. Screen Refresh
3.3.5. Exit the Editor

3.4 Import of Sequences to the GCG Package

3.4.1. Sequence Formats
3.4.2. Importing Sequences into SeqLab
3.4.3. Reformatting Sequences into GCG (via command line)
3.4.4. PC / Mac Files

4. How to Get Information from the Databases

4.1 Principle

4.1.1. Production of Major Databases
4.1.2. Contents of a Sequence Database Entry
4.1.3. Scope of a Query
4.1.4. Networks of Databases
4.1.5. Computer Networks

4.2 Obtaining Data from Databases stored on the local resource

4.2.1. Using the GCG Software: 'lookup'
4.2.2. Using the GCG Software: 'stringsearch'
4.2.3. The GCG Database Browser in SeqLab
4.2.4. Find Sequences in the Databases with SRS
4.2.5. Find Sequences in the Databases with ENTREZ

4.3 View Sequence Data (on the GCG computer)

4.3.1. View Data on the Screen
4.3.2. Copy Data to Your Directory on the GCG host

4.4 Find Sequences in the Databases via Network Tools

4.4.1. Requirements
4.4.2. Gopher
4.4.3. WWW in Biology

4.5 Obtaining Information in Complex Context

4.5.1. Methods to collect similar entries
4.5.2. Paradigm to deal with multiple sequences
4.5.3. Programs to use several sequences simultaneously

5. Enter a Sequence

5.1 Considerations

5.1.1. The Place Of Birth Changed
5.1.2. Use of programs other than the GCG package
5.1.3. Entering Data Is Serious
5.1.4. Related Programs

5.2 Principle of SeqLab to load, edit, and manipulate sequences

5.2.1. Basic Concepts
5.2.2. Steps required before typing a new sequence
5.2.3. Editing Sequences in SeqLab

5.3 Principle of the Sequence Editor Program 'seqed' (Text mode)

5.4 Enter Your Sequence from Scratch

5.4.1. Start of the 'seqed' Program
5.4.2. Non-Sequence Information
5.4.3. Sequence Information and Exit
5.4.4. Computational aids

5.5 Modification of an Existing File or Database Sequence (Text Mode)

5.5.1. Navigation
5.5.2. Start of the 'seqed' Program
5.5.3. Screen Mode Commands
5.5.4. Command Line Mode Commands

5.6 Reformatting from RNA, DNA, MSF or other GCG Formats

5.6.1. Conversion between GCG formats
5.6.2. Problems in reformatting

5.7 The Fragment Assembly System (FAS)

5.7.1. Principle of Fragment Assembly
5.7.2. The Fragment Assembly Process
5.7.3. The Fragment Assembly Data Structures
5.7.4. GCG's implementation of the Fragment Assembly System

6. How to Handle a Single Sequence

6.1 Prerequisites for all examples and GCG programs

6.1.1. Text Output
6.1.2. Graphic Output, Launched via Command Line
6.1.3. Graphic Output, Launched via SeqLab
6.1.4. Notes on Graphics

6.2 Composition-Counting Programs

6.2.1. Principle of alphabets
6.2.2. Principle of counting sequence (-properties)
6.2.3. Detailed View on the "windows" Technique
6.2.4. Programs
6.2.5. Effect of the Window Size

6.3 Reading Frame Estimation Programs

6.3.1. Principle
6.3.2. GCG Programs for reading frames
6.3.3. Methods to verify the reading frame by sequence searching

6.4 Restriction Enzyme Mapping Programs

6.4.1. Principle of Patterns
6.4.2. Using Programs to Predict Primers in a Pattern Approach
6.4.3. Principle of Restriction Enzyme Mapping in a Pattern Approach
6.4.4. Programs

6.5 Translation

6.5.1. DNA to Protein
6.5.2. Protein to DNA
6.5.3. DNA to RNA and Vice Versa

6.6 Protein Tools

6.6.1. Secondary/Tertiary Structure Prediction
6.6.2. Schematic Visualisation of Secondary Structure
6.6.3. Fragmentation
6.6.4. Isoelectric Point
6.6.5. Simplification of Protein Sequences

6.7 Hints on additional software

6.7.1. Benefits of additional software
6.7.2. Disadvantages
6.7.3. Data transfer and formatting

7. Comparison of Two Sequences

7.1 Schematic Comparison

7.1.1. Principle of Sequence Alignment
7.1.2. Principle of Dotplots
7.1.3. Dotplot Principle - Improved
7.1.4. Dotplot Principle - Improved Again
7.1.5. Interpretation of Dotplots

7.2 GCG's Implementation of Schematic Comparison

7.2.1. Comparison Calculation
7.2.2. Display Program
7.2.3. Analysis of genomic sequences
7.2.4. Detection of Internal Repeats

7.3 Principle of the Analytical Comparison of Two Sequences

7.3.1. Motivation
7.3.2. Letter-by-Letter Alignment Prerequisites
7.3.3. Symbol Comparison Tables
7.3.4. Alignment Path Matrices

7.4 Comparison of DNA and protein sequences

7.4.1. Information content
7.4.2. Automatic translation on the fly

7.5 Comparison Programs

7.5.1. Using Sequence Comparison in SeqLab
7.5.2. Two Sequences of Similar Length
7.5.3. Two Sequences of Different Length
7.5.4. Two Overlapping Sequences
7.5.5. Two Sequences of Very Different Length
7.5.6. DNA and Protein Sequences
7.5.7. Programs to Display Two Aligned Sequences as Text
7.5.8. Programs to Display Two Aligned Sequences as Graphics
7.5.9. Significance Evaluation

8. Searching Patterns

8.1 Pattern Principles

8.1.1. Example of Pattern Benefit
8.1.2. Definition of a Pattern Language
8.1.3. Drawbacks of patterns

8.2 Creation of Patterns

8.2.1. Iterative Schedule
8.2.2. Considerations: Pattern Sensitivity

8.3 Programs

8.3.1. The 'findpatterns' Program
8.3.2. A PROSITE Database Searching Program
8.3.3. Other Pattern Motif Databases

9. Sequence Searching

9.1 Tools for Sequence Searching

9.1.1. Sensitive searching, using special computers
9.1.2. Extremely fast searching, using approximations

9.2 Sequence Searching with Heuristic Methods

9.2.1. Principle of Similarity Detection
9.2.2. Expectations
9.2.3. Programs

9.3 Rigorous Searching in the Twilight Zone

9.3.1. Principle
9.3.2. Programs

9.4 Searching Strategies

9.4.1. Tuning of your Sequence
9.4.2. Translate DNA
9.4.3. Tuning of Search Software Parameters
9.4.4. Statistical Analysis of Hits
9.4.5. Mapping Result Data
9.4.6. Analysis of Target Sequences

9.5 Use of Specific Searching Libraries

9.5.1. Database Sub-Libraries
9.5.2. Sequence Sets in BLAST searching
9.5.3. Sequence Lists (formerly File Of Sequence Names (FOSN))
9.5.4. Multiple Sequence Files (MSF)
9.5.5. Lists within the Wisconsin Package Interface (SeqLab)
9.5.6. Impact of Electronic Networks, Time Effects and Location
9.5.7. Creation of Your Own Databases

10. Sequence Families

10.1 Principle of Multiple Sequence Alignment

10.1.1. Prerequisites
10.1.2. Finding the Best
10.1.3. Grouping
10.1.4. Result Evaluation
10.1.5. Limitations

10.2 SeqLab - Editing Multiple Sequence Alignments Interactively

10.2.1. Starting the Editor Mode
10.2.2. Coloring of Sequences
10.2.3. A small guided tour using SeqLab: Creation of a profile

10.3 Programs to Deal with Multiple Sequences

10.3.1. Manual Start of an Alignment with the Multi-sequence Editor
10.3.2. Manual Editing of Sequence Alignments (text mode)
10.3.3. Manual Editing of File of Sequence Names
10.3.4. Automatic Creation of File of Sequence Names
10.3.5. Automatic Generation of a Multiple Sequence Alignment
10.3.6. Display of the Dendrogram Generated by the 'pileup' program
10.3.7. Textual Presentation of the Alignment
10.3.8. Graphic Presentation of Similarity in the Alignment
10.3.9. Schematic Presentation of Sequence Similarity

10.4 Phylogeny

10.4.1. Creation of a Tree
10.4.2. PAUP-based Methods

10.5 Manual Creation of Sequence Alignments (from sequence fragments)

10.6 Profiles

10.6.1. Principle
10.6.2. Formats of Sequences
10.6.3. Profile Generation
10.6.4. Profile Searching
10.6.5. Profile Analysis
10.6.6. Profile scanning

11. BioCompanion Information

11.1 History of This Document

11.1.1. Note on the Type of this Document

11.2 How to Use this Guide

11.2.1. Conventions
11.2.2. Exercises

11.3 BioCompanion and Copyright

11.3.1. GCG CD Version

11.4 Electronic versions of the BioCompanion

11.4.1. HTML Format
11.4.2. LaTeX Format
11.4.3. PostScript Format
11.4.4. RTF Format
11.4.5. JAMF source code

11.5 Printed Version and Publisher Information

11.5.1. Address
11.5.2. Configuration parameters
11.5.3. Configuration files
11.5.4. Prices
11.5.5. Acknowledgements
11.5.6. Known problems

11.6 Revisions

11.6.1. Versions before 3.x
11.6.2. Version 3.0
11.6.3. Version 3.1
11.6.4. Version 3.2
11.6.5. Version information

12. Appendix

12.1 Restrictions imposed by computer use

12.2 Security

12.2.1. Operational security
12.2.2. Data protection

12.3 What is an Operating System?

12.3.1. VMS
12.3.2. UNIX: ULTRIX
12.3.3. UNIX: OSF/1
12.3.4. UNIX: SunOS
12.3.5. UNIX: Solaris
12.3.6. UNIX: IRIX
12.3.7. Windows NT

12.4 Technical Issues: Local Setup

12.4.1. Account
12.4.2. Equipment
12.4.3. Local Area Network

12.5 Technical Issues: Networking

12.5.1. Local networks
12.5.2. Protocols
12.5.3. Internet
12.5.4. 'ftp', 'www', 'telnet'
12.5.5. 'HASSLE'

13. Index

13.1 A

13.2 B

13.3 C

13.4 D

13.5 E

13.6 F

13.7 G

13.8 H

13.9 I

13.10 J

13.11 K

13.12 L

13.13 M

13.14 N

13.15 O

13.16 P

13.17 Q

13.18 R

13.19 S

13.20 T

13.21 U

13.22 V

13.23 W

13.24 X

13.25 Y

13.26 Z

13.27



This document was written with the multi-format authoring tool JAMF (Just Another Metafile).

Publisher Information: Verlag Ute Doelz, FAX +41 61 6419012