Database Maintenance



TABLE OF CONTENTS

Version 8.1-UNIX

Printed: April 22, 1996 15:51

These programs are used at several sites to build additional databases in GCG format.

DbStats

DbStats counts the number of entries and the total length of the sequence entries in a GCG-formatted database.
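As a rough illustration of the kind of tally described above (not DbStats itself), the sketch below counts entries and sums their lengths. The function name `db_stats` and the assumption that the sequences are already available as plain strings are hypothetical; reading a real GCG database is not shown.

```python
def db_stats(sequences):
    """Return (number_of_entries, total_sequence_length).

    `sequences` is any iterable of sequence strings. How the sequences
    are read from a GCG database is outside this sketch (assumption).
    """
    entries = 0
    total_length = 0
    for seq in sequences:
        entries += 1
        total_length += len(seq)
    return entries, total_length
```

For example, `db_stats(["ACGT", "AC"])` returns `(2, 6)`.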

GbOnly

GbOnly creates a list of GenBank entries that have accession numbers not found in the latest release of the EMBL database.

PirOnly

PirOnly and related programs select entries from PIR that are not included in the latest release of SWISS-PROT.

CheckLen

CheckLen calculates five checksums and the sequence length for each entry in a database and writes them to a file, which can then be used to cross-check quickly for identical sequences.
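The idea of fingerprinting each entry by its length plus several checksums can be sketched as follows. This is only an illustration: the five checksums CheckLen actually computes are not specified here, so the five functions used below (CRC-32, Adler-32, SHA-1, SHA-256, BLAKE2b) are stand-ins, and the name `fingerprint` and the case-folding step are assumptions.

```python
import hashlib
import zlib


def fingerprint(sequence):
    """Return (length, five checksums) for one sequence string.

    Assumptions: the five checksum functions are stand-ins, not the
    ones CheckLen uses, and sequences are compared case-insensitively.
    """
    data = sequence.upper().encode("ascii")
    return (
        len(data),
        zlib.crc32(data),
        zlib.adler32(data),
        hashlib.sha1(data).hexdigest(),
        hashlib.sha256(data).hexdigest(),
        hashlib.blake2b(data).hexdigest(),
    )
```

Two entries whose fingerprints are equal are very likely to hold identical sequences, so comparing fingerprints avoids comparing the full sequences directly.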

CheckLenComp

CheckLenComp compares two sorted CheckLen output files and produces a list of the entries in the first file that are not found in the second.
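Because both inputs are sorted, a comparison of this kind can be done in a single pass over each file. The sketch below shows that merge-style walk on generic sorted items; it is not CheckLenComp's code, the name `only_in_first` is hypothetical, and it assumes each item (for example, one line of checksums and length) is directly comparable and sorted in the same ascending order in both inputs.

```python
def only_in_first(first_sorted, second_sorted):
    """Yield items of first_sorted that do not occur in second_sorted.

    Both inputs must be sorted in the same ascending order and must
    not contain None (None is used as the end-of-input marker).
    """
    second_iter = iter(second_sorted)
    current = next(second_iter, None)
    for item in first_sorted:
        # Advance the second stream until it catches up with item.
        while current is not None and current < item:
            current = next(second_iter, None)
        # Report item when the second stream has no matching entry.
        if current is None or current != item:
            yield item
```

For example, `list(only_in_first(["a", "b", "d"], ["b", "c"]))` gives `["a", "d"]`.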

KabatToGcg

KabatToGcg creates GCG data libraries from Kabat distribution files.

SeqDbToGcg

SeqDbToGcg converts the SeqDb database distribution files into a database in GCG format.

ConvertEnz

ConvertEnz reads lines extracted from the ENZYME database and converts them to lowercase.

Ig2Nbrf

Ig2Nbrf is a utility program that converts an IG-formatted file into an NBRF-formatted database that PirToGcg can index.

EMBLtoGCGSC

EMBLtoGCGSC is the Sanger Centre's modification of GCG's EMBLtoGCG, which reformats EMBL and SWISS-PROT flat sequence files into GCG data libraries.