ExPASy Home page Site Map Search ExPASy Contact us PROSITE
Hosted by NCSC USMirror sites:Canada China Korea Switzerland Taiwan

The PROSITE database of protein families and domains
Release Notes

Release 17, December 2001


Table of contents

 1   Introduction
 2   Description of the changes made to PROSITE since release 16.0
 3   Forthcoming changes
 4   Status of the PROSITE files
 5   FTP access to PROSITE
 6   Acknowledgments

(1)   Introduction

From release 17.0 onwards, the PROSITE database will be distributed separately from the SWISS-PROT release. This release of PROSITE contains 1,108 documentation entries describing 1,501 different patterns, rules or profiles/matrices. Since release 16.0, 127 entries have been added and 250 entries have been updated.

The following table shows the growth of the database since its creation in 1989.

Rel.  Date    Doc  Entries  Note
1.0   03/89    58       60  Only released in PC/Gene (Version 5.16)
2.0   03/89   129      132  Only released in PC/Gene (Version 6.00)
3.0   05/89     ?      160
4.0   10/89     ?      202  Printed release (EMBL Biocomputing document)
5.0   04/90   296      338
6.0   11/90   375      433
7.0   05/91   441      508
8.0   11/91   530      605
9.0   06/91   580      689
10.0  12/92   635      803
11.0  10/93   715      927
12.0  06/94   785     1029  First release to include profiles
13.0  11/95   889     1167
14.0  12/97   997     1335
15.0  06/98  1014     1352
16.0  07/99  1034     1374
17.0  11/01  1108     1501


(2)   Description of the changes made to PROSITE since release 16.0


2.1   Weekly update of PROSITE
We have introduced weekly updates of PROSITE, which are available for FTP download from the directory:

2.2   Distribution of a reference tool to scan PROSITE
We are now distributing a program (ps_scan) that scans sequences against all PROSITE patterns, profiles and rules. ps_scan is a Perl program that scans one or several patterns, rules and/or profiles from PROSITE against one or several protein sequences in SWISS-PROT or FASTA format. It requires two external compiled programs from the PFTOOLS package: "pfscan" and "psa2msa".

2.3   Introduction of new CC qualifiers
We introduced five new qualifiers in the CC line of PROSITE matrix entries.
       2.3.1 The /MATRIX_TYPE qualifier
This qualifier describes the region in the protein identified by the profile. Example:
   CC   /MATRIX_TYPE=protein_domain; 

The matrix type can be: protein_domain, repeat_region, localization_signal or composition.

   protein_domain        Describes a profile directed against a conserved region of a protein.
   repeat_region         Describes a profile directed against a run of repeat units.
   localization_signal   Describes a profile directed against a region important for the localization of the protein in the cell.
   composition           Describes a profile directed against a region of low complexity or enriched in a given amino acid.
       2.3.2 The /SCALING_DB qualifier
This qualifier indicates which database was used to calibrate the profile. Example:
   CC   /SCALING_DB=window20_shuffled; 

Scaling databases currently used are:

   reversed    Is a protein database, randomized by taking the reverse sequence of each individual entry.
   window20    Is a protein database, locally shuffled in windows of 20 residues.
   window20_shuffled    Is a small version of a window20 protein database.
   db_global    Is a protein database, globally shuffled in windows of 20 residues.
       2.3.3 The /AUTHOR qualifier
This qualifier indicates the authors who created or updated the profile. Example:
   CC   /AUTHOR=K_Hofmann, P_Bucher;
The first name is the author of the profile, the second one the author of the last update.
       2.3.4 The /FT_KEY and /FT_DESC qualifiers
These qualifiers give a short, computer-readable description of the region identified by the profile. They are based on the SWISS-PROT Feature Table key and Feature Table description currently used to define that region. Example:
   CC   /FT_KEY=DOMAIN; /FT_DESC=KRINGLE.
FT_KEY can be NP_BIND, MOTIF, DOMAIN, REPEAT, DNA_BIND or ZN_FING.

More details on feature keys and feature descriptions can be found in the SWISS-PROT user manual.
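
The qualifier syntax shown in the examples above can be read with a few lines of Python. This is only a minimal sketch: it assumes that each qualifier has the form /NAME=value, that qualifiers on one line are separated by ";" and that values never contain ";". The function names (parse_cc_line, check_qualifiers, split_authors) are hypothetical and are not part of ps_scan or of any PROSITE tool.

```python
# Allowed values as listed in the release notes above.
MATRIX_TYPES = {"protein_domain", "repeat_region",
                "localization_signal", "composition"}
FT_KEYS = {"NP_BIND", "MOTIF", "DOMAIN", "REPEAT", "DNA_BIND", "ZN_FING"}


def parse_cc_line(line):
    """Return the qualifiers of one CC line as a {name: value} dict."""
    if not line.startswith("CC"):
        raise ValueError("not a CC line: %r" % line)
    # Drop the line code, surrounding blanks and the closing ';' or '.'.
    body = line[2:].strip().rstrip(";.")
    qualifiers = {}
    for part in body.split(";"):
        part = part.strip()
        if not part:
            continue
        # Assumption: every qualifier looks like "/NAME=value".
        if not part.startswith("/") or "=" not in part:
            raise ValueError("malformed qualifier: %r" % part)
        name, value = part[1:].split("=", 1)
        qualifiers[name.strip()] = value.strip()
    return qualifiers


def check_qualifiers(qualifiers):
    """Return a list of problems found (empty if none)."""
    problems = []
    mtype = qualifiers.get("MATRIX_TYPE")
    if mtype is not None and mtype not in MATRIX_TYPES:
        problems.append("unknown MATRIX_TYPE: " + mtype)
    key = qualifiers.get("FT_KEY")
    if key is not None and key not in FT_KEYS:
        problems.append("unknown FT_KEY: " + key)
    return problems


def split_authors(value):
    """Split an /AUTHOR value: profile author first, last updater second."""
    return [name.strip() for name in value.split(",")]


print(parse_cc_line("CC   /FT_KEY=DOMAIN; /FT_DESC=KRINGLE."))
```

Applied to the /FT_KEY example above, the parser yields one dictionary entry per qualifier, and check_qualifiers reports any MATRIX_TYPE or FT_KEY value that is not in the lists given in this section.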


(3)   Forthcoming changes

3.1   Introduction of PDB accession numbers in the text of PDOC

We plan to introduce PDB accession numbers in the text of PROSITE documentation entries. The format is indicated by the following example:

             (see <PDB:1D4M>)
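
A program reading the documentation text could pick up such cross-references with a regular expression. The sketch below assumes that the tag always has the form shown in the example, and that a PDB identifier is four characters long, starting with a digit; the function name find_pdb_ids is hypothetical.

```python
import re

# Assumed shape: "<PDB:" + 4-character identifier (digit, then three
# letters or digits) + ">", as in the example "<PDB:1D4M>".
PDB_REF = re.compile(r"<PDB:([0-9][A-Za-z0-9]{3})>")


def find_pdb_ids(text):
    """Return all PDB identifiers referenced in text, in order, upper-cased."""
    return [pdb_id.upper() for pdb_id in PDB_REF.findall(text)]


print(find_pdb_ids("(see <PDB:1D4M>)"))
```
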
3.2   Extension of the DR line length to 76 characters
SWISS-PROT plans to lengthen the mnemonic code for the protein name from a maximum of 4 characters to a maximum of 5. For example, the mnemonic code for the meiotic recombination protein rec10 is currently 'RE10'; once extended entry names are introduced, it could become the 5-letter code 'REC10'.

This SWISS-PROT modification will introduce a change in the size of PROSITE DR lines. As soon as SWISS-PROT introduces the 5-letter code in ID lines, we will extend PROSITE DR lines to 76 characters.

(4)   Status of the PROSITE files

PROSITE is distributed with different data and documentation files. The following table lists the files that are currently available.

prosuser.txt    User manual
profile.txt     Description of the profile syntax
psrelnot.txt    Release notes for the current release (17)
prosite.dat     Patterns, profiles and rules databases (updated weekly)
prosite.doc     Documentation database for each pattern and profile (updated weekly)
prosite.lis     List of documentation entries (updated weekly)
experts.txt     List of on-line experts for PROSITE and SWISS-PROT (updated weekly)
jourlist.txt    List of cited journals in PROSITE (updated weekly)
pautindex.txt   Authors index (updated weekly)
ps_98.txt       Announcement concerning PROSITE

Important notes

Two files are no longer distributed:

We have continued to add, in some PROSITE documentation entries, references to Web sites relevant to the subject under consideration. There are now 62 documents that include such links.

(5)   FTP access to PROSITE

PROSITE is available for download on the following anonymous FTP servers:

Organization Swiss Institute of Bioinformatics (SIB)
Address ftp.expasy.org, au.expasy.org/ftp/, ca.expasy.org/ftp/, cn.expasy.org/ftp/, kr.expasy.org/ftp/, tw.expasy.org/ftp/, us.expasy.org/ftp/
Directory /databases/prosite/

Organization European Bioinformatics Institute (EBI)
Address ftp.ebi.ac.uk
Directory /pub/databases/prosite/
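
As an illustration, the files can be fetched with any FTP client or with a short script. The following Python sketch uses the standard ftplib module with an anonymous login; the host and directory are the ones listed above for the SIB server, the choice of files is only an example, and the helper names (remote_path, download) are hypothetical. It assumes the server is reachable under that address, which may no longer be the case.

```python
import os
import posixpath
from ftplib import FTP

HOST = "ftp.expasy.org"              # SIB server as listed above
DIRECTORY = "/databases/prosite/"    # directory as listed above
FILES = ["prosite.dat", "prosite.doc"]  # example selection of files


def remote_path(directory, filename):
    """Build the full remote path of a file (FTP paths use '/')."""
    return posixpath.join(directory, filename)


def download(host, directory, filenames, dest_dir=".", timeout=60):
    """Download the given files from an anonymous FTP server."""
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()          # no arguments: anonymous login
        ftp.cwd(directory)
        for name in filenames:
            local_name = os.path.join(dest_dir, name)
            with open(local_name, "wb") as out:
                # Binary transfer keeps the files byte-for-byte identical.
                ftp.retrbinary("RETR " + name, out.write)


print(remote_path(DIRECTORY, "prosite.dat"))
```

Calling download(HOST, DIRECTORY, FILES) would then copy the two files into the current directory. The same function could be pointed at the EBI server by passing "ftp.ebi.ac.uk" and "/pub/databases/prosite/" instead.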

(6)   Acknowledgments

This release of PROSITE has been prepared by:

Amos Bairoch (1), Philipp Bucher (2), Laurent Falquet (2), Elisabeth Gasteiger (1), Alain Gateau (1), Alexandre Gattiker (1), Nicolas Hulo (1), Marco Pagni (2) and Christian Sigrist (1).

(1) Swiss Institute of Bioinformatics, Geneva, Switzerland;
(2) Swiss Institute of Bioinformatics, Lausanne, Switzerland.

