Go to the abstract in the NAR 2001 Database Issue.
Kriventseva, E.V., Servant, F., Fleischmann, W., Zdobnov, E.V., Apweiler, R.
EBI-EMBL Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Contact zhenya@ebi.ac.uk
The CluSTr (Clusters of SWISS-PROT and TrEMBL proteins) database offers an automatic classification of SWISS-PROT and TrEMBL proteins into groups of related proteins. The clustering is based on analysis of all pairwise comparisons between protein sequences. Analysis has been carried out for different levels of protein similarity, yielding a hierarchical organisation of clusters. The database provides links to InterPro, which integrates information on protein families, domains and functional sites from PROSITE, PRINTS, Pfam, ProDom and SMART. Links to the InterPro graphical interface allow users to see at a glance whether proteins from the cluster share particular functional sites. CluSTr also provides cross-references to HSSP and PDB. The database is available for querying and browsing at http://www.ebi.ac.uk/clustr.
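The idea of clustering at several similarity levels to obtain a hierarchy can be illustrated with a small sketch. This is not the CluSTr implementation: the abstract does not specify the linkage rule or the scoring scheme, so the sketch assumes single-linkage grouping (two proteins join the same cluster if any chain of pairwise scores at or above the threshold connects them). The protein names, the scores, and the thresholds are invented toy values, and the function names `cluster_at_threshold` and `hierarchy` are hypothetical.

```python
from collections import defaultdict


def cluster_at_threshold(proteins, scores, threshold):
    """Group proteins by single linkage (an assumption, not CluSTr's stated rule).

    `scores` maps an unordered protein pair (a, b) to a similarity score;
    pairs scoring at or above `threshold` are placed in the same cluster.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    # Return clusters in a deterministic order for easy comparison.
    return sorted(sorted(group) for group in groups.values())


def hierarchy(proteins, scores, thresholds):
    """Cluster at each threshold; stricter thresholds split looser clusters."""
    return {t: cluster_at_threshold(proteins, scores, t) for t in sorted(thresholds)}


# Toy data, invented for illustration only.
PROTEINS = ["P1", "P2", "P3", "P4"]
SCORES = {
    ("P1", "P2"): 90,
    ("P2", "P3"): 60,
    ("P1", "P3"): 55,
    ("P3", "P4"): 20,
    ("P1", "P4"): 10,
    ("P2", "P4"): 15,
}
LEVELS = hierarchy(PROTEINS, SCORES, [50, 80])

if __name__ == "__main__":
    for level, clusters in LEVELS.items():
        print(level, clusters)
```

In this toy example every cluster found at the stricter level is contained in a cluster at the looser level, which is the nesting property that makes the set of clusterings a hierarchy.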
Over the last year our effort was focused on creating clusters for complete proteome sets. Clusters for 42 complete prokaryotic proteomes were built. In addition to the complete eukaryotic proteomes of Caenorhabditis elegans, Saccharomyces cerevisiae and Drosophila melanogaster, which were already available last year, we added data for Arabidopsis thaliana. Since complete coding sequence predictions for Homo sapiens and Mus musculus are not yet available in the EMBL nucleotide sequence database, we provide clusters for SWISS-PROT and TrEMBL proteins from SPTr and Ensembl (http://www.ensembl.org) draft complete proteomes.
We thank Gene-IT for technical support.
Category Protein Sequence Motifs