Nucl. Acids Res. OUP

ASTRAL

http://astral.stanford.edu/

Chandonia, J. M. (1), Walker, N. (2), Lo Conte, L. (3), Koehl, P. (4), Levitt, M. (4), Brenner, S. E. (5)

(1) Berkeley Structural Genomics Center, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
(2) Department of Plant and Microbial Biology, University of California, Berkeley, CA, 94720-3102, USA
(3) MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
(4) Department of Structural Biology, D-109 Fairchild, Stanford University, Stanford, CA, USA
(5) Berkeley Structural Genomics Center, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA, and Department of Plant and Microbial Biology, University of California, Berkeley, CA, 94720-3102, USA

Contact   brenner@compbio.berkeley.edu


Database Description

The ASTRAL compendium provides a set of tools and databases designed to aid investigators in the analysis of protein structure, particularly through the use of sequence comparison. ASTRAL augments SCOP, a manual classification of protein domains according to structure, by providing a library of sequences, each of which corresponds to a structural domain classified in SCOP. To do so, the PDB entry for each SCOP domain is examined, and a mapping is constructed between the SEQRES information (which reflects the molecule studied) and the ATOM records (the atoms observed experimentally).

Because the majority of the structures in the PDB are very similar to others, it is frequently helpful to reduce the redundancy by selecting high-quality representative subsets. To do this, we compare all extracted sequences using standard sequence comparison algorithms. This information is then combined with a quality score that provides a first-order estimate of the resolution and regularity of crystallographically determined protein structures. We are thus able to provide sequence subsets with both limited redundancy and high-quality structural information. The level of redundancy in these subsets is user-defined and is based on one of three criteria: percent sequence identity, BLAST E-value, or SCOP similarity. These sequence subsets are an ideal starting point for homology-based structure prediction, and have also proven useful for testing new sequence comparison methods and for structure analysis.

Several major improvements have been made to the ASTRAL compendium since its initial release two years ago. The number of protein domains included has doubled from 15,190 to 30,867, and additional databases have been added. The Rapid Access Format (RAF) database contains manually curated mappings linking the amino acid sequences of proteins in the PDB (SEQRES records in the database entry) to the atoms experimentally observed (ATOM records), in a format designed for rapid access by automated tools. This information is used to derive sequences for protein domains in the SCOP database. In cases where a SCOP domain spans several protein chains, all of which can be traced back to a single genetic source, a 'genetic domain' sequence is created by concatenating the sequences of each chain in the order found in the original gene sequence. Both the standard library of SCOP sequences and a library including genetic domain sequences are available. Selected representative subsets derived from both libraries using the criteria described above are also included.
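
As an illustration of how a quality score and a redundancy threshold can be combined to pick representatives, the following is a minimal sketch of a greedy selection. It is not necessarily the procedure ASTRAL uses: the function name, the domain identifiers, the scores, and the precomputed pairwise identity table are all hypothetical, and a higher quality score is assumed to mean a better structure.

```python
from typing import Dict, FrozenSet, List


def select_representatives(
    quality: Dict[str, float],
    identity: Dict[FrozenSet[str], float],
    max_identity: float,
) -> List[str]:
    """Greedily pick domains, best quality first (hypothetical helper).

    A domain is kept only if its percent identity to every domain
    already kept is below max_identity. Pairs missing from the
    identity table are assumed to share no detectable similarity.
    """
    chosen: List[str] = []
    # Visit domains from highest to lowest quality; ties broken by name.
    for dom in sorted(quality, key=lambda d: (-quality[d], d)):
        if all(
            identity.get(frozenset((dom, rep)), 0.0) < max_identity
            for rep in chosen
        ):
            chosen.append(dom)
    return chosen


# Toy input: three made-up domains with made-up scores and identities.
quality = {"domA": 0.9, "domB": 0.7, "domC": 0.5}
identity = {
    frozenset(("domA", "domB")): 95.0,
    frozenset(("domA", "domC")): 30.0,
    frozenset(("domB", "domC")): 35.0,
}

print(select_representatives(quality, identity, max_identity=40.0))
```

With a 40% identity threshold, domB is dropped in favour of the higher-scoring domA, while domC is kept because it is sufficiently different from domA. The same loop would work with a BLAST E-value criterion by swapping the comparison.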

Recent Developments

Manually curated, Rapid Access Format sequence maps
Genetic Domain sequences
Translation table for chemically modified amino acids
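
To show what a translation table for chemically modified amino acids might look like in practice, here is a small sketch that maps modified residue names back to their parent amino acids before producing a one-letter sequence. The table entries are a few well-known examples chosen for illustration and are not ASTRAL's actual table; the function name and the use of "X" for unrecognised residues are likewise assumptions.

```python
# Illustrative subset only: modified residue name -> parent residue name.
MODIFIED_TO_STANDARD = {
    "MSE": "MET",  # selenomethionine
    "SEP": "SER",  # phosphoserine
    "TPO": "THR",  # phosphothreonine
    "PTR": "TYR",  # phosphotyrosine
}

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(residues):
    """Convert three-letter residue names to a one-letter sequence.

    Modified residues are first translated to their parent amino acid;
    anything still unrecognised becomes "X" (an assumption of this sketch).
    """
    letters = []
    for name in residues:
        parent = MODIFIED_TO_STANDARD.get(name, name)
        letters.append(THREE_TO_ONE.get(parent, "X"))
    return "".join(letters)


print(to_one_letter(["MSE", "ALA", "SEP", "GLY"]))
```

Keeping the modified-residue table separate from the standard code table means new modifications can be added without touching the core conversion.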

Acknowledgements

This project is funded by NIH grant 1 P50 GM62412. S.E.B. is supported by NIH grant 1 K22 HG00056 and is a Searle Scholar (01-L-116).

REFERENCES

  1. Brenner SE, Koehl P, Levitt M (2000) The ASTRAL compendium for protein structure and sequence analysis. Nucleic Acids Res., 28, 254-256.

Category   Structure

Go to the abstract in the NAR 2002 Database Issue.

 
