Nucleic Acids Res. OUP
Compilation Paper
Categories List
Alphabetical List
Search Summary Papers

SMART

http://smart.embl-heidelberg.de

Letunic, I.1, Goodstadt, L.2, Dickens, NJ.2, Doerks, T.1, Schultz, J.3, Ciccarelli, F.1, Copley, RR.1, Ponting, CP.2, Bork, P.1

1EMBL, Meyerhofstrasse 1, 69012 Heidelberg, Germany
2MRC Functional Genetics Unit, University of Oxford, Department of Human Anatomy and Genetics, South Parks Road, Oxford OX1 3QX UK
3Cellzome, Meyerhofstrasse 1, 69012 Heidelberg, Germany

Contact   Ivica.Letunic@EMBL-Heidelberg.DE


Database Description

SMART (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de) is a web-based resource used for the identification and annotation of protein domains and the analysis of domain architectures [1]. The current release has added more than 200 original hand-curated domain models. This brings the total to more than 600 domain families represented among nuclear, signalling and extracellular proteins. Extensive annotation for each domain family is available, providing information on function, subcellular localization, phyletic distribution and tertiary structure. Annotation now includes links to OMIM in cases where a human disease is associated with one or more mutations in a particular domain. A non-redundant sequence database is searched weekly for occurrences of SMART domains and several intrinsic features (transmembrane regions, coiled coils, signal peptides and internal repeats), and results are stored in a relational database. We have included new analysis methods and updated others. Internal protein repeats and transmembrane regions are detected using Prospero [2] and TMHMM2 [3], respectively. Improvements in the web interface now allow easy searches for proteins that contain user-defined combinations of domains or intrinsic features within a specified phyletic range. New advanced queries provide direct access to the SMART database using SQL, so users are no longer restricted to using simple AND-NOT logic. Protein lists are easily displayed or retrieved in FASTA format (with optional filtering for the domain of interest). Schematic representations of proteins use dynamically generated single images, which enables easy inclusion of SMART output in users' documents. SMART now provides multiple sequence alignments coloured by consensus, thereby highlighting patterns of residue conservation. SMART is now mirrored at http://smart.ox.ac.uk. 
In conclusion, SMART provides a unique combination of powerful and accurate analytical tools with simple visualization of results.
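As a rough illustration of why SQL access goes beyond simple AND-NOT logic, the sketch below counts domain copies per protein, which a plain presence/absence filter cannot express. The table name `protein_feature`, its columns, and the sample rows are hypothetical and chosen only for this example; the actual SMART schema is not described here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE protein_feature (
            protein_id TEXT NOT NULL,
            feature    TEXT NOT NULL,    -- domain name or intrinsic feature
            start_pos  INTEGER NOT NULL,
            end_pos    INTEGER NOT NULL
        )
        """
    )
    # Invented sample rows, for illustration only.
    rows = [
        ("P1", "SH2", 10, 100),
        ("P1", "SH3", 120, 180),
        ("P2", "SH2", 5, 95),
        ("P2", "transmembrane", 200, 222),
        ("P3", "SH3", 15, 70),
        ("P3", "SH3", 90, 150),
        ("P4", "SH2", 1, 90),
        ("P4", "SH2", 100, 190),
        ("P4", "SH3", 200, 260),
    ]
    conn.executemany("INSERT INTO protein_feature VALUES (?, ?, ?, ?)", rows)
    return conn


def proteins_with_repeated_domain(conn, domain, min_copies, excluded_feature):
    """Return proteins with at least `min_copies` of `domain`
    and no occurrence of `excluded_feature`."""
    query = """
        SELECT protein_id
        FROM protein_feature
        GROUP BY protein_id
        HAVING SUM(feature = ?) >= ?
           AND SUM(feature = ?) = 0
        ORDER BY protein_id
    """
    cursor = conn.execute(query, (domain, min_copies, excluded_feature))
    return [row[0] for row in cursor]


if __name__ == "__main__":
    demo = build_demo_db()
    print(proteins_with_repeated_domain(demo, "SH2", 2, "transmembrane"))
```

The `GROUP BY ... HAVING` clause is what carries the counting condition; a query against the real database would need the real table and column names.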

Recent Developments

- 200 new hand curated domain profiles
- Expanded domain annotations include missense mutations within domains that are known to be associated with human disease
- All SMART-generated alignments may be coloured by consensus using CHROMA [4]
- Transmembrane regions are predicted using TMHMM2 [3]
- Internal protein repeats are detected using Prospero [2]
- Protein intrinsic features (transmembrane regions, coiled-coils, signal peptides and internal repeats) are stored in a relational database and thus may be used in search queries in combination with domains
- 'Advanced query' form allows direct access to SMART database using SQL
- Schematic representations of proteins are displayed as single images thereby allowing easy inclusion in publications
- Results of SMART queries can be retrieved in FASTA-format either as full-length or domain-specific sequences
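
The difference between full-length and domain-specific FASTA retrieval can be sketched as follows. This is a minimal illustration, not SMART's own code: the function names, the `protein|domain|start-end` header layout, and the 1-based inclusive coordinate convention are all assumptions made for the example.

```python
def wrap_sequence(sequence, width=60):
    """Break a sequence into lines of at most `width` residues."""
    return "\n".join(
        sequence[i:i + width] for i in range(0, len(sequence), width)
    )


def domain_fasta(protein_id, sequence, domains, wanted=None):
    """Format domain subsequences as FASTA records.

    `domains` is a list of (name, start, end) tuples with 1-based,
    inclusive coordinates (an assumed convention). If `wanted` is given,
    only domains with that name are emitted.
    """
    records = []
    for name, start, end in domains:
        if wanted is not None and name != wanted:
            continue
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"domain {name} lies outside the sequence")
        subsequence = sequence[start - 1:end]
        # Hypothetical header layout: protein|domain|start-end
        header = f">{protein_id}|{name}|{start}-{end}"
        records.append(f"{header}\n{wrap_sequence(subsequence)}\n")
    return "".join(records)


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # invented example sequence
    doms = [("SH2", 3, 8), ("SH3", 10, 12)]
    print(domain_fasta("P1", seq, doms, wanted="SH2"), end="")
```

Calling `domain_fasta` with `wanted=None` emits one record per annotated domain; a full-length record is the special case of a single span covering the whole sequence.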

REFERENCES

  1. Schultz, J., et al. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res., 28, 231-4.
  2. Mott, R. (2000) Accurate formula for P-values of gapped local sequence and profile alignments. J. Mol. Biol., 300, 649-59.
  3. Krogh, A., et al. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol., 305, 567-80.
  4. Goodstadt, L. and Ponting, C.P. (2001) CHROMA: consensus-based colouring of multiple alignments for publication. Bioinformatics, in press.

Category   Protein Sequence Motifs

Go to the abstract in the NAR 2002 Database Issue.

 
