CDD

http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml

Marchler-Bauer, A., Panchenko, A.R., Shoemaker, B.A., Thiessen, P.A., Geer, L.Y., Bryant, S.H.

National Center for Biotechnology Information, National Library of Medicine, NIH Bldg. 38A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894

Contact: bauer@ncbi.nlm.nih.gov

Database Description

The Conserved Domain Database (CDD) is a compilation of multiple sequence alignments representing protein domains conserved in molecular evolution. It has been populated with alignment data from the public collections Pfam (1) and Smart (2), as well as with contributions from colleagues at NCBI. The current version of CDD (v1.53) contains 3551 such models. CDD alignments are linked to protein sequence and structure data in Entrez (3). The molecular structure viewer Cn3D (4) serves as a tool to interactively visualize alignments and three-dimensional structure, and to annotate three-dimensional residue coordinates with evolutionarily conserved features. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. Protein query sequences may be compared against databases of position-specific score matrices (PSSMs) derived from alignments in CDD, using a service named CD-Search, which can be found at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search runs reverse position-specific BLAST (RPS-BLAST), a variant of the widely used PSI-BLAST algorithm (5). CD-Search is run by default for protein-protein queries submitted to NCBI's BLAST service at http://www.ncbi.nlm.nih.gov/BLAST.
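
To make the idea of comparing a query sequence against a PSSM more concrete, the following Python sketch scores every window of a query against a single toy matrix and reports the best ungapped match. It is an illustration under stated assumptions, not RPS-BLAST itself: the real program performs a heuristic, gapped search against many PSSMs and reports statistical significance. The matrix values, the default score, and the names `TOY_PSSM`, `score_window`, and `best_ungapped_hit` are all invented for this example and do not come from CDD.

```python
from typing import Dict, List, Tuple

# Hypothetical 3-position PSSM: one dictionary of residue scores per
# alignment column. The values are made up for illustration only.
DEFAULT_SCORE = -2  # assumed score for residues not listed in a column
TOY_PSSM: List[Dict[str, int]] = [
    {"G": 5, "A": 2},
    {"K": 4, "R": 3},
    {"S": 4, "T": 4},
]


def score_window(pssm: List[Dict[str, int]], window: str) -> int:
    """Sum the position-specific scores of the residues in one window."""
    return sum(col.get(res, DEFAULT_SCORE) for col, res in zip(pssm, window))


def best_ungapped_hit(pssm: List[Dict[str, int]], query: str) -> Tuple[int, int]:
    """Return (best_score, start_offset) over all full-length windows.

    Ungapped sliding-window scoring only; ties resolve to the
    leftmost window.
    """
    width = len(pssm)
    if len(query) < width:
        raise ValueError("query is shorter than the model")
    scored = [
        (score_window(pssm, query[i:i + width]), i)
        for i in range(len(query) - width + 1)
    ]
    return max(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    score, offset = best_ungapped_hit(TOY_PSSM, "MAGKSLL")
    print(score, offset)
```

The key point the sketch tries to convey is that, unlike a single substitution matrix, a PSSM assigns a different score to the same residue depending on the alignment column, which is what lets a domain model reward residues conserved at specific positions.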

Acknowledgements

We are grateful to the authors of Pfam and Smart for creating an invaluable resource. We thank Chris Ponting, Alex Bateman, Eugene Koonin, and David Lipman for discussions and helpful suggestions, L. Aravind for providing sequence alignment data, Tom Madden and Sergei Shavirin for making RPS-BLAST available, Richard Copley for providing access to Smart data, and Naomi Ariel for help with the initial analysis of imported alignments.

References

1. Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Howe, K.L. and Sonnhammer, E.L.L. (2000) The Pfam protein families database. Nucleic Acids Res., 28, 263-266.
2. Schultz, J., Copley, R.R., Doerks, T., Ponting, C.P. and Bork, P. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res., 28, 231-234.
3. Wheeler, D.L., Church, D.M., Lash, A.E., Leipe, D.D., Madden, T.L., Pontius, J.U., Schuler, G.D., Schriml, L.M., Tatusova, T.A., Wagner, L. and Rapp, B.A. (2001) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 29, 11-16.
4. Wang, Y., Geer, L.Y., Chappey, C., Kans, J.A. and Bryant, S.H. (2000) Cn3D: sequence and structure views from Entrez. Trends Biochem. Sci., 25, 300-302.
5. Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389-3402.

Category: Protein Sequence Motifs

Go to the abstract in the NAR 2002 Database Issue.