Go to the abstract in the NAR 1999 Database Issue.
Contact: pirmail@nbrf.georgetown.edu
The database includes alignments of protein sequences that represent superfamilies, families, and homology domains. Sequences belong to the same homeomorphic superfamily if they are homologous from end to end. Each superfamily is further classified into families containing sequences that are at least 45% identical. Many protein sequences are composed of distinct functional regions called domains, or of multiple copies of the same domain. Segments corresponding to the same homology domain in sequences from different superfamilies are extracted and aligned to form the homology domain alignments. ClustalW is used to generate the multiple sequence alignments, and ALNED, a PIR interactive alignment editor, is used to check and correct them. Annotation information, such as superfamily names, keywords, and taxonomy, is derived from the PIR-International Protein Sequence Database. The database has helped in classifying sequences, in defining new homology domains, and in propagating and standardizing protein names, features, and keywords among members of a family or superfamily.
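The 45% identity criterion for families can be illustrated with a minimal sketch. The abstract does not say how percent identity is computed or how sequences are grouped, so everything below is an assumption made for illustration: identity is counted over alignment columns where both sequences have a residue, grouping is single-linkage (a sequence joins a family if it reaches the threshold with any member), and the names `percent_identity`, `group_families`, and the toy sequences are hypothetical, not part of PIR software.

```python
from __future__ import annotations

from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped,
    and identity is matches divided by the remaining compared columns.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def group_families(aligned: dict[str, str], threshold: float = 45.0) -> list[set[str]]:
    """Group sequences into families by single-linkage at the given threshold.

    Assumption: single-linkage is used only for illustration; the source
    does not describe the grouping procedure.
    """
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical aligned sequences from one superfamily (toy data, not real entries).
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYLAKQR",
    "seqC": "MK-AYIGKHR",
    "seqD": "GHWPLCDEFS",
}

for family in group_families(toy):
    print(sorted(family))
```

With this toy input, seqA, seqB, and seqC fall into one family and seqD forms a family of its own. A different identity definition (for example, dividing by the full alignment length) would shift borderline cases.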
This work has been supported by NLM grant LM05798.
Category: Protein Sequence Motifs