Go to the abstract in the NAR 2001 Database Issue.
http://pauling.mbu.iisc.ernet.in/~pali
Balaji, S. (1), Sujatha, S. (1), Aruna, S. (2), Mhatre, S.N. (1), Srinivasan, N. (1)
(1) Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India
(2) Centre for Biotechnology, Anna University, Chennai 600 025, India
Contact ns@mbu.iisc.ernet.in
PALI (Release 1.3) (http://pauling.mbu.iisc.ernet.in/~pali) contains three-dimensional (3-D) structure-dependent sequence alignments as well as structure-based phylogenetic trees of homologous proteins in various families. The data set of homologous protein structures has been derived by consulting the SCOP database. The present release (1.3) comprises 614 families of homologous proteins, each made up of at least two members, involving 3050 protein domain structures and nearly 17000 structural alignments. This is a substantial increase over the previous release of PALI, which contained about 9000 alignments. Every member of a family has been structurally aligned with every other member of the same family (pairwise alignment), and all the members of a family have also been aligned using simultaneous superposition (multiple alignment). The structural alignments are performed using the program STAMP in a semi-automated way. Every family is also associated with two dendrograms calculated using PHYLIP: one based on a structural dissimilarity metric defined for every pairwise alignment, and the other based on the similarity of topologically equivalenced residues. The present release also includes the structural distance metric for each pair as defined by Levitt and Gerstein. Readily available alignments, with details of structural and sequence similarities, superposed coordinate sets and dendrograms, can be accessed family-wise. The families, alignments and dendrograms can also be reached by querying the database for protein pairs whose sequence or structural similarity falls within a specified range. PALI thus forms a useful resource for analysing the relationship between sequence and structure variation at a given level of sequence similarity. PALI also contains about 650 "orphans" (single-member families).
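The all-against-all pairing within a family and the range query over pairs can be illustrated with a minimal sketch. This is not PALI's implementation or interface: the function names, the domain identifiers and the percent-identity values below are all hypothetical and serve only to show the idea.

```python
from itertools import combinations


def all_pairs(members):
    """Return every unordered pair of family members: n*(n-1)/2 pairs."""
    return list(combinations(members, 2))


def pairs_in_range(pair_scores, low, high):
    """Select the pairs whose score lies in the closed interval [low, high]."""
    return sorted(pair for pair, score in pair_scores.items()
                  if low <= score <= high)


# Hypothetical family of four domains (identifiers are made up).
family = ["domA", "domB", "domC", "domD"]
pairs = all_pairs(family)

# Made-up percent sequence identities, one per pair, in the order of `pairs`.
identity = dict(zip(pairs, [62.0, 35.5, 18.0, 41.0, 22.5, 29.0]))

# Pairs with 20% to 40% sequence identity.
selected = pairs_in_range(identity, 20.0, 40.0)
print(len(pairs), selected)
```

A family of n members yields n(n-1)/2 pairwise alignments, which is why the number of alignments grows much faster than the number of domain structures.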
Using a web interface involving PSI-BLAST and KITSCH, it is possible to associate the sequence of a new protein with one of the families in PALI and automatically generate a phylogenetic tree combining the query sequence and proteins of known 3-D structure. Another new feature available with the present release is an interface to the IMPALA program, which matches the query sequence against the profiles (Position-Specific Scoring Matrices, PSSMs) of the families in PALI.
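As a rough illustration of what matching a sequence against a PSSM means, the sketch below scores every ungapped window of a query against a toy profile and keeps the best one. It is a deliberate simplification and not IMPALA's algorithm; the profile values, the query sequence and the function names are hypothetical.

```python
def score_window(pssm, segment):
    """Sum the per-position scores of a segment; unlisted residues score 0."""
    return sum(column.get(residue, 0) for column, residue in zip(pssm, segment))


def best_ungapped_match(pssm, sequence):
    """Return (start, score) of the best-scoring ungapped window, or None
    if the sequence is shorter than the profile."""
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = score_window(pssm, sequence[start:start + width])
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy three-position profile: one dict of residue scores per position.
toy_pssm = [
    {"G": 3, "A": 1},
    {"K": 4, "R": 2},
    {"S": 2, "T": 2},
]

match = best_ungapped_match(toy_pssm, "MAGKTL")
print(match)
```

Because each profile position carries its own residue scores, a conserved position can weigh more heavily than a variable one, which is what distinguishes profile matching from scoring with a single substitution matrix.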
1. Significant increase in the size of the database in terms of the number of protein domains and the number of structure-based alignments.
2. Availability of a structural divergence measure, in the form of the metric proposed by Levitt and Gerstein, for all the pairs of proteins in PALI.
3. Enhanced capability to match the query sequence with the profiles of PALI families using the profile-matching technique, IMPALA.
Category Structure