List of all available prediction types |
Type | Methods |
Prediction server | PredictProtein |
Databases searched for homologues | SWISS-PROT TrEMBL PDB BIG |
Alignment and database searching methods | MaxHom BLASTP PSIblast |
Sequence motif searching methods | ProSite ProDom SEG PredictNLS |
Prediction of protein structure | PHD PHDsec PHDacc PHDhtm PROF PROFsec PROFacc GLOBE TOPITS COILS CYSPRED |
Tools used for PP | MView |
Tools available with PP output | ESPript |
Categories of prediction methods evaluated |
Server | PredictProtein |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | PredictProtein is the collective name for all prediction programs run. |
Quote |
|
Authors | Burkhard Rost and Jinfeng Liu (CUBIC, Columbia Univ, New York) |
Contact | Jinfeng Liu (liu@cubic.bioc.columbia.edu) |
Version | 2000_06 |
Server | SWISS-PROT |
Site (URL) | http://expasy.cbr.nrc.ca/sprot/ |
About | SWISS-PROT is an annotated protein sequence database established in 1986 and maintained collaboratively, since 1987, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library (now the EMBL Outstation - The European Bioinformatics Institute (EBI)). The SWISS-PROT protein sequence data bank consists of sequence entries. |
Quote |
|
Authors | Amos Bairoch (ExPASy, Geneva, Switzerland) and Rolf Apweiler (EBI, Hinxton, England) |
Contact | Amos Bairoch (Amos.Bairoch@isb-sib.ch) |
Version | 39 (05/2000), updated weekly |
Server | TrEMBL |
Site (URL) | http://www.ebi.ac.uk/swissprot/ |
About | TrEMBL is a computer-annotated supplement of SWISS-PROT that contains all the translations of EMBL nucleotide sequence entries not yet integrated in SWISS-PROT. |
Quote |
|
Authors | Rolf Apweiler (EBI, Hinxton, England) |
Contact | Rolf Apweiler (Rolf.Apweiler@ebi.ac.uk) |
Version | 05/2000, updated weekly |
Server | PDB |
Site (URL) | http://www.rcsb.org/pdb/ |
About | PDB contains proteins of experimentally known three-dimensional structure. |
Quote |
|
Authors | RCSB consortium |
Contact | Phil Bourne (bourne@sdsc.edu) |
Version | updated weekly |
Server | BIG |
Site (URL) | local at CUBIC |
About | BIG is our (CUBIC) in-house version merging SWISS-PROT, TrEMBL and PDB. |
Quote |
|
Authors | CUBIC group, Columbia University, New York |
Contact | Dariusz Przybylski (dudek@cubic.bioc.columbia.edu) |
Version | updated weekly |
Server | MaxHom |
Site (URL) | local at CUBIC |
About | MaxHom is a dynamic multiple sequence alignment program that finds similar sequences in a database.
MaxHom builds up a protein family (defined as all closely related proteins likely to have similar structures) in two steps:
|
Quote |
|
Authors | Reinhard Schneider (LION, Boston) and Chris Sander (Millennium, Boston) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1.2000.06 |
Server | BLASTP |
Site (URL) | http://www.ncbi.nlm.nih.gov/BLAST/ |
About | BLASTP is a fast database search program.
|
Quote |
|
Authors | S Karlin and SF Altschul (NCBI, Washington) |
Contact | BLASTP admin (blast-help@ncbi.nlm.nih.gov) |
Version | 1.4 |
Server | PSIblast |
Site (URL) | http://www.ncbi.nlm.nih.gov/BLAST/ |
About | PSIblast is a fast, yet sensitive database search program.
We are running the iterated PSIblast on a subset of the BIG database with SWISS-PROT + TrEMBL + PDB sequences. The number of iterations, the cut-off thresholds, and the choice of which sequences are used from BIG have been optimised in our group. |
Quote |
|
Authors | S F Altschul, T L Madden, A A Schäffer, J Zhang, Z Zhang, W Miller, and D J Lipman |
Contact | BLASTP admin (blast-help@ncbi.nlm.nih.gov) |
Version | 2000_06 |
Server | ProSite |
Site (URL) | http://www.expasy.ch/prosite/ |
About | PROSITE is a database of functional motifs. ScanProsite finds all functional motifs in your sequence that are annotated in the ProSite database.
The following description is from the original ProSite site: ProSite is a method of determining what is the function of uncharacterized proteins translated from genomic or cDNA sequences. It consists of a database of biologically significant sites, patterns and profiles that help to reliably identify to which known family of protein (if any) a new sequence belongs. |
Quote |
|
Authors | Kay Hofmann, Philip Bucher, and Amos Bairoch (SIB, Geneva, Switzerland) |
Contact | Christian Sigrist (Christian.Sigrist@isb-sib.ch) |
Version | 1999_07 |
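The ProSite entry above describes a database of sites and patterns that are matched against a query sequence. As a minimal sketch of how a pattern scan can work, the Python below translates a pattern written in the PROSITE pattern notation into a regular expression and reports every match. The function names `prosite_to_regex` and `find_motifs` are hypothetical, the test sequence is made up, and the code is not the ScanProsite implementation; it handles only the basic notation elements (`x`, `[...]`, `{...}`, repeat counts, and the `<` / `>` terminus anchors).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # patterns conventionally end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split an optional repeat count such as (3) or (2,4) off the element.
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, repeat = match.group(1), match.group(2)
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # forbidden residues
            core = "[^" + core[1:-1] + "]"
        # "[ST]" style alternatives and single letters are already valid regex.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_motifs(sequence: str, pattern: str):
    """Return (1-based start, matched text) for every match, overlaps included."""
    # The lookahead lets overlapping occurrences be reported separately.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-{P}-[ST]-{P} is the well-known N-glycosylation site pattern.
    print(find_motifs("AANGSAA", "N-{P}-[ST]-{P}."))
```

Profiles, the other kind of ProSite entry mentioned in the description, need a scoring matrix rather than a regular expression and are outside the scope of this sketch.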
Server | ProDom |
Site (URL) | http://protein.toulouse.inra.fr/prodom.html |
About | ProDom is a database of putative protein domains.
The database is searched with BLAST for domains corresponding to your protein.
The following description is from the original ProDom site (which supplies a rather useful graphical interface to the ProDom database): The ProDom protein domain database consists of an automatic compilation of homologous domains detected in the SWISS-PROT database by the DOMAINER algorithm (ELL Sonnhammer & D Kahn, Prot. Sci., 1994, 3, 482-492). It has been devised to assist with the analysis of the domain arrangement of proteins. ProDom `domains' are inferred on the basis of conserved subsequences as found in various proteins. Such a conservation corresponds frequently, though not always, to genuine structural domains: therefore domain boundaries should be treated with caution. For some domain families experts have been asked to correct domain boundaries on the basis of both sequence and structural information. This expertise will complement the automated process and improve the quality of ProDom domain families. |
Quote |
|
Authors | Florence Corpet, Florence Servant, Jerome Gouzy, and Daniel Kahn |
Contact | Jerome Gouzy (Jerome.Gouzy@toulouse.inra.fr) |
Version | 2000.1 |
Server | SEG |
Site (URL) | http://trex.musc.edu/manuals/unix/seg.html |
About | SEG divides sequences into regions of low and high complexity. Low-complexity regions typically correspond to 'simple sequences' or 'compositionally-biased' regions.
The following description is from the original SEG documentation (JC Wootton & S Federhen, 1996, Meth Enzymology, 266, 554-571): SEG divides sequences into contrasting segments of low-complexity and high-complexity. Low-complexity segments defined by the algorithm represent "simple sequences" or "compositionally-biased regions". Locally-optimized low-complexity segments are produced at defined levels of stringency, based on formal definitions of local compositional complexity. The segment lengths and the number of segments per sequence are determined automatically by the algorithm. |
Quote |
|
Authors | John C Wootton and Scott Federhen (NCBI, Washington) |
Contact | Scott Federhen (federhen@ncbi.nlm.nih.gov) |
Version | 1994 |
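To illustrate the idea of low-complexity segments described in the SEG entry above, here is a simplified sketch in Python. It is not the SEG algorithm: SEG relies on formal definitions of local compositional complexity and locally optimised segments, whereas this sketch only computes the Shannon entropy of fixed-length windows and merges the windows that fall below a cut-off. The function names, the window length of 12 and the cut-off of 2.2 bits are assumptions chosen for illustration.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def low_complexity_segments(seq: str, window: int = 12, threshold: float = 2.2):
    """Return 1-based (start, end) ranges covered by low-entropy windows.

    window and threshold are illustrative assumptions, not documented settings.
    """
    flagged = [False] * len(seq)
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) <= threshold:
            # Mark every residue of a low-entropy window.
            for i in range(start, start + window):
                flagged[i] = True

    # Merge consecutive flagged residues into segments.
    segments = []
    seg_start = None
    for i, low in enumerate(flagged):
        if low and seg_start is None:
            seg_start = i
        elif not low and seg_start is not None:
            segments.append((seg_start + 1, i))
            seg_start = None
    if seg_start is not None:
        segments.append((seg_start + 1, len(seq)))
    return segments


if __name__ == "__main__":
    # Made-up sequence: a poly-Q stretch flanked by compositionally diverse residues.
    example = "ACDEFGHIKLMN" + "Q" * 14 + "PRSTVWYACDEF"
    print(low_complexity_segments(example))
```

Because whole windows are flagged, the reported segment extends a few residues into the flanking regions; the locally optimised segments described in the SEG documentation avoid this.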
Server | PredictNLS |
Site (URL) | http://cubic.bioc.columbia.edu/predictNLS |
About | PredictNLS finds experimentally known nuclear localisation signals in your protein.
|
Quote |
|
Authors | Raj Nair, Murad Cokol, and Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Raj Nair (nair@cubic.bioc.columbia.edu) |
Version | 2000_07 |
Server | PHD |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | PHD is a suite of programs predicting 1D structure (secondary structure, solvent accessibility) from multiple sequence alignments.
See PHDsec, PHDacc, and PHDhtm. |
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1996.1 |
Server | PHDsec |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | PHDsec predicts secondary structure from multiple sequence alignments.
Secondary structure is predicted by a system of neural networks with an expected average accuracy > 72% for the three states helix, strand and loop (Rost & Sander, PNAS, 1993, 90, 7558-7562; Rost & Sander, JMB, 1993, 232, 584-599; and Rost & Sander, Proteins, 1994, 19, 55-72). Evaluated on the same data set, PHDsec reaches a three-state accuracy ten percentage points higher than methods using only single sequence information, and more than six percentage points higher than, e.g., a method using alignment information based on statistics (Levin, Pascarella, Argos & Garnier, Prot. Engng., 6, 849-54, 1993). PHDsec predictions have three main features:
|
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1996.1 |
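The three-state accuracy quoted for PHDsec above is the percentage of residues whose predicted state (helix, strand or loop) agrees with the observed state. A minimal sketch of that measure in Python follows; the function name `q3` and the one-letter state codes H, E and L are assumptions used for illustration, and the two strings in the usage example are made up.

```python
def q3(observed: str, predicted: str) -> float:
    """Percentage of residues predicted in the correct one of three states.

    Both arguments are strings over the assumed alphabet H (helix),
    E (strand) and L (loop), one character per residue.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("empty input")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Hypothetical 10-residue example with two mispredicted residues.
    print(q3("HHHHEEEELL", "HHHLEEELLL"))
```

An average over a test set, as in the figure quoted above, would be obtained by applying the function to each protein and averaging the results.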
Server | PHDacc |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | PHDacc predicts per-residue solvent accessibility from multiple sequence alignments.
Solvent accessibility is predicted by a neural network method reaching a correlation coefficient (correlation between experimentally observed and predicted relative solvent accessibility) of 0.54, cross-validated on a set of 238 globular proteins (Rost & Sander, Proteins, 1994, 20, 216-226). The output of the neural network codes for 10 states of relative accessibility. Expressed in units of the difference between prediction by homology modelling (best method) and prediction at random (worst method), PHDacc is some 26 percentage points superior to a comparable neural network using three output states (buried, intermediate, exposed) and using no information from multiple alignments. |
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1996.1 |
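The correlation coefficient quoted for PHDacc above compares observed and predicted relative solvent accessibility residue by residue. The sketch below computes a Pearson correlation coefficient in plain Python, assuming that this is the coefficient meant; the function name and the accessibility values in the usage example are hypothetical.

```python
import math


def pearson_correlation(observed, predicted) -> float:
    """Pearson correlation between two equally long lists of numbers."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    covariance = sum((o - mean_obs) * (p - mean_pred)
                     for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_obs * var_pred)


if __name__ == "__main__":
    # Made-up relative accessibilities (percent) for six residues.
    observed = [5, 40, 80, 10, 60, 25]
    predicted = [10, 35, 60, 20, 70, 15]
    print(round(pearson_correlation(observed, predicted), 2))
```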
Server | PHDhtm |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | PHDhtm predicts the location and topology of transmembrane helices from multiple sequence alignments.
Transmembrane helices in integral membrane proteins are predicted by a system of neural networks. A shortcoming of the network system is that the predicted helices are often too long; these are cut by an empirical filter. The final prediction (Rost et al., Protein Science, 1995, 4, 521-533) has an expected per-residue accuracy of about 95%. The number of false positives, i.e., transmembrane helices predicted in globular proteins, is about 2% (Rost et al. 1996). The neural network prediction of transmembrane helices (PHDhtm) is refined by a dynamic programming-like algorithm. This method resulted in correct predictions of all transmembrane helices for 89% of the 131 proteins used in a cross-validation test; more than 98% of the transmembrane helices were correctly predicted. The output of this method is used to predict topology, i.e., the orientation of the N-terminus with respect to the membrane. The expected accuracy of the topology prediction is > 86%. Prediction accuracy is higher than average for eukaryotic proteins and lower than average for prokaryotes. PHDtopology was more accurate than all other methods tested on identical data sets in 1996 (Rost, Casadio & Fariselli, 1996a and 1996b). |
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1996.1 |
Server | PROF |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | Improved version of PHD: Profile-based neural
network prediction of protein structure.
|
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 2000_06 |
Server | PROFsec |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | Improved version of PHDsec: Profile-based neural
network prediction of protein secondary structure.
|
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 2000.06 |
Server | PROFacc |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | Improved version of PHDacc: Profile-based neural
network prediction of residue solvent accessibility.
|
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 2000.06 |
Server | GLOBE |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | GLOBE predicts the globularity of a protein.
Protein globularity is an additional result derived from the prediction of solvent accessibility. The method is not yet published; for more information, see the preliminary preprint. |
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1996.1 |
Server | TOPITS |
Site (URL) | http://cubic.bioc.columbia.edu/predictprotein |
About | TOPITS is a prediction-based threading program that finds remote structural homologues in the DSSP database.
Remote homologues (0-25% sequence identity) are detected by a novel prediction-based threading method (Rost 1995a and 1995b). The principal idea is to detect similar motifs of secondary structure and accessibility between a sequence of unknown structure and a known fold. For the recognition of similarities between entire folds, the expected accuracy (first hit of alignment list correct) is about 60% (Rost, ISMB95 Proceedings, 1995, AAAI Press, 314-321). If the goal is to correctly detect even short homologous fragments, still about 30% of the first hits are correct (compared to an accuracy of 14% for simple sequence alignments; see the full paper). Hits with z-scores above 3.0 are more reliable (accuracy > 60%). (Note: a number of similar or better threading services based on similar principles are available through META: http://cubic.bioc.columbia.edu/pp/submit_meta.html). |
Quote |
|
Authors | Burkhard Rost (CUBIC, Columbia Univ, New York) |
Contact | Burkhard Rost (rost@columbia.edu) |
Version | 1997.1 |
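The TOPITS entry above singles out hits with z-scores above 3.0 as more reliable. The Python sketch below shows one way such a filter could look. It assumes the ordinary definition of a z-score (raw score minus the mean of all scores, divided by their standard deviation); how TOPITS itself computes z-scores is not stated here, and the function name, hit names and scores are all hypothetical.

```python
import statistics


def reliable_hits(hits, z_cutoff: float = 3.0):
    """Return (name, z-score) for hits whose z-score exceeds z_cutoff.

    hits is a list of (name, raw alignment score) pairs. The z-score is
    taken relative to the whole hit list, which is an assumption made
    for this sketch.
    """
    scores = [score for _, score in hits]
    mean = statistics.mean(scores)
    spread = statistics.pstdev(scores)  # population standard deviation
    if spread == 0:
        return []  # all scores identical: no hit stands out
    selected = []
    for name, score in hits:
        z = (score - mean) / spread
        if z > z_cutoff:
            selected.append((name, z))
    # Best-scoring hits first.
    return sorted(selected, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Twenty unremarkable hits and one outlier (made-up scores).
    background = [("fold_%02d" % i, 9.0 if i % 2 else 11.0) for i in range(20)]
    for name, z in reliable_hits(background + [("fold_top", 30.0)]):
        print(name, round(z, 2))
```

A cut-off of this kind trades coverage for reliability: raising `z_cutoff` keeps fewer hits but, following the figures quoted above, the remaining ones are more likely to be correct.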
Server | COILS |
Site (URL) | local at CUBIC |
About | COILS finds coiled-coil regions in your protein.
The following description is from the original COILS site: COILS is a program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score to the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation. |
Quote |
|
Authors | Andrei Lupas (SmithKline Beecham, Collegeville) |
Contact | Andrei Lupas (lupasa00@mh.us.sbphrd.com) |
Version | 1999_2.2 |
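The COILS description above says that a similarity score is compared with the score distributions of globular and coiled-coil proteins to obtain a probability. The sketch below illustrates that comparison in Python under explicit assumptions: both distributions are modelled as Gaussians, and the means, standard deviations and the prior weight `prior_cc` are placeholder values invented for the example, not the parameters used by COILS.

```python
import math


def gaussian(x: float, mean: float, sd: float) -> float:
    """Probability density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def coiled_coil_probability(score: float,
                            coiled=(1.6, 0.2),
                            globular=(0.8, 0.2),
                            prior_cc: float = 0.5) -> float:
    """Probability that a score comes from the coiled-coil distribution.

    coiled and globular are (mean, standard deviation) pairs; all numeric
    defaults are illustrative placeholders, not COILS parameters.
    """
    weight_cc = prior_cc * gaussian(score, *coiled)
    weight_glob = (1.0 - prior_cc) * gaussian(score, *globular)
    total = weight_cc + weight_glob
    if total == 0.0:
        raise ValueError("score is too far from both distributions")
    return weight_cc / total


if __name__ == "__main__":
    for score in (0.8, 1.2, 1.6):
        print(score, round(coiled_coil_probability(score), 3))
```

Lowering `prior_cc` expresses the expectation that coiled-coil residues are rare, which makes the resulting probabilities more conservative.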
Server | CYSPRED |
Site (URL) | http://prion.biocomp.unibo.it/cyspred.html |
About | CYSPRED predicts whether the cysteine residues in your protein form disulfide bridges.
The following description is from the original CYSPRED publication: A neural network-based predictor is trained to distinguish the bonding states of cysteine in proteins starting from the residue chain. Training is performed using 2452 cysteine-containing segments extracted from 641 non homologous proteins of well resolved 3D structure. After a cross-validation procedure efficiency of the prediction scores as high as 72% when the predictor is trained using protein single sequences. The addition of evolutionary information in the form of multiple sequence alignment and a jury of neural networks increase the prediction efficiency up to 81%. Assessment of the goodness of the prediction with a reliability index indicates that more than 60% of the predictions have an accuracy level greater than 90%. A comparison with a statistical method previously described and tested on the same data base shows that the neural network-based predictor is performing with the highest efficiency. |
Quote |
|
Authors | Piero Fariselli and Rita Casadio (Bologna Univ, Bologna) |
Contact | Piero Fariselli (farisel@kaiser.alma.unibo.it) |
Version | 2000 |
Server | MView |
Site (URL) | http://mathbio.nimr.mrc.ac.uk/~nbrown/mview/ |
About | MView is a program converting multiple sequence
alignments into fancy HTML formatted output.
|
Quote |
|
Authors | Nigel Brown |
Contact | Nigel Brown (nbrown@nimr.mrc.ac.uk) |
Version | 1.40.2 |
Server | ESPript |
Site (URL) | http://cubic.bioc.columbia.edu/cgi/pp/ESPript |
About | ESPript converts the PredictProtein results
(and other alignments) into fancy images.
To use the tool, you have to do the following:
|
Quote |
|
Authors | Patrice Gouet and Emmanuel Courcelle |
Contact | Patrice Gouet (gouet@ipbs.fr) |
Version | 1.9 |