SAC-CASP6: Registered People







Name Arlo Randall
E-mail arandall@ics.uci.edu
Institution University of California, Irvine
Title No abstract
Abstract No abstract

Name Claudia Bertonati
E-mail cb2144@columbia.edu
Institution Columbia University
Title No abstract
Abstract No abstract

Name Yun He
E-mail jarodpardon@hotmail.com
Institution Institute of Biophysics of the Chinese Academy of Sciences, Beijing, China
Title No abstract
Abstract No abstract

Name Sanbo Qin
E-mail qsb300@yahoo.com
Institution Institute of Biophysics of the Chinese Academy of Sciences, Beijing, China
Title No abstract
Abstract No abstract

Name Dariusz M. Plewczynski
E-mail darman@bioinfo.pl
Institution Ab Initio Unit, Bioinformatics Laboratory, BioInfoBank Institute
Title Ab Initio Server Prototype for Prediction of Active Sites in Proteins.
Abstract We present an ab initio server prototype for the prediction of active sites. A list of possible active sites for a given query protein is built using the query protein sequence and the database of proteins annotated for a certain type of activation process in Swiss-Prot. All short segments of the query protein sequence centered around plausible active sites are compared with experimental profiles using a Support Vector Machine approach. These profiles describe both sequence and structure preferences for each type of active site. The local three-dimensional conformation of the backbone of the query protein chain around the examined site is predicted with a specially prepared library of short local structural segments (LSSs). The short sequence fragments from the query protein are matched with segments in the library using profile-profile alignment. The predicted local structure of the chain near the active site agrees qualitatively with experimental data fetched from the PDB (illustrated using phosphorylation sites). We also estimate the level of improvement over purely sequence-based methods gained by incorporating predicted structural information into the local description of phosphorylation sites.

Name Jinhui Ding
E-mail jhding@berkeley.edu
Institution QB3 Institute, UC Berkeley
Title No abstract
Abstract No abstract

Name Mickey Kosloff
E-mail mk2417@columbia.edu
Institution Columbia University
Title No abstract
Abstract No abstract

Name Adam C Marko
E-mail marko@psc.edu
Institution Pittsburgh Supercomputing Center
Title Comparative Modeling using Alternative Alignments and Statistical Potentials.
Abstract For even moderately difficult comparative modeling projects, there are often variable regions for which the alignment between target and template is highly arbitrary, and hence structures generated through such an alignment can have significant errors. In an effort to overcome these errors, we have developed a protein structure prediction pipeline that is currently applicable to these comparative modeling targets. This pipeline consisted of 1) generating hundreds of alternative alignments between target and template, 2) using these alignments to generate structures, 3) scoring these structures with a statistical potential and 4) visually examining the lowest-energy structures in an effort to pick the one closest to native. Programs were written in Perl to enable the flow between modeling programs. Our goal for this part of our modeling strategy was to demonstrate improvement in our comparative models over those constructed from a T-coffee alignment. Template structures were identified by performing a BLAST search through the non-redundant database, building profiles from related sequences with the MEME program and using those profiles to search through the PDB with the MAST program. We constructed 100-500 alternative alignments between template and target using the program probA. This program uses a probabilistic backtracking procedure that generates ensembles of suboptimal alignments with correct statistical weights. This ensemble of alignments was used to build structures using MODELLER version 6.2. The structures were then ranked using ProsaII. For some targets, we attempted to distinguish between favorable ProsaII models with an all-atom molecular mechanical potential coupled to a Generalized Born implicit solvent model.
This presentation will describe 1) the ability of the ProsaII program to identify the structures closest to native from an ensemble and 2) the improvements in alignment quality and native contacts obtained with this pipeline compared with constructing a model from a T-coffee multiple sequence alignment.

Name A.V.S.K.Mohan Katta
E-mail km@mrna.tn.nic.in
Institution Bioinformatics Center, School of Bioinformatics, Madurai Kamaraj University, Madurai - 625021
Title No abstract
Abstract No abstract

Name Patrick May
E-mail patrick.may@zib.de
Institution ZIB (Zuse Institute Berlin)
Title No abstract
Abstract No abstract

Name Pernille Andersen
E-mail pan@cbs.dtu.dk
Institution Center for Biological Sequence Analysis, Technical University of Denmark, Denmark
Title No abstract
Abstract No abstract

Name Anne Molgaard
E-mail anne@cbs.dtu.dk
Institution Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark
Title No abstract
Abstract No abstract

Name Jeson
E-mail J.Pekel@t-online.de
Institution Home
Title No abstract
Abstract No abstract

Name Gong Cheng
E-mail gcheng@u.washington.edu
Institution University of Washington
Title No abstract
Abstract No abstract

Name Jakub Pas
E-mail kuba@bioinfo.pl
Institution BioInfo.PL
Title Multimethod Protein Structure Prediction
Abstract To determine whether the structure of a target protein could be predicted using homology modeling, a PSI-BLAST [1] search was carried out against the non-redundant protein sequence database. PSI-BLAST iterations were performed using a manual inclusion/exclusion procedure. A multiple sequence alignment was then built with the CLUSTAL W [2] program using selected proteins from the PSI-BLAST profile. All alignments were manually inspected. Selection of the template was confirmed using the structure prediction METASERVER [3]. METASERVER was also used to choose a template when no significant hits were found in PSI-BLAST searches. In addition, other available information was used in an attempt to link the target with a protein of known structure: mainly literature searches, known metabolic pathways, gene expression data, position on the chromosome, distribution of folds in the organism and secondary structure prediction. Selected target-template structural alignments were visually inspected in SWISS PDB Viewer and modified if necessary. Molecular 3D models were then built using both the SWISS-MODEL [4] and MODELLER [6] programs. Initial models were subjected to detailed evaluation, mainly by additional visual inspection of structural consistency and using the Verify 3D program [5]. The same evaluation procedure was performed for the final models. More than one template protein was used if possible, after superimposition of their molecular structures using the 3d hit program [Plewczynski in press]. During the modeling procedure, superimposition of initial models was used to find the best possible backbone conformation. The overall quality of each modeled structure was evaluated in detail with the Verify 3D program.
1. Altschul S.F. et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17), 3389-3402
2. Thompson J.D. et al. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting. Nucleic Acids Res. 22, 4673-4680
3. Bujnicki J.M., Elofsson A., Fischer D., Rychlewski L. (2001) Structure prediction meta server. Bioinformatics 17(8), 750-751
4. Guex N., Peitsch M.C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18(15), 2714-2723
5. Luthy R., Bowie J.U., Eisenberg D. (1992) Assessment of protein models with three-dimensional profiles. Nature 356, 83-85
6. Sali A., Blundell T.L. (1993) Comparative protein modeling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779-815

Name Kazimierz O. Wrzeszczynski
E-mail kaz@cubic.bioc.columbia.edu
Institution Columbia University
Title No abstract
Abstract No abstract

Name Mindaugas Margelevicius
E-mail minmar@ibt.lt
Institution Institute of Biotechnology, Vilnius, Lithuania
Title PSI-BLAST-ISS: an intermediate sequence search tool for estimation of position-specific alignment reliability
Abstract The Intermediate Sequence Search (PSI-BLAST-ISS) tool is designed to assess the region-specific alignment reliability between two protein sequences (target and template). The main idea of the algorithm is to initiate additional PSI-BLAST [1] searches against the non-redundant sequence database for a set of sequences that are related both to the target and to the template [2]. The position-specific reliability of the alignment between the target and the template is then assessed by merging alignment data obtained from intermediate sequence searches and analyzing alignment convergence. Algorithm. The whole ISS procedure may be described as the following steps: (1) identification of multiple sequences related to both target and template sequences, (2) creation of a representative set from these sequences by filtering out close homologs, (3) generation of multiple sequence alignments for all sequences from this representative set by searching a sequence database containing both target and template sequences, (4) retention of all instances of significant matches between the target and the template from the multiple alignments obtained in step 3, (5) merging of all significant target-template alignments by taking one of the sequences (either the target or the template) as a frame of reference. Optionally, the procedure can include creation of a consensus template sequence derived from the final merged target-template alignment. Using this option, the position-specific reliability for multiple target-template alignments can be contrasted simultaneously. Implementation. PSI-BLAST-ISS is a collection of fairly independent modules linked together using Perl. As input, PSI-BLAST-ISS takes the target sequence, which is searched against the non-redundant sequence database to collect intermediate sequences. The set of intermediate sequences is currently filtered by CD-HIT [3], a sequence clustering program.
Each of the intermediate sequences is used to generate a sequence profile in the form of a PSI-BLAST checkpoint file by running a user-defined number of PSI-BLAST iterations. The resulting checkpoint files are then used to restart PSI-BLAST searches in a sequence database that has to include the sequences of both proteins of interest (target and template). In the common situation where the template is a structural template intended for use in comparative modeling, such a database may be derived by simply appending the target sequence to the PDB sequence database, which already contains the template sequence. After processing and merging the obtained target-template alignment variants, the final result is a multiple sequence alignment in which the reference sequence (say, the target) is aligned with multiple instances of the second sequence (the template) according to the different alignment variants.
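The merging and convergence analysis described above can be illustrated with a minimal sketch. It is not the PSI-BLAST-ISS implementation: it assumes, purely for illustration, that each target-template alignment variant is stored as a dictionary mapping a target position to the template position aligned with it, and it takes the fraction of variants agreeing on the most common template position as the reliability of that target position. The function name and data layout are hypothetical.

```python
from collections import Counter

def position_reliability(variants, target_length):
    """For each target position, return (most common template position, support).

    `variants` is a list of dicts {target_pos: template_pos}; this layout is an
    assumption made for the sketch. Support is the fraction of all variants
    that align the target position to the most common template position.
    """
    result = []
    for pos in range(target_length):
        # Collect the template position aligned to `pos` in every variant covering it.
        votes = Counter(v[pos] for v in variants if pos in v)
        if not votes:
            result.append((None, 0.0))
            continue
        best, count = votes.most_common(1)[0]
        result.append((best, count / len(variants)))
    return result

# Three hypothetical alignment variants, using the target as the frame of reference.
variants = [
    {0: 10, 1: 11, 2: 12},
    {0: 10, 1: 11, 2: 13},
    {0: 10, 1: 12},
]
reliability = position_reliability(variants, 3)
for pos, (template_pos, support) in enumerate(reliability):
    print(pos, template_pos, round(support, 2))
```

Positions where all variants converge on the same template residue receive a support of 1.0, while positions where the variants disagree receive lower values.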

Name Alessandro Cestaro
E-mail alessandro.cestaro@unipd.it
Institution CRIBI Biotech Centre, University of Padova
Title No abstract
Abstract No abstract

Name Gonzalo Lopez
E-mail glopez@cnb.uam.es
Institution Protein Design Group-CNB-CSIC
Title No abstract
Abstract No abstract

Name Iakes Ezkurdia
E-mail iakes@gredos.cnb.uam.es
Institution Protein Design Group CNB-CSIC
Title No abstract
Abstract No abstract

Name Nir Kalisman
E-mail nirka@cs.bgu.ac.il
Institution Ben-Gurion University, Israel
Title Refinement of Fold Recognition Models by Optimization with Cooperative Potentials
Abstract In the current round of CASP we attempted to refine selected models submitted to CAFASP4. In general, we tried to find some minimal transformation of the original FR model that would be physically reasonable. By "physically reasonable" we mean: all-atom, non-fragmented, no clashes or stretched bonds, minimal burial of hydrophilic groups, protein-like hydrogen-bond patterns and acceptable torsion angle combinations. To this end, we developed BEAUTIFY, a program handling many aspects of protein structure prediction, including loop building and energy-based optimization. BEAUTIFY is based on MESHI, our newly developed software package for molecular structure modeling. BEAUTIFY starts from an initial Cα model, gradually completing and refining it into an all-atom model. The model refinement is done by direct minimization under differentiable potentials. The MESHI potential relies heavily on cooperative terms, i.e. terms that couple the coordinates of a large set of atoms in a non-linear way. Minimization under a cooperative potential involves a concerted movement of many atoms toward a common goal. Three main cooperative terms were used: (1) Hydrogen bond pairs - assigning low energy values to HB pair patterns frequently observed in the PDB while penalizing rare ones. (2) Solvation - inducing a native-like solvation environment around every atom by forcing a certain number of neighboring carbon atoms into its vicinity. (3) Torsion pairs - assigning low energy values to frequently occurring torsion pair conformations, such as the allowed regions of the Ramachandran plot or the chi1/chi2 of common side-chain rotamers.

Name Agata Chmurzynska
E-mail agata@jay.au.poznan.pl
Institution Agricultural University of Poznan
Title No abstract
Abstract No abstract

Name Michal J. Gajda
E-mail misiek@genesilico.pl
Institution International Institute of Molecular and Cell Biology in Warsaw
Title Dr. FRankenstein: fully automated protein structure prediction by assembly of fragments of Fold Recognition models
Abstract The FRankenstein's Monster approach (Kosinski et al., 2003) is a protein modeling protocol that involves generation of multiple alternative models and assembly of a hybrid model based on identification of fragments that are most likely to be correct. Used manually by two groups in CASP-5, it proved to generate very accurate models in the comparative modeling (CM) and CM/fold-recognition (FR) categories (63% of all modeling targets in CASP-5) (Tramontano and Morea, 2003). One of the advantages of the approach is its ability to generate very accurate target-template alignments. The major disadvantage is the time required to build, evaluate and merge a large number of models. In order to reduce the workload for human predictors and provide a prototype of a new useful software tool, we have designed an automated structure predictor based on the FRankenstein's Monster approach.

Name Ali Razzazan
E-mail arazzazan@yahoo.com
Institution School of Pharmacy, Mashhad
Title How do the web facilities help predictors from head to toe of homology modeling?
Abstract In this project we applied methods based on evolutionary relationships, including threading and comparative modeling. The work was carried out using NCBI, Swiss-Prot, EMBL, PDB, SCOP, CATH and many other related web sites that perform single and multiple alignments, in order to collect similar sequences and find suitable template(s). Similarity searches were carried out with PSI-BLAST against nr and PDB to find highly identical proteins as a first line of similarity and homology study, as well as to find suitable templates. BLAST, PSI-BLAST and FASTA were used with PAM and BLOSUM similarity matrices at different thresholds. Programs such as ClustalX and SPDBViewer were applied to analyze sequence alignments and find conserved and identical regions within the query and similar sequences. The selected templates had high overall sequence similarity to the target, a small number and length of gaps in the alignment, and the best Z-score. The resolution and R-factor of a crystallographic structure were taken as indicators of the accuracy of the structure. Templates were carefully considered with regard to their fold and family in the SCOP and CATH servers. Threading was used when no suitable template(s) were found by BLAST against the PDB. This was carried out with the FUGUE server and 3D-PSSM. The FUGUE program scans a database of structural profiles, calculates sequence-structure compatibility scores and produces a list of potential homologous proteins and alignments. Secondary structure prediction was carried out using the JPRED and PROF servers. The model was created with MODELLER-6v2 on a high performance PC platform by satisfaction of spatial restraints. The quality of the model was predicted from the sequence similarity between the target and the template. Sequence identity above 30% was a relatively good predictor of the expected accuracy of the model.
The model was inspected in the SPDBViewer and ViewerLite programs, checking for clashing amino acids, Phi-Psi angles, agreement of the secondary structure with the secondary structure prediction, etc., before submission to PROCHECK, PROV and WHATCHECK on the BIOTECH server, iMOLTALK on the ExPASy server and also ERRAT and VERIFY3D on the UCLA server. Stereochemistry was assessed in WHATCHECK and WHAT IF, which showed proper Phi and Psi angles in the Ramachandran plot. Models were modified in MODELLER when needed, and the final steps were repeated until the problems were resolved. Models from CPHmodels, ESyPred3D and SWISS-MODEL were compared with our model to refine and confirm the fold and improve the model. The accuracy of the various models from different methods was relatively similar. Other factors such as template selection and alignment accuracy usually had a larger impact on model accuracy.

Name Marc Fasnacht
E-mail mf2206@columbia.edu
Institution Dept. Biochemistry and Molecular Biophysics, Columbia University
Title No abstract
Abstract No abstract

Name Andrea Bazzoli
E-mail Abazzoli@libero.it
Institution Università degli Studi di Milano - Dipartimento di Tecnologie dell'Informazione
Title No abstract
Abstract No abstract

Name Hamid Ramezani
E-mail hd-ramezani@mums.ac.ir
Institution Mashhad School of Pharmacy, Mashhad, Iran
Title Full-length homology modeling based upon multi-segmental and multiple alignments
Abstract Similarity searches were used to obtain comparative models for a selected query. The templates were chosen by similarity searches with PSI-BLAST and PHI-BLAST against the nr and PDB databases available through the ExPASy, NCBI and EBI services. After the templates were chosen, their sequences and related information were obtained from the NCBI web service, and a multiple alignment file was created with the Clustal W web service from several templates with high and medium identity percentages. This alignment gave an impression of the identical, conserved and semi-conserved regions of the query. A secondary structure prediction was also obtained from the JPred web service. The results from the Clustal W and JPred services were helpful in evaluating the model from the secondary structure point of view. In the first approach, models were constructed from a number of selected templates with the MODELLER 6v2 program, using alignment files created by the Clustal X program from the query and each template. The generated models were studied in detail using the SPDBViewer software, inspecting clashing amino acids and Phi-Psi angles and comparing the models with the secondary structure predictions already available. Web-based programs and tests such as ERRAT, VERIFY3D, WHAT_CHECK, WHAT IF and iMOLTALK were used to assess the models and perform full geometry analysis through the UCLA, Biotech and ExPASy web services. As a second approach, a multi-segmental alignment strategy was examined to create an alignment file. In this strategy, only the segment of a template that best matched a given segment of the query was selected for that region, and the remainder of the query sequence was covered by a second template, so that an integrated alignment file was prepared. The related models were built using the MODELLER and Clustal X programs, and they appeared to be improved in comparison with the models obtained from the first approach.
As the third approach, 3-D structure alignment files were produced using the FUGUE program available on the web. The models were generated with the MODELLER software and were better still than those built using the two previous approaches. The results of the three approaches suggest that 3-D structure alignment is more reliable than the other approaches in this context, probably because it also takes stereochemical considerations into account in the alignment. The second approach, namely creating an integrated template sequence for alignment, might be an improvement over the first approach. However, its success depends on the templates chosen for the preparation of the alignment file.

Name Lucy Forrest
E-mail lrf2103@columbia.edu
Institution Columbia University
Title No abstract
Abstract No abstract

Name Jejoong Yoo
E-mail jejoong@bawi.org
Institution Korea Institute for Advanced Study
Title Tertiary Structure Prediction for Comparative Modeling, Fold Recognition and New Fold Targets in CASP6
Abstract For blind prediction of 3D structures of CASP6 targets, we have developed a unified method that can be applied to all classes of targets, called CMCSA (Combined Modeling using Conformational Space Annealing). The CMCSA method is based on an energy function, designed from the information on the radius of gyration, hydrophobicity, C? - C? contacts, restraints from templates, restraints from super-fragments, restraints for β-pairing, hydrogen bond rewarding and steric hindrance, given as $E = w_{rg}E_{rg} + w_{hp}E_{hp} + w_{MJ}E_{MJ} + w_{rst}E_{rst} + w_{hb}E_{hb} + w_{sc}E_{sc}$ (1), where the w's are the weights of the energy components. Conformations are constructed by assembling fragments generated from PREDICT [1]. PREDICT provides the secondary structure information of target proteins, libraries of local structure fragments, and C? - C? restraints of super-fragments extracted from PDB_SELECT_90 by a fold recognition method developed by us. For fragment assembly, we have used PROFESY [2], which was successfully applied to new fold targets in CASP5. The conformational search was carried out by the conformational space annealing (CSA) method [3]. The final 100 conformations obtained by CSA are grouped into five clusters by a K-means algorithm, and the best conformation from each cluster is selected as a model.
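As a rough illustration of equation (1) and of the final selection step, the sketch below evaluates the weighted sum of energy components and picks one conformation per cluster. The weights and component values are invented for the example and are not the CMCSA parameters; the cluster labels are given by hand rather than computed with K-means; and "best" is assumed here to mean lowest total energy, which the abstract does not state explicitly.

```python
def total_energy(terms, weights):
    """Weighted sum E = sum_k w_k * E_k over the named energy components."""
    return sum(weights[name] * value for name, value in terms.items())

def best_per_cluster(energies, labels):
    """Return {cluster label: index of the lowest-energy conformation in it}."""
    best = {}
    for idx, (energy, label) in enumerate(zip(energies, labels)):
        if label not in best or energy < energies[best[label]]:
            best[label] = idx
    return best

# Illustrative weights only (assumption); keys follow the subscripts of eq. (1).
weights = {"rg": 1.0, "hp": 0.5, "MJ": 2.0, "rst": 1.0, "hb": 0.5, "sc": 1.0}

# Hypothetical energy components for three conformations.
conformations = [
    {"rg": 2.0, "hp": -4.0, "MJ": -3.0, "rst": 1.0, "hb": -2.0, "sc": 0.5},
    {"rg": 1.5, "hp": -2.0, "MJ": -1.0, "rst": 0.5, "hb": -1.0, "sc": 0.0},
    {"rg": 3.0, "hp": -6.0, "MJ": -4.0, "rst": 2.0, "hb": -4.0, "sc": 1.0},
]
energies = [total_energy(c, weights) for c in conformations]

# Cluster assignments would come from K-means; here they are supplied by hand.
labels = [0, 0, 1]
selected = best_per_cluster(energies, labels)
print(energies, selected)
```

Changing a single weight, for example raising the weight on the template restraints for targets with reliable templates, changes the ranking without touching the component definitions.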

Name Keehyoung Joo
E-mail newton@kias.re.kr
Institution Korea Institute for Advanced Study
Title Tertiary Structure Prediction for Comparative
Abstract For blind prediction of 3D structures of CASP6 targets, we have developed a unified method that can be applied to all classes of targets, called CMCSA (Combined Modeling using Conformational Space Annealing). The CMCSA method is based on an energy function designed from the information on the radius of gyration, hydrophobicity, C? - C? contacts, restraints from templates, restraints from super-fragments, restraints for β-pairing, hydrogen bond rewarding and steric hindrance. The energy function is given as $E = w_{rg}E_{rg} + w_{hp}E_{hp} + w_{MJ}E_{MJ} + w_{rst}E_{rst} + w_{hb}E_{hb} + w_{sc}E_{sc}$ (1), where the w's are the weights of the energy components. Conformations are constructed by assembling fragments generated from PREDICT [1]. PREDICT provides the secondary structure information of target proteins, libraries of local structure fragments, and C? - C? restraints of super-fragments extracted from PDB_SELECT_90 by a fold recognition method developed by us. For fragment assembly, we have used PROFESY [2], which was successfully applied to new fold targets in CASP5. The conformational search was carried out by the conformational space annealing (CSA) method [3]. A standard set of weights in eq. (1) is obtained by parameter optimization using "representative" proteins selected from the SCOP database. For targets without additional information, the standard weights are used. For targets with "sure" templates (homology and threading targets), a larger weight is assigned to the restraints from templates. Weights are varied depending on the secondary structure class: all alpha proteins, all beta proteins, a/b proteins, and a+b proteins.

Name Takashi Ishida
E-mail tak@bi.a.u-tokyo.ac.jp
Institution The University of Tokyo
Title No abstract
Abstract No abstract

Name Michael Sweredoski
E-mail msweredo@uci.edu
Institution University of California, Irvine
Title No abstract
Abstract No abstract

Name Elke Michalsky
E-mail elke.michalsky@charite.de
Institution Institute of Biochemistry, Charite (University Medicine Berlin)
Title Loops In Proteins (LIP) - a loop database for homology modelling
Abstract One of the most important and challenging tasks in protein modelling is the prediction of loops, as can be seen in the large variety of existing approaches. Loops In Proteins (LIP) is a database that includes all protein segments of up to 15 residues in length contained in the Protein Data Bank (PDB). In this study, the applicability of LIP to loop prediction in the framework of homology modelling is investigated. Searching the database for loop candidates takes less than one second on a desktop PC, and ranking them takes a few minutes. This is an order of magnitude faster than most existing procedures. The measure of accuracy is the Root Mean Square Deviation (RMSD) with respect to the main-chain atoms after local superposition of the target loop and the predicted loop. Loops of up to nine residues in length were modelled with a local RMSD of less than 1 Å, and those of up to 14 residues with an accuracy better than 2 Å. The results were compared in detail with a thoroughly evaluated and tested ab initio method published recently. The LIP method produced very good predictions. In particular, for longer loops it outperformed other methods.
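The accuracy measure used above can be written down in a few lines. The sketch below computes the RMSD between two equally long lists of main-chain atom coordinates. It assumes the two loops have already been superposed (the local superposition step itself, for example a Kabsch fit, is not shown), and the coordinates are invented for the example.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the two coordinate sets are already superposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical main-chain atom positions (in Å) of a target loop and a prediction.
target_loop = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted_loop = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
loop_rmsd = rmsd(target_loop, predicted_loop)
print(loop_rmsd)
```

In this toy case every atom is displaced by exactly 1 Å, so the RMSD equals that displacement.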

Name Yana Bromberg
E-mail yana.bromberg@dbmi.columbia.edu
Institution Columbia University
Title No abstract
Abstract No abstract

Name Alex Herbert
E-mail alex.herbert@ic.ac.uk
Institution Imperial College London
Title No abstract
Abstract No abstract

Name Dominik Gront
E-mail dgront@chem.uw.edu.pl
Institution Warsaw University
Title Multitemplate Modeling by a Hierarchy of High-Resolution Lattice Folding and All-Atom Refinement
Abstract Our method started from a number of molecular templates generated by threading metaservers. The twenty top-scoring templates from bioinfo.pl provided a large set of intramolecular distances. Additional restraints were derived from strongly predicted elements of secondary structure. These restraints were employed in Replica Exchange Monte Carlo simulations with the CABS protein modeling tool. Then, a large set of resulting distinct structures was subjected to a clustering procedure. An average linkage hierarchical clustering algorithm was employed, with drmsd as the measure of distance between structures. Cluster centroids were computed via average distance maps. Finally, 5-7 clusters were manually selected according to the cluster size, the average energy of its members and the average distance dispersion (as a measure of the density of a cluster). These centroids were refined and ranked using the Amber force field and MD simulations with explicit solvent.
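The distance measure used for clustering can be sketched as follows. This is one common definition of the distance RMSD (drmsd), namely the root mean square difference between corresponding intramolecular distances of two structures, and it may differ in normalization from the one used by the authors. The coordinates are invented for the example.

```python
import math
from itertools import combinations

def drmsd(struct_a, struct_b):
    """Distance RMSD between two structures given as lists of (x, y, z) points.

    Compares all pairwise intramolecular distances, so no superposition is
    needed: the value is unchanged by rotating or translating either structure.
    """
    if len(struct_a) != len(struct_b) or len(struct_a) < 2:
        raise ValueError("structures must have equal length and at least two points")
    pairs = list(combinations(range(len(struct_a)), 2))
    total = 0.0
    for i, j in pairs:
        d_a = math.dist(struct_a[i], struct_a[j])
        d_b = math.dist(struct_b[i], struct_b[j])
        total += (d_a - d_b) ** 2
    return math.sqrt(total / len(pairs))

# Two hypothetical three-residue traces; the third point differs between them.
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
# The same trace as trace_a, shifted by 5 along x.
trace_a_shifted = [(x + 5.0, y, z) for x, y, z in trace_a]

print(drmsd(trace_a, trace_b))
print(drmsd(trace_a, trace_a_shifted))
```

Because only internal distances enter the measure, a matrix of drmsd values can be fed directly to a hierarchical clustering routine without first aligning the structures.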

Name Riccardo Matjaz Bennett-Lovsey
E-mail riccardo.bennett-lovsey@imperial.ac.uk
Institution Imperial College, London
Title No abstract
Abstract No abstract

Name Maciej Milostan
E-mail Maciej.Milostan@cs.put.poznan.pl
Institution Institute of Computing Science, Poznan University of Technology
Title No abstract
Abstract No abstract

Name Josue Samayoa
E-mail josue@soe.ucsc.edu
Institution University of California at Santa Cruz
Title Homology Based Modeling with Rosetta and NMR Data
Abstract The promise of structural genomics is to elucidate the three-dimensional coordinates of many proteins in a high-throughput manner. For many proteins, it is difficult to collect sufficient data for a complete experimental structure determination, providing a need for methods that combine structure prediction techniques with limited experimental data sets to obtain accurate structural models. Here we address the problem of incorporating residual dipolar coupling data (RDC) into a homology-based modeling strategy using the Rosetta method (Rohl 2004, Methods Enzymol.), with the goal of using RDC data to extend the range of homology-based methods to proteins with only remote homologs of known structure. One of the main problems encountered in homology-based modeling is the identification of the correct alignment between a target sequence of unknown structure and a homologous template protein of known structure. Alignment errors can lead to severe adverse effects in structure predictions, and alignment accuracy is often the limiting factor in overall model quality. Our approach utilizes Rosetta and RDC data to identify the correct alignment between a query sequence and a template structure from a large ensemble of candidate alignments generated by the K*Sync algorithm (Chivian 2003). Our method starts by creating a pool of potential alignments between a query sequence and a template structure using K*Sync. Each alignment is converted into a three-dimensional structure with reduced side-chain representations, and loop regions are constructed using Rosetta (Rohl 2004, Proteins: Struct. Funct. Genet.). Each model is then refined to optimize agreement with the experimental data, and the models are ranked using Rosetta's knowledge-based energy potential. The ability of this ranking to identify models with low RMSD to the native conformation is assessed. 
Additionally, we assessed the added value of adding full-atom side-chains to each model with the Rosetta rotamer-packing algorithm and ranking the models using the Rosetta atom-based energy function. RDC data were provided by the South East Collaboratory for Structural Genomics (SECSG) and from the Bio-Magnetic Resonance Bank (BMRB). References: Rohl CA, Strauss CE, Misura KM, Baker D. (2004) Protein structure prediction using Rosetta. Methods Enzymol. 383:66-93. Chivian D, Kim DE, Malmstrom L. (2003) Automated prediction of CASP-5 structures using the Robetta server. Proteins. 53 Suppl 6:524-33. Rohl CA, Strauss CEM, Chivian D, Baker D. Modeling Structurally Variable Regions in Homologous Proteins using Rosetta. (2004) Proteins: Struct. Funct. Genet. 55: 656-677.

Name Firas Khatib
E-mail bort@soe.ucsc.edu
Institution University of California at Santa Cruz
Title Knotfind: A Simple Algorithm for Finding Knots in Proteins
Abstract Knots in polypeptide chains have been found in very few proteins, but any comparative modeling approach that creates chain breaks in the backbone may increase the likelihood of forming knots. The homology-based modeling strategy used by the automated server Robetta, for example, resulted in knotted models being predicted for some CASP6 targets. We have developed a simple algorithm, Knotfind, for detecting knots in a CA trace and have implemented it in Rosetta. For each knotted chain, the residue numbers and the corresponding coordinates that form the knot are returned, making the exact location of the knot simple to detect. Upon running Knotfind on decoy sets from CASP5 homology-based predictions made with Rosetta, we found a significant number of knotted decoys for a subset of targets, while decoy sets for other targets were knot-free. We developed a predictor to identify conditions under which our modeling strategy was likely to result in knotted decoys. This prediction is based on the total number of loop regions that need to be built, the different lengths of each of the loop regions, and the proximity between different loop regions. In CASP6, Knotfind was used as part of our modeling protocol, with the goal of rejecting knotted structures early in the prediction process to avoid spending computational search time on non-protein-like conformations. For long loop regions, which we judged likely to form knots, a library of possible conformations was constructed using the Rosetta de novo fragment assembly method in the context of sequentially adjacent aligned regions. Loop conformations were screened in the context of the entire protein model to identify potential knots. In addition, loop conformations were screened for steric clashes with the template and geometric fit to the template. The Knotfind algorithm and these additional loop modeling techniques may be useful for other protein structure prediction programs as well. 
Currently we are adapting the Knotfind algorithm to detect other non-protein-like conformations that occur at high frequency in Rosetta decoys.

Name George Shackelford
E-mail ggshack@cse.ucsc.edu
Institution University of California, Santa Cruz
Title Residue-Residue Contact Prediction Using Mutual Information and Neural Networks
Abstract We present a neural network predictor of residue-residue contacts that uses statistical analysis of mutual information and local property values as inputs. The results improve on earlier efforts [1,2]. Two problems with earlier efforts to use mutual information are small sample size and biased sampling due to over-representation of sub-family sequences in the alignment. We address both problems with two statistical corrections for small samples and an aggressive thinning of the sequences. We use SAM-T04 to obtain the alignments [3]. For each pair of alignment columns we randomize the contingency table while holding the marginal sums fixed, and build a histogram of the mutual information over the randomizations. We use this histogram to adjust for small sample sizes in two ways. The first corrects for mutual information arising by chance by subtracting the mean of the histogram from the raw mutual information to give a corrected mutual information. The second fits a gamma distribution to the histogram and uses that distribution to calculate an e-value. Both of these values show a significant improvement over raw mutual information. We compensate for the bias of over-represented sequences by thinning the sequences to a series of subsets with increasing dissimilarity between the sequences. We find that thinning in general improved results, and that thinning to 35% sequence similarity between all sequences gives the best balance between bias and sample size. Finally, we improve on these predictions by using them as part of the inputs to a neural network. The network has 280 inputs consisting of sequence length, separation, corrected mutual information and e-values for different thinnings, distributions of both residues including neighboring residues, and burial and secondary structure predictions.
The network's single output is the probability that the respective residues are in contact. The results of preliminary testing suggest a significant improvement over previous predictors. The predictions were available as constraints for the "undertaker" program here at UC Santa Cruz. There are currently no results to show whether those constraints improved the tertiary structure predictions. 1. Gobel,U., Sander,C., Schneider,R., Valencia,A., "Correlated mutations and residue contacts in proteins", Proteins, 18, 309-317, 1994. 2. Fariselli,P., Olmea,O., Valencia,A., Casadio,R., "Prediction of contact maps with neural networks and correlated mutations", Protein Engineering, 14:11, 835-843, 2001. 3. Karplus,K., Karchin,R., Draper,J., Casper,J., Mandel-Gutfreund,Y., Diekhans,M., and Hughey,R., "Combining local-structure, fold-recognition, and new-fold methods for protein structure prediction", Proteins: Structure, Function, and Genetics, (53)S6, 491-496, 15 Oct 2003.
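The first small-sample correction described in this abstract (randomize the contingency table with fixed marginal sums, then subtract the mean of the resulting mutual information histogram) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the number of shuffles, and the toy columns are assumptions made for illustration. Shuffling one column against the other is used here as one simple way to randomize the table while keeping both marginals fixed. The gamma-distribution fit and e-value are not shown.

```python
import math
import random
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two aligned columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * math.log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def corrected_mutual_information(col_a, col_b, n_shuffles=200, seed=0):
    """Return (corrected MI, raw MI, null MI values).

    Hypothetical helper: permuting col_b leaves the residue counts of both
    columns (the marginal sums) unchanged while randomizing the pairing.
    n_shuffles and seed are illustrative choices, not values from the abstract.
    """
    rng = random.Random(seed)
    raw = mutual_information(col_a, col_b)
    shuffled = list(col_b)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(mutual_information(col_a, shuffled))
    # Subtract the mean of the randomized histogram from the raw value.
    return raw - sum(null) / len(null), raw, null


if __name__ == "__main__":
    # Toy columns from an eight-sequence alignment (made-up data).
    corrected, raw, _ = corrected_mutual_information("AAAACCCC", "DDDDEEEE")
    print(f"raw MI = {raw:.3f} bits, corrected MI = {corrected:.3f} bits")
```

With so few sequences the randomized tables themselves carry noticeable mutual information, which is the small-sample bias that the subtraction removes.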

Name Emil Mittag
E-mail mittag@zbh.uni-hamburg.de
Institution Center for Bioinformatics, University of Hamburg
Title No abstract
Abstract No abstract

Name Daisuke Katagiri
E-mail k-dai@graduate.chiba-u.jp
Institution Graduate School of Pharmaceutical Science, Chiba University
Title Ab initio Protein Structure Prediction
Abstract Our protein structure predictions were carried out for 24 targets of 130 residues or fewer, including two canceled targets. All structure predictions started from a fully extended polypeptide chain as the initial configuration. A temperature ramp was first used to raise the temperature of the whole system rapidly, and a cooling simulation was performed after this heating step. During cooling, a temperature ramp was used to lower the temperature gradually. All molecular dynamics simulations were performed with a 1.0 fs time step, no cutoff for Lennard-Jones interactions, and SHAKE [1] to constrain all covalent bonds involving hydrogen, using a modified version of the AMBER 7 [2] suite of programs. We employed a force field of our own development, in which each of the 20 amino acids has its own parameter set. The parameters for the 20 amino acids were generated by a force field parameterization technique that we developed, which requires quantum chemical calculations. Accordingly, the structures of the amino acids were optimized with the Gaussian 98 [3] program using the density functional method [4] (B3LYP) with the 6-31G** basis set before the respective force field parameters were generated. Solvent effects were incorporated using the Generalized Born model [5], as implemented in AMBER 7. Our prediction accuracy was high for helix regions but low for β-sheet regions. β-sheet regions tended to be predicted as amorphous structure, and turn regions between β-sheet segments tended to be predicted as helix. Moreover, the prediction accuracy was low in regions containing many polar amino acid side chains such as Arg, Asp, Glu and Lys. These low accuracies quite likely arise because the original force field was parameterized in vacuo. As a matter of fact, structural differences between water and vacuum have been observed [6].
Therefore, the force field will need to be parameterized in water to achieve higher prediction accuracy. For some mini-proteins, our prediction accuracy was high for both the helix and the β-sheet regions. 1. Ryckaert,J.P., Ciccotti,G., Berendsen,H.J.C. (1977) Numerical integration of the cartesian equations of motion of a system with constraints: Molecular dynamics of n-alkanes. J. Comput. Phys. 23, 327-341. 2. Case,D.A., Pearlman,D.A., Caldwell,J.W., Cheatham III,T.E., Wang,J., Ross,W.S., Simmerling,C.L., Darden,T.S., Merz,K.M., Stanton,R.V., Cheng,A.L., Vincent,J.J., Crowley,M., Tsui,V., Gohlke,H., Radmer,R.J., Duan,Y., Pitera,J., Massova,I., Seibel,G.L., Singh,U.C., Weiner,P.K., Kollman,P.A. (2002) AMBER 7, University of California, San Francisco. 3. Frisch,M.J., Trucks,G.W., Schlegel,H.B., Scuseria,G.E., Robb,M.A., Cheeseman,J.R., Zakrzewski,V.G., Montgomery,J.A.,Jr., Stratmann,R.E., Burant,J.C., Dapprich,S., Millam,J.M., Daniels,A.D., Kudin,K.N., Strain,M.C., Farkas,O., Tomasi,J., Barone,V., Cossi,M., Cammi,R., Mennucci,B., Pomelli,C., Adamo,C., Clifford,S., Ochterski,J., Petersson,G.A., Ayala,P.Y., Cui,Q., Morokuma,K., Malick,D.K., Rabuck,A.D., Raghavachari,K., Foresman,J.B., Cioslowski,J., Ortiz,J.V., Baboul,A.G., Stefanov,B.B., Liu,G., Liashenko,A., Piskorz,P., Komaromi,I., Gomperts,R., Martin,R.L., Fox,D.J., Keith,T., Al-Laham,M.A., Peng,C.Y., Nanayakkara,A., Gonzalez,C., Challacombe,M., Gill,P.M.W., Johnson,B., Chen,W., Wong,M.W., Andres,J.L., Gonzalez,C., Head-Gordon,M., Replogle,E.S., Pople,J.A. (1998) Gaussian98, revision A.7; Gaussian,Inc.: Pittsburgh,PA. 4. Becke,A.D. (1993) Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648. 5. Tsui,V., Case,D.A. (2001) Theory and applications of the generalized Born solvation model in macromolecular simulations. Biopolymers (Nucl. Acid. Sci.) 56, 275-291. 6. Wang,Z.X., Duan,Y. (2004) Solvation Effects on Alanine Dipeptide: A MP2/cc-pVTZ//MP2/6-31G** Study of (Φ,Ψ) Energy Maps and Conformers in the Gas Phase, Ether, and Water. J. Comput. Chem. 25, 1699-1716.

Name Hidenori Ishikawa
E-mail hideishi@graduate.chiba-u.jp
Institution Graduate School of Pharmaceutical Science, Chiba University
Title No abstract
Abstract No abstract

Name Peter Hildebrand
E-mail peter.hildebrand@charite.de
Institution Charité Universitätsmedizin Berlin
Title Travelling the structural world of membrane proteins: interesting insights and views
Abstract Protein helices spanning biological membranes have five portions: two terminal parts outside the membrane, two boundary regions flanked by lipid head groups, and the core part surrounded by the hydrophobic tails of the fatty acids. These helices differ markedly from helices in globular protein domains in amino acid composition and geometry. However, in most PDB structures the exact position of the helices relative to the membrane is still unknown. Here we present structural and sequence-based criteria to determine the position of these helices within the lipid bilayer. The structural particularities reported here are relevant for the three-dimensional modelling of membrane protein structures.

Name Hyungrae Kim
E-mail hrkim@kias.re.kr
Institution Korea Institute for Advanced Study
Title No abstract
Abstract No abstract

Name Martina Koeva
E-mail martina@soe.ucsc.edu
Institution University of California, Santa Cruz
Title Human Interaction with Undertaker for the Structure Prediction of Targets T0212 and T0198
Abstract We present two examples, T0212 and T0198, of human intervention in the protein structure prediction process through our interaction with the "undertaker" program. Preliminary analysis of the results for these two targets allows us to gain some understanding of the abilities and limitations of the program, as well as to assess the value added by human intervention to the quality of the predictions. Target T0212 consisted of approximately 126 residues and was annotated as protein SOR45 from S. oneidensis. We used a fully automated method, which involved SAM-T04, SAM-T2K and undertaker [1], to generate an initial 3D model for this target. Our initial alignment results did not suggest any obvious templates (for comparative modeling) or folds (for fold recognition). Based on the structural neighbors of our initial models and some of the sequence alignments, we decided to pursue a jelly-roll-like topology. Our secondary structure predictions suggested that if we modeled T0212 as a jelly-roll, our models would have either an extra strand or a missing strand. We used undertaker to pursue both possibilities. Comparison of our results with the correct structure (PDB: 1tza) indicates that our top submitted model, which represented the equivalence class of the "jelly-roll with a missing strand" models, scored the best of all our submitted models, with a GDT score of 30.645%. Target T0198, which corresponded to protein 1170B from Thermotoga maritima, had a sequence of 235 amino acids. The initial sequence alignments and secondary structure predictions suggested a helical up-and-down bundle fold, which our initial 3D model generated by undertaker did not reflect. We decided to pursue two different possible folds: an alpha-helical sandwich, based on some of the structural neighbors of T0198, and a helical bundle. We did not manage to use undertaker to bundle the predicted helices successfully.
We could not find a bundling pattern that allowed us to make undertaker pack the helices tightly against each other while exhibiting the appropriate exposure/burial patterns. Undertaker seemed to favor mostly alpha-helical sandwich models. Comparison of our submitted models with the correct structure (PDB: 1sum) showed poor results and little improvement over the initial automatically generated model. Our best model, which was not submitted, had a GDT score of 21% and an RMSD of 12.54 Å. 1. Karplus,K., Karchin,R., Draper,J., Casper,J., Mandel-Gutfreund,Y., Diekhans,M., and Hughey,R., "Combining local-structure, fold-recognition, and new-fold methods for protein structure prediction", Proteins: Structure, Function, and Genetics, (53)S6, 491-496, 15 Oct 2003.

Name Tina Lai
E-mail lai@zbh.uni-hamburg.de
Institution University of Hamburg
Title No abstract
Abstract No abstract

Name Domenico Cozzetto
E-mail domenico.cozzetto@uniroma1.it
Institution Biocomputing - Dept. Biochemical Sciences "A. Rossi Fanelli" - University of Rome "La Sapienza"
Title No abstract
Abstract No abstract
