Discussion Papers
Structure and Function Prediction
-
Devos D, Valencia A (2000). Practical limits of function prediction. Proteins.
41(1):98-107.
-
Tatusov RL, Koonin EV, Lipman DJ (1997). A genomic perspective on protein
families. Science. 278(5338):631-7.
-
Kunin, V., Cases, I., Enright, A.J., de Lorenzo, V. and Ouzounis, C.A. (2003)
Myriads of Protein Families, and Still Counting. Genome Biology 4(2):401.
-
Iyer LM, Aravind L, Bork P, Hofmann K, Mushegian AR, Zhulin IB, Koonin EV. (2001).
Quod erat demonstrandum? The mystery of experimental validation of apparently erroneous computational analyses of protein sequences.
Genome Biol. 2.
-
Contreras-Moreira B, Ezkurdia I, Tress ML, Valencia A. (2005). Empirical limits for template-based protein structure prediction: the CASP5 example.
FEBS Letters, 579, 1203.
-
Cozzetto D and Tramontano A (2005). Relationship Between Multiple Sequence Alignments and Quality of Protein Comparative Models. Proteins 58, 151.
Biological Examples
-
Faguy M (2003). Lateral Gene Transfer (LGT) between Archaea and Escherichia
coli is a contributor to the emergence of novel infectious disease. BMC
Infectious Diseases 3:13.
-
Gophna U, Charlebois RL, Doolittle WF (2004). Have archaeal genes contributed to bacterial virulence?
Trends Microbiol. 12:213.
-
Zheng M, Ginalski K, Rychlewski L, Grishin NV (2005). Protein domain of unknown function DUF1023 is an alpha/beta hydrolase.
Proteins. 59:1.
-
Aravind L, Koonin EV (1999). Novel Predicted RNA-Binding Domains Associated
with the Translation Machinery. J. Mol. Evol. 48:291.
-
Tong Liu, Ana Rojas, Yuzhen Ye and Adam Godzik (2003). Homology modeling provides insights into the binding mode of the PAAD/DAPIN/pyrin domain, a fourth member of the CARD/DD/DED domain family. Protein Science, 12, 1872.
-
Bork, P., C. Sander, A. Valencia (1992). An ATPase domain common to prokaryotic
cell cycle proteins, sugar kinases, actin, and hsp70 heat shock proteins.
Proc. Natl. Acad. Sci. USA 89:7290-729.
-
Huynen M.A., Snel B., Bork P., Gibson T.J. (2001). The phylogenetic distribution
of frataxin indicates a role in iron-sulfur cluster protein assembly. Hum
Mol Genet. 2001 Oct 1;10(21):2463-8.
General Background Reading Papers
Introduction
-
Bork P, Koonin EV. Predicting functions from protein sequences--where
are the bottlenecks? Nat Genet. 1998 Apr;18(4):313-8. Review.
-
General.
-
Attwood TK. Genomics. The Babel of bioinformatics. Science. 2000 Oct 20;290(5491):471-3.
Sequence alignments, motifs and profiles. Protein domains.
Sequence alignments
-
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment
search tool. J Mol Biol. 1990 Oct 5;215(3):403-10.
-
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman
DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs. Nucleic Acids Res. 1997 Sep 1;25(17):3389-402. Review.
Motifs
-
Sigrist CJ, Cerutti L, Hulo N, Gattiker A, Falquet L, Pagni M, Bairoch
A, Bucher P. PROSITE: a documented database using patterns and profiles
as motif descriptors. Brief Bioinform. 2002 Sep;3(3):265-74.
-
Falquet L, Pagni M, Bucher P, Hulo N, Sigrist CJ, Hofmann K, Bairoch A.
The PROSITE database, its status in 2002. Nucleic Acids Res. 2002 Jan 1;30(1):235-8.
Profiles and HMMs
-
Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones
S, Howe KL, Marshall M, Sonnhammer EL. The Pfam protein families database.
Nucleic Acids Res. 2002 Jan 1;30(1):276-80.
-
Eddy SR. Profile hidden Markov models. Bioinformatics. 1998;14(9):755-63.
Review.
Sequence similarity based function prediction
Homology, orthology and paralogy
Limits and errors in function prediction.
-
Devos D, Valencia A. Practical limits of function prediction. Proteins.
2000 Oct 1;41(1):98-107.
-
Doerks T, Bairoch A, Bork P. Protein annotation: detective work for function
prediction. Trends Genet. 1998 Jun;14(6):248-50. Review.
Clustering
-
Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein
families. Science. 1997 Oct 24;278(5338):631-7. Review.
-
Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool
for genome-scale analysis of protein functions and evolution. Nucleic Acids
Res. 2000 Jan 1;28(1):33-6.
-
Yona G, Linial N, Linial M. ProtoMap: automatic classification of protein
sequences, a hierarchy of protein families, and local maps of the protein
space. Proteins. 1999 Nov 15;37(3):360-78.
Genomic strategies for function prediction not based on sequence similarity.
Protein features
-
Jensen LJ, Gupta R, Blom N, Devos D, Tamames J, Kesmir C, Nielsen H, Staerfeldt
HH, Rapacki K, Workman C, Andersen CA, Knudsen S, Krogh A, Valencia A,
Brunak S. Prediction of human protein function from post-translational
modifications and localization features. J Mol Biol. 2002 Jun 21;319(5):1257-65.
Structure to function
-
Skolnick J, Fetrow JS, Kolinski A. Structural genomics and its importance
for gene function analysis. Nat Biotechnol. 2000 Mar;18(3):283-7. Review.
-
Eisenstein E, Gilliland GL, Herzberg O, Moult J, Orban J, Poljak RJ, Banerjei
L, Richardson D, Howard AJ. Biological function made crystal clear - annotation
of hypothetical proteins via structural genomics. Curr Opin Biotechnol.
2000 Feb;11(1):25-30. Review.
Comparative sequence genomics
-
von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B. STRING:
a database of predicted functional associations between proteins. Nucleic
Acids Res. 2003 Jan 1;31(1):258-61.
-
von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P.
Comparative assessment of large-scale data sets of protein-protein interactions.
Nature. 2002 May 23;417(6887):399-403.
Proteomics
-
Vazquez A, Flammini A, Maritan A, Vespignani A. Global protein function
prediction from protein-protein interaction networks. Nat Biotechnol. 2003
Jun;21(6):697-700.
1D Features
-
Rost, B. (1996). PHD: predicting one-dimensional protein structure by profile
based neural networks. Meth. Enzymol., 266, 525-539.
-
Sonnhammer, E. L. L., von Heijne, G. & Krogh, A. (1998). A hidden Markov
model for predicting transmembrane helices in protein sequences. In Sixth
International Conference on Intelligent Systems for Molecular Biology (ISMB98),
pp. 175-182.
Fold Recognition
-
Bowie JU, Luthy R, Eisenberg D. 1991 A method to identify protein sequences
that fold into a known three-dimensional structure. Science. 1991 Jul 12;253(5016):164-70.
-
Bork, P., C. Sander, A. Valencia 1992. An ATPase domain common to prokaryotic
cell cycle proteins, sugar kinases, actin, and hsp70 heat shock proteins.
Proc. Natl. Acad. Sci. USA 89:7290-729.
-
Pcons: A neural network based consensus predictor that improves fold recognition.
Jesper Lundström, Leszek Rychlewski, Janusz Bujnicki and Arne Elofsson,
2001 Protein Science Nov;10(11):2354-62.
3D Structures
-
Chothia C. and Lesk A. (1986). The relation between the divergence of sequence and structure in proteins. EMBO J 5:823-826.
-
Liisa Holm and Chris Sander (1996) Mapping the Protein Universe, Science,
273 (5275) p595
-
A.N. Lupas, C.P. Ponting, R.B. Russell On the Evolution of Protein Folds:
Are similar motifs in different protein folds the result of convergence,
insertion or relics of an ancient peptide world? J. Struct. Biol., 134, 191-203,
2001.
-
Coulson, A.F.W. and Moult J. (2002) A Unifold, Mesofold, and Superfold Model
of Protein Fold Use. PROTEINS: Structure, Function and Genetics 46:61-71
Homology Modelling
-
A. Sali, L. Potterton, F. Yuan, H. van Vlijmen, M. Karplus. Evaluation of
comparative protein structure modeling by MODELLER. Proteins 23, 318-326, 1995.
-
Baker, D., and Sali, A. (2001). Protein structure prediction and structural
genomics. Science, Vol 294 No. 5540 pp. 93-6.