Secondary Structure Prediction and 1D Characteristics

Michael Tress (Centro Nacional de Biotecnología, Madrid).
 

1D characteristics are properties in which each amino acid is associated with a single value. For secondary structure the values are H (helix), E (strand), L (loop), ...; 1D characteristics also exist for accessibility (buried or exposed), for hydrophobicity, etc. They are very useful in 3D structure prediction.
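As an illustration of "one value per residue", the sketch below turns a sequence into a hydrophobicity profile using the Kyte-Doolittle hydropathy scale. The function name hydropathy_profile is made up for this example, and the choice of scale is an assumption; any per-residue scale would serve equally well.

```python
# Kyte-Doolittle hydropathy values, one per standard amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence):
    """Return one hydropathy value per residue (a 1D characteristic).

    Hypothetical helper for illustration; raises KeyError on
    non-standard residue letters.
    """
    return [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]


# First ten residues of the F93 sequence from the exercises.
seq = "MKIRKYMRIN"
profile = hydropathy_profile(seq)
for aa, value in zip(seq, profile):
    print(aa, value)
```

The profile has exactly the same length as the sequence, which is what makes it a 1D characteristic; a secondary structure string (H/E/L) or an accessibility string (e/b) has the same shape.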
 
 

AA:     Sequence residues
OBSsec: Observed secondary structure (E: sheet, H: helix)
OBSacc: Observed accessibility (e: exposed, b: buried)
PHDsec: Predicted secondary structure
PHDacc: Predicted accessibility

Amino Acid Properties

Servers and Programs

PredictProtein: PHDsec predicts secondary structure from multiple alignments using neural networks. Example [Output]
JPred: a server that takes a protein sequence or a multiple alignment and predicts secondary structure. It works by combining the results of various methods to generate a consensus. Example [Output]
PsiPred: PSIPRED uses neural networks that analyse PSI-BLAST output. Version 4 takes into account the results of 4 independent neural networks to improve reliability. Example [Output]
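The idea of building a consensus from several predictions can be sketched as a column-by-column majority vote. This is only a minimal illustration, not the actual algorithm used by JPred or any other server; the function consensus, the tie rule (ties become L) and the three prediction strings are all invented for the example.

```python
from collections import Counter


def consensus(predictions, tie_symbol="L"):
    """Column-wise majority vote over equal-length H/E/L strings.

    Hypothetical helper: ties between the leading states are
    resolved to tie_symbol, which is an arbitrary choice.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top_state, top_count = counts[0]
        if len(counts) > 1 and counts[1][1] == top_count:
            result.append(tie_symbol)  # no clear winner in this column
        else:
            result.append(top_state)
    return "".join(result)


# Made-up predictions from three imaginary methods.
preds = ["HHHHLLEEE", "HHHLLLEEL", "LHHHLEEEE"]
print(consensus(preds))  # prints HHHHLLEEE
```

Real consensus methods usually weight each method by its reliability rather than counting votes equally.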

Exercises

Find Secondary Structure Characteristics

1. Send the sequences to various servers (PHD, JPred, PsiPred, SSPro, SAM-T99, Chou-Fasman). Compare the results.

Then send the sequences to the signal peptide server (SignalP).

Send interesting-looking sequences to disorder prediction and low-complexity servers: DisEMBL (try CAST option), PONDR (you need to register), DisoPRED, DRIPPRED, RONN.

Optional: generate a multiple alignment and try sending that to JPred. Compare with the results you get when you send only the sequence.
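One simple way to compare the results of two servers is to measure how often they assign the same state to the same residue. The sketch below assumes each prediction has already been copied into a plain string of H/E/L characters of the same length as the sequence; the function names and the two example strings are invented for illustration.

```python
def agreement(pred_a, pred_b):
    """Fraction of positions where two predictions give the same state."""
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must have the same length")
    if not pred_a:
        raise ValueError("predictions must not be empty")
    matches = sum(a == b for a, b in zip(pred_a, pred_b))
    return matches / len(pred_a)


def disagreements(pred_a, pred_b):
    """List (position, state_a, state_b) where predictions differ (1-based)."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(pred_a, pred_b), start=1)
        if a != b
    ]


# Made-up predictions for a nine-residue fragment.
server_1 = "HHHHLLEEE"
server_2 = "HHHLLLEEL"
print(f"agreement: {agreement(server_1, server_2):.2f}")
print(disagreements(server_1, server_2))
```

Disagreements often cluster at the ends of helices and strands, so the list of differing positions is usually more informative than the single agreement figure.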

>RPE_YEAST
MVKPIIAPSI LASDFANLGC ECHKVINAGA DWLHIDVMDG HFVPNITLGQ PIVTSLRRSV
PRPGDASNTE KKPTAFFDCH MMVENPEKWV DDFAKCGADQ FTFHYEATQD PLHLVKLIKS
KGIKAACAIK PGTSVDVLFE LAPHLDMALV MTVEPGFGGQ KFMEDMMPKV ETLRAKFPHL
NIQVDGGLGK ETIPKAAKAG ANVIVAGTSV FTAADPHDVI SFMKEEVSKE LRSRDLLD

>F93
MKIRKYMRINYYIILKVLVINGSRLEKKRLRSEILKRFDIDISDGVLYPLIDSLIDDKIL
REEEAPDGKVLFLTEKGMKEFEELHEFFKKIVC

>BclA
MAFDPNLVGPTLPPIPPFTLPTGPTGPTGPTGPTGPTGPTGPTGDTGTTGPTGPTGPTGP
TGPTGATGLTGPTGPTGPSGLGLPAGLYAFNSGGISLDLGINDPVPFNTVGSQFGTAISQ
LDADTFVISETGFYKITVIANTATASVLGGLTIQVNGVPVPGTGSSLISLGAPIVIQAIT
QITTTPSLVEVIVTGLGLSLALGTSASIIIEKVAHHHHHH

>Rpf (resuscitation-promoting factor)
MTLFTTSATRSRRATASIVAGMTLAGAAAVGFSAPAQAATVDTWDRLAECESNGTWDINTGNGFYGGVQFTLSSWQAVGG
EGYPHQASKAEQIKRAEILQDLQGWGAWPLCSQKLGLTQADADAGDVDATEAAPVAVERTATVQRQSAADEAAAEQAAAA
EQAVVAEAETIVVKSGDSLWTLANEYEVEGGWTALYEANKGAVSDAAVIYVGQELVLPQA

>sw|P35833|CSF3_BOVIN Granulocyte colony-stimulating factor precursor (G-CSF).
MKLMVLQLLLWHSALWTVHEATPLGPARSLPQSFLLKCLEQVRKIQADGAELQERLCAAH
KLCHPEELMLLRHSLGIPQAPLSSCSSQSLQLTSCLNQLHGGLFLYQGLLQALAGISPEL
APTLDTLQLDVTDFATNIWLQMEDLGAAPAVQPTQGAMPTFTSAFQRRAGGVLVASQLHR
FLELAYRGLRYLAEP

>ICE9_HUMAN
MDEADRRLLR RCRLRLVEEL QVDQLWDALL SRELFRPHMI EDIQRAGSGS RRDQARQLII
DLETRGSQAL PLFISCLEDT GQDMLASFLR TNRQAAKLSK PTLENLTPVV LRPEIRKPEV
LRPETPRPVD IGSGGFGDVG ALESLRGNAD LAYILSMEPC GHCLIINNVN FCRESGLRTR
TGSNIDCEKL RRRFSSLHFM VEVKGDLTAK KMVLALLELA QQDHGALDCC VVVILSHGCQ
SWYVETLDDI FEQWAHSEDL QSLLLRVANA VSVKGIYKQM PGCFNFLRKK LFFKTS
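Some of the sequences above are printed in blocks of ten residues separated by spaces. Whether a given server accepts that layout is not stated here, so it may be safer to strip the whitespace before submitting. The parser below is a minimal sketch with a made-up name (parse_fasta); it keeps the header text as the key, so two records with the same header would overwrite each other.

```python
def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, dropping all whitespace.

    Hypothetical helper for illustration only.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Remove the spaces between ten-residue blocks.
            records[header].append("".join(line.split()))
    return {h: "".join(parts) for h, parts in records.items()}


# First line of the RPE_YEAST record, as printed above.
example = """>RPE_YEAST
MVKPIIAPSI LASDFANLGC ECHKVINAGA DWLHIDVMDG HFVPNITLGQ PIVTSLRRSV
"""
sequences = parse_fasta(example)
print(sequences["RPE_YEAST"])
```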

2. Send these sequences to one of these transmembrane prediction servers (TMHMM, PHD-TM, THUMB-UP).

>2_636 AA
MEGPAFSKPL KDKINPWGPL IILGILIRAG VSVQHDSPHQ VFNVTWRVTN LMTGQTANVT SLLGTMTDAF
PKLYFDLCDL IGDDWDETGL GCRTPGGRKR ARTFDFYVCP GHTVPTGCGG PREGYCGKWG CETTGQAYWK
PSSSWDLISL KRGNTPRNQG PCYDSSAVSS NIKGATPGGR CNPLVLEFTD AGKKASWDGP KVWGLRLYRS
TGIDPVTRFS LTRQVLNIGP RVSIGPNPVI TDQLPPSRPV QIMLPRPPQP PPPGAASIVP ETAPPSQQPG
TGDRLLNLVD GAYRALNLTS PDKTQECWLC LVAGPPYYEG VAILGTYSNH TSAPANCSVA SQHKLTLSEV
TGQGLCVGAV PKTHQALCNT TQTSSRGSYY LVAPTGTMWA CSTGLTPCIS TTILNLTTDY CVLVELWPRV
TYHSPSYVYG LFERSNRHKR EPVSLTLALL LGGLTMGGIA AGIGTGTTAL MATQQFQQLQ AAVQDDLREV
EKSISNLEKS LTSLSEVVLQ NRRGLDLLFL KEGGLCAALK EECCFYADHT GLVRDSMAKL RERLNQRQKL
FESTQGWFEG LFNRSPWFTT LISTIMGPLI VLLMILLFGP CILNRLVQFV KDRISVVQAL VLTQQYHQLK
PIEYEP

>Rpf (resuscitation-promoting factor)
MTLFTTSATRSRRATASIVAGMTLAGAAAVGFSAPAQAATVDTWDRLAECESNGTWDINTGNGFYGGVQFTLSSWQAVGG
EGYPHQASKAEQIKRAEILQDLQGWGAWPLCSQKLGLTQADADAGDVDATEAAPVAVERTATVQRQSAADEAAAEQAAAA
EQAVVAEAETIVVKSGDSLWTLANEYEVEGGWTALYEANKGAVSDAAVIYVGQELVLPQA

>O06005|AAPA_BACSU

PHD_TM output
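To get a feel for what transmembrane predictors look for, a classic hydropathy plot can be sketched by averaging Kyte-Doolittle values over a sliding window. This is not how TMHMM, PHD-TM or THUMB-UP work internally; it is a much simpler technique shown only for intuition. The window of 19 residues and the threshold of 1.6 are the values commonly quoted for this scale and should be treated as assumptions, as should the function names.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter residue code.
KD = dict(zip(
    "ARNDCQEGHILKMFPSTWYV",
    [1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
     3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2],
))


def window_means(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    values = [KD[aa] for aa in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_segments(sequence, window=19, threshold=1.6):
    """Merge overlapping or adjacent windows whose mean reaches threshold.

    Returns 1-based inclusive (start, end) pairs. Hypothetical helper.
    """
    segments = []
    for i, mean in enumerate(window_means(sequence, window)):
        if mean < threshold:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], end)  # extend previous segment
        else:
            segments.append((start, end))
    return segments


# Synthetic test case: 19 leucines flanked by aspartates.
toy = "D" * 10 + "L" * 19 + "D" * 10
print(candidate_segments(toy))  # prints [(6, 34)]
```

The reported segment is wider than the 19 leucines because windows that only partly overlap the hydrophobic stretch can still exceed the threshold; dedicated servers model helix boundaries much more carefully.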

 

3. Send the sequences to the glycosylation and phosphorylation predictors (NetOGlyc, NetPhos).

>3_41 AA
ASYDGHKLVAGYDFTPPSTPSTDDPNVCREYSYKLGTYGAP

NetOGlyc output

>4_153 AA
ASQKRPSQRHGSKYLATASTMDHARHGFLPRHRDTGILDSIGRFFGGDRGAPKNMYKDSHHPARTAHYGSLPQKSHGRTQ
DENPVVHFFKNIVTPRTPPPSQGKGRKSAHKGFKGVDAQGTLSKIFKLGGRDSRSGSPKPELVISALIVESRR

NetPhos output
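Before looking at the server output it can help to list which residues could be modified at all: phosphorylation targets serine, threonine and tyrosine, and O-glycosylation targets serine and threonine. The sketch below only enumerates those candidate positions; it makes no prediction, which is the job of NetPhos and NetOGlyc. The function name candidate_sites is made up for the example.

```python
def candidate_sites(sequence, residues):
    """Return (position, residue) pairs for the given residue types (1-based)."""
    return [
        (i, aa)
        for i, aa in enumerate(sequence.upper(), start=1)
        if aa in residues
    ]


PHOSPHO_RESIDUES = "STY"  # Ser, Thr, Tyr
OGLYC_RESIDUES = "ST"     # Ser, Thr

# The 41-residue sequence 3_41 from this exercise.
seq_3_41 = "ASYDGHKLVAGYDFTPPSTPSTDDPNVCREYSYKLGTYGAP"

print("possible phosphorylation sites:", candidate_sites(seq_3_41, PHOSPHO_RESIDUES))
print("possible O-glycosylation sites:", candidate_sites(seq_3_41, OGLYC_RESIDUES))
```

Comparing this list with the server output shows how few of the possible residues are actually predicted to be modified.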

 
