From phd@dodo.cpmc.columbia.edu Thu Feb 3 09:41:03 2000
Received: from jota.cnb.uam.es (jota.cnb.uam.es [150.244.12.4] (may be forged))
    by gredos.cnb.uam.es (8.8.8/8.8.7) with ESMTP id JAA16837 for;
    Thu, 3 Feb 2000 09:41:02 GMT
Received: from dodo.cpmc.columbia.edu (dodo.cpmc.columbia.edu [156.111.190.78])
    by jota.cnb.uam.es (8.8.7/8.8.7) with ESMTP id JAA19043 for ;
    Thu, 3 Feb 2000 09:40:01 GMT
Received: (from phd@localhost)
    by dodo.cpmc.columbia.edu (980427.SGI.8.8.8/980728.SGI.AUTOCF) id EAA12549
    for pazos@gredos.cnb.uam.es; Thu, 3 Feb 2000 04:34:20 -0500 (EST)
Date: Thu, 3 Feb 2000 04:34:20 -0500 (EST)
From: phd@dodo.cpmc.columbia.edu (PredictProtein)
Message-Id: <200002030934.EAA12549@dodo.cpmc.columbia.edu>
To: pazos@gredos.cnb.uam.es
Subject: PredictProtein
X-Mozilla-Status: 0001
Content-Length: 120579

The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

reference predict_h6314640 (Feb 3, 2000 04:29:42)
PPhdr from: pazos@gredos.cnb.uam.es
PPhdr resp: MAIL
PPhdr orig: HTML
PPhdr want: ASCII
PPhdr password(###)
prediction of: - threading (TOPITS)- return msf format
ret topits strip
# default: single protein sequence description=course
TTLSCKVTSV EAITDTVYRV RIVPDAAFSF RAGQYLMVVM DERDKRPFSM
ASTPDEKGFI ELHIGASEIN LYAKAVMDRI LKDHQIVVDI PHGEAWLRDD
EERPMILIAG GTGFSYARSI LLTALARNPN RDITIYWGGR EEQHLYDLCE
LEALSLKHPG LQVVPVVEQP EAGWRGRTGT VLTAVLQDHG TLAEHDIYIA
GRFEMAKIAR DLFCSERNAR EDRLFGDAFA FI
________________________________________________________________________________

Result of PROSITE search (Amos Bairoch):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
please quote: A Bairoch, P Bucher & K Hofmann: The PROSITE database,
its status in 1997. Nucl. Acids Res., 1997, 25, 217-221.
________________________________________________________________________________ -------------------------------------------------------- Pattern-ID: PKC_PHOSPHO_SITE PS00005 PDOC00005 Pattern-DE: Protein kinase C phosphorylation site Pattern: [ST].[RK] 4 SCK 29 SFR 155 SLK 215 SER Pattern-ID: CK2_PHOSPHO_SITE PS00006 PDOC00006 Pattern-DE: Casein kinase II phosphorylation site Pattern: [ST].{2}[DE] 8 TSVE 52 STPD 191 TLAE Pattern-ID: MYRISTYL PS00008 PDOC00008 Pattern-DE: N-myristoylation site Pattern: G[^EDRKHPFYW].{2}[STAGCN][^P] 111 GTGFSY 179 GTVLTA ________________________________________________________________________________ Result of ProDom domain search (Sonnhammer; Corpet, Gouzy, Kahn): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - please quote: ELL Sonnhammer & D Kahn, Prot. Sci., 1994, 3, 482-492 ________________________________________________________________________________ --- ------------------------------------------------------------ --- Results from running BLAST against PRODOM domains --- --- PLEASE quote: --- F Corpet, J Gouzy, D Kahn (1998). The ProDom database --- of protein domain families. Nucleic Ac Res 26:323-326. --- --- BEGIN of BLASTP output BLASTP 1.4.7 [16-Oct-94] [Build 12:52:03 Oct 30 1994] Reference: Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers, and David J. Lipman (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-10. Query= prot (#) ppOld, default: single protein sequence description=course /home/phd/server/work/predict_h6314640 (232 letters) Database: prodom_99_2 157,167 sequences; 18,560,502 total letters. Searching..................................................done Smallest Sum High Probability Sequences producing High-scoring Segment Pairs: Score P(N) N PD150097 p99.2 (9) UBIB(4) LUXG(3) O85219(1) // REDUCTA... 182 2.7e-37 3 PD023013 p99.2 (3) UBIB(2) P94148(1) // REDUCTASE NADPH... 154 5.8e-15 1 PD175909 p99.2 (1) O86363_MYCTU // HYPOTHETICAL 43.3 KD ... 
84 0.00086 1 PD000183 p99.2 (214) NIA(17) FENR(15) NCPR(13) // REDUC... 69 0.0032 2 PD152063 p99.2 (3) ASCD(1) RFBI(1) O31003(1) // CDP6DEO... 43 0.0078 2 PD109318 p99.2 (1) LUXG_VIBHA // PROBABLE FLAVIN REDUCTA... 63 0.066 1 >PD150097 p99.2 (9) UBIB(4) LUXG(3) O85219(1) // REDUCTASE OXIDOREDUCTASE FLAVOPROTEIN FAD LUMINESCENCE FLAVIN PROBABLE NADPHFLAVIN FERRISIDEROPHORE C Length = 106 Score = 182 (84.4 bits), Expect = 2.7e-37, Sum P(3) = 2.7e-37 Identities = 32/54 (59%), Positives = 43/54 (79%) Query: 1 TTLSCKVTSVEAITDTVYRVRIVPDAAFSFRAGQYLMVVMDERDKRPFSMASTP 54 TT++CKV +E +T +YRV + PD F F+AGQYLMVVM+E+DKRPFS+A+ P Sbjct: 1 TTINCKVEKIEPLTSNIYRVFLKPDQPFEFKAGQYLMVVMNEKDKRPFSIANCP 54 Score = 73 (33.9 bits), Expect = 2.7e-37, Sum P(3) = 2.7e-37 Identities = 11/18 (61%), Positives = 16/18 (88%) Query: 84 HQIVVDIPHGEAWLRDDE 101 ++I +D PHG+AWLRD+E Sbjct: 89 NEIEIDAPHGDAWLRDEE 106 Score = 69 (32.0 bits), Expect = 2.7e-37, Sum P(3) = 2.7e-37 Identities = 15/23 (65%), Positives = 16/23 (69%) Query: 59 FIELHIGASEINLYAKAVMDRIL 81 FIELHIG SE N YA VM+ L Sbjct: 60 FIELHIGGSEHNEYALEVMEHFL 82 >PD023013 p99.2 (3) UBIB(2) P94148(1) // REDUCTASE NADPHFLAVIN OXIDOREDUCTASE FAD FLAVOPROTEIN FLAVIN FERRISIDEROPHORE C NADP IRON Length = 31 Score = 154 (71.4 bits), Expect = 5.8e-15, P = 5.8e-15 Identities = 30/31 (96%), Positives = 30/31 (96%) Query: 202 RFEMAKIARDLFCSERNAREDRLFGDAFAFI 232 RFEMAKIARD FCSERNAREDRLFGDAFAFI Sbjct: 1 RFEMAKIARDAFCSERNAREDRLFGDAFAFI 31 >PD175909 p99.2 (1) O86363_MYCTU // HYPOTHETICAL 43.3 KD PROTEIN HYPOTHETICAL PROTEIN Length = 143 Score = 84 (39.0 bits), Expect = 0.00086, P = 0.00086 Identities = 19/64 (29%), Positives = 35/64 (54%) Query: 105 MILIAGGTGFSYARSILLTALARNPNRDITIYWGGREEQHLYDLCELEALSLKHPGLQVV 164 ++++AG TG + R++++ N + +++G R LYDL L ++ +P L V Sbjct: 4 VLMVAGSTGLAPLRALIIDLSRFAVNPRVHLFFGARYACELYDLPTLWQIAAHNPWLSVS 63 Query: 165 PVVE 168 PV E Sbjct: 64 PVSE 67 >PD000183 p99.2 (214) NIA(17) FENR(15) NCPR(13) // REDUCTASE OXIDOREDUCTASE FAD FLAVOPROTEIN 
NADP NITRATE HEME NAD ELECTRON MEMBRANE Length = 139 Score = 69 (32.0 bits), Expect = 0.0032, Sum P(2) = 0.0032 Identities = 14/20 (70%), Positives = 17/20 (85%) Query: 102 ERPMILIAGGTGFSYARSIL 121 ERP+I+IAGGTG + RSIL Sbjct: 1 ERPIIMIAGGTGIAPIRSIL 20 Score = 38 (17.6 bits), Expect = 0.0032, Sum P(2) = 0.0032 Identities = 7/23 (30%), Positives = 15/23 (65%) Query: 132 DITIYWGGREEQHLYDLCELEAL 154 ++ +++G R E+ +Y EL+ L Sbjct: 35 EVYLFYGCRNEEDIYLYEELDEL 57 >PD152063 p99.2 (3) ASCD(1) RFBI(1) O31003(1) // CDP6DEOXYDELTA3 4GLUCOSEEN REDUCTASE E3 OXIDOREDUCTASE ELECTRON TRANSPORT IRONSULFUR NAD RFBI Length = 53 Score = 43 (19.9 bits), Expect = 0.0078, Sum P(2) = 0.0078 Identities = 8/20 (40%), Positives = 12/20 (60%) Query: 2 TLSCKVTSVEAITDTVYRVR 21 T+ CKV S E +T + +R Sbjct: 16 TIPCKVASFEFVTKDIVSLR 35 Score = 39 (18.1 bits), Expect = 0.0078, Sum P(2) = 0.0078 Identities = 7/18 (38%), Positives = 10/18 (55%) Query: 19 RVRIVPDAAFSFRAGQYL 36 R R P F++ GQY+ Sbjct: 35 RFRFPPTTKFNYLPGQYI 52 >PD109318 p99.2 (1) LUXG_VIBHA // PROBABLE FLAVIN REDUCTASE EC 1... 
LUMINESCENCE OXIDOREDUCTASE FLAVOPROTEIN FAD Length = 22 Score = 63 (29.2 bits), Expect = 0.069, P = 0.066 Identities = 10/22 (45%), Positives = 16/22 (72%) Query: 211 DLFCSERNAREDRLFGDAFAFI 232 D FC +R A ++L+ DAFA++ Sbjct: 1 DWFCDKRGAEPEQLYADAFAYL 22 Parameters: E=0.1 B=500 V=500 -ctxfactor=1.00 Query ----- As Used ----- ----- Computed ---- Frame MatID Matrix name Lambda K H Lambda K H +0 0 BLOSUM62 0.321 0.139 0.407 same same same Query Frame MatID Length Eff.Length E S W T X E2 S2 +0 0 232 232 0.10 72 3 11 22 0.19 33 Statistics: Query Expected Observed HSPs HSPs Frame MatID High Score High Score Reportable Reported +0 0 62 (28.8 bits) 182 (84.4 bits) 10 10 Query Neighborhd Word Excluded Failed Successful Overlaps Frame MatID Words Hits Hits Extensions Extensions Excluded +0 0 5411 8213516 1724115 6479666 9733 2 Database: prodom_99_2 Release date: unknown Posted date: 10:12 PM EDT Jul 29, 1999 # of letters in database: 18,560,502 # of sequences in database: 157,167 # of database sequences satisfying E: 6 No. of states in DFA: 566 (56 KB) Total size of DFA: 115 KB (128 KB) Time to generate neighborhood: 0.01u 0.00s 0.01t Real: 00:00:00 Time to search database: 10.95u 0.04s 10.99t Real: 00:00:11 Total cpu time: 10.97u 0.06s 11.03t Real: 00:00:11 --- END of BLASTP output --- ------------------------------------------------------------ --- --- Again: these results were obtained based on the domain data- --- base collected by Daniel Kahn and his coworkers in Toulouse. --- --- PLEASE quote: --- F Corpet, J Gouzy, D Kahn (1998). The ProDom database --- of protein domain families. Nucleic Ac Res 26:323-326. 
--- --- The general WWW page is on: ---- --------------------------------------- --- http://www.toulouse.inra.fr/prodom.html ---- --------------------------------------- --- --- For WWW graphic interfaces to PRODOM, in particular for your --- protein family, follow the following links (each line is ONE --- single link for your protein!!): --- http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD150097 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD150097 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD150097 ==> graphical output of all proteins having domain PD150097 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD023013 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD023013 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD023013 ==> graphical output of all proteins having domain PD023013 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD175909 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD175909 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD175909 ==> graphical output of all proteins having domain PD175909 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD000183 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD000183 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD000183 ==> graphical output of all proteins having domain PD000183 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD152063 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD152063 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD152063 ==> graphical output of all proteins having domain PD152063 http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD109318 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD109318 
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD109318 ==> graphical output of all proteins having domain PD109318 --- --- NOTE: if you want to use the link, make sure the entire line --- is pasted as URL into your browser! --- --- END of PRODOM --- ------------------------------------------------------------ ________________________________________________________________________________ The alignment that has been used as input to the network is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ________________________________________________________________________________ --- --- Version of database searched for alignment: --- SWISS-PROT release 38.0 (7/99) with 80000 proteins --- --- ------------------------------------------------------------ --- MAXHOM multiple sequence alignment --- ------------------------------------------------------------ --- --- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY --- ID : identifier of aligned (homologous) protein --- STRID : PDB identifier (only for known structures) --- PIDE : percentage of pairwise sequence identity --- WSIM : percentage of weighted similarity --- LALI : number of residues aligned --- NGAP : number of insertions and deletions (indels) --- LGAP : number of residues in all indels --- LSEQ2 : length of aligned sequence --- ACCNUM : SwissProt accession number --- NAME : one-line description of aligned protein --- --- MAXHOM ALIGNMENT HEADER: SUMMARY ID STRID IDE WSIM LALI NGAP LGAP LEN2 ACCNUM NAME ubib_ecoli 100 100 232 0 0 232 P23486 NAD(P)H-FLAVIN REDUCTASE ubib_pholu 73 83 232 0 0 232 P43129 NAD(P)H-FLAVIN REDUCTASE ubib_vibor 54 69 158 2 5 164 P43128 NAD(P)H-FLAVIN REDUCTASE ubib_vibfi 53 68 229 3 6 236 P43126 NAD(P)H-FLAVIN REDUCTASE ubib_vibha 49 66 231 2 5 236 P43127 NAD(P)H-FLAVIN REDUCTASE luxg_vibfi 41 56 222 2 4 236 P24273 PROBABLE FLAVIN REDUCTASE luxg_vibha 39 55 229 2 5 233 P16447 PROBABLE FLAVIN REDUCTASE luxg_phole 37 55 228 2 4 234 P29237 PROBABLE FLAVIN 
REDUCTASE ascd_yerps 31 43 224 5 6 328 P37911 CDP-6-DEOXY-DELTA-3,4-GLU dmpp_psesp 30 39 215 4 33 352 P19734 P5 COMPONENT). ndor_psepu 28 37 221 2 2 328 Q52126 COMPONENT (EC 1.18.1.3). rfbi_salty 28 36 222 6 7 330 P26395 RFBI PROTEIN. benc_acica 25 27 225 6 11 348 P07771 FERREDOXIN; FERREDOXIN--N --- --- MAXHOM ALIGNMENT: IN MSF FORMAT MSF of: /home/phd/server/work/predict_h6314640.hsspFilter from: 1 to: 232 /home/phd/server/work/predict_h6314640.msfRet MSF: 232 Type: P 3-Feb-00 04:30:2 Check: 293 .. Name: predict_h6310 Len: 232 Check: 7806 Weight: 1.00 Name: ubib_ecoli Len: 232 Check: 7806 Weight: 1.00 Name: ubib_pholu Len: 232 Check: 2836 Weight: 1.00 Name: ubib_vibor Len: 232 Check: 4005 Weight: 1.00 Name: ubib_vibfi Len: 232 Check: 5465 Weight: 1.00 Name: ubib_vibha Len: 232 Check: 6148 Weight: 1.00 Name: luxg_vibfi Len: 232 Check: 5710 Weight: 1.00 Name: luxg_vibha Len: 232 Check: 4217 Weight: 1.00 Name: luxg_phole Len: 232 Check: 669 Weight: 1.00 Name: ascd_yerps Len: 232 Check: 3940 Weight: 1.00 Name: dmpp_psesp Len: 232 Check: 6071 Weight: 1.00 Name: ndor_psepu Len: 232 Check: 1435 Weight: 1.00 Name: rfbi_salty Len: 232 Check: 9478 Weight: 1.00 Name: benc_acica Len: 232 Check: 4707 Weight: 1.00 // 1 50 predict_h6310 TTLSCKVTSV EAITDTVYRV RIVPDAAFSF RAGQYLMVVM DERDKRPFSM ubib_ecoli TTLSCKVTSV EAITDTVYRV RIVPDAAFSF RAGQYLMVVM DERDKRPFSM ubib_pholu TTLSCKVTSV EAITDTVYRV RLLPDSPFLF RAGQYLMVVM DERDKRPFSM ubib_vibor .......... .......... .......... .......... 
.EKDKRPFSI ubib_vibfi ..INCKVKSI EPLACNTFRI LLHPEQPVAF KAGQYLTVVM GEKDKRPFSI ubib_vibha .TIQCKVKSI QPLACNTYQI LLHPESPVPF KAGQYLMVVM GEKDKRPFSI luxg_vibfi .........L ASIKNNIYKV FITVNSPIKF IAGQFVMVTI NG.KKCPFSI luxg_vibha ..MLCSIEKI EPLTSFIFRV LLKPDQPFEF RAGQYINVSL S.FGSLPFSI luxg_phole ..FNCKVKKV EASDSHIYKV FIKPDKCFDF KAGQYVIVYL NG.KNLPFSI ascd_yerps .TYPCKLDSI EFIGeaILSL RLPPTAKIQY LAGQYIDLII NG.QRRSYSI dmpp_psesp ......VSAL VDLSPTIKGL HIKLDRPMPF QAGQYVNLAL PGIdtRAFSL ndor_psepu ......VVAV ESPTHDIRRL RVRLSKPFEF SPGQYATLQF SPEHARPYSM rfbi_salty ..VPCKVNSA VLVSgmTLKL RTPPTAKIGF LPGQYINLHY KG.VTRSYSI benc_acica HHFEGTLARV ENLSDSTITF DIQLDddIHF LAGQYVNVTL PGteTRSYSF 51 100 predict_h6310 ASTPDEKGFI ELHIGASEIN LYAKAVMDRI LKDHQIVVDI PHGEAWLRDD ubib_ecoli ASTPDEKGFI ELHIGASEIN LYAKAVMDRI LKDHQIVVDI PHGEAWLRDD ubib_pholu ASTPSEKEFI ELHIGASELN LYAMAVMDRI LDQKVINIDI PHGKAWFRKS ubib_vibor ASSPCreGEL ELHIGAAEQN AYALEVVEAM kqDGEITIDA PHGDAWVQEE ubib_vibfi ASSPCreGEI ELHIGAAEHN AYAGEVVESM ktGGDILIDA PHGEAWIRED ubib_vibha ASSPCreGEL ELHIGAAEHN AYALEVVEAM qtDGHIEIDA PHGDAWVQEE luxg_vibfi ANCPTKNHEI ELHIGSSNKD CSLdyFVDAL VEEVAIELDA PHGNAWLRSE luxg_vibha ASCPSNGAFL ELHIGGSDIS KKNTLVMEEL TNSwmVEVSE ARGKAWLRDE luxg_phole ANCPTCNELL ELHVGGSVKE SAIEAISHFI NaqKEFTIDA PHGDAWLRDE ascd_yerps ANAPGGNGNI ELHVRKVVNG VFSNIIFNEL KLQQLLRIEG PQGTFFVRED dmpp_psesp ANPPSRNDEV ELHVRLVEGG AATGFIHKQL KVGDAVELSG PYGQFFVRDS ndor_psepu AGLPDDQE.M EFHIRKVPGG RVTEYVFEHV REGTSIKLSG PLGTAYLRQK rfbi_salty ANSDESNG.I ELHVRNVPNG QMSSLIFGEL QENTLMRIEG PCGTFFIRES benc_acica SSQPGntGFV VRNVPQGKMS EY...LSVQA KAGDKMSFTG PFGSFYLRDV 101 150 predict_h6310 EERPMILIAG GTGFSYARSI LLTALARNPN RDITIYWGGR EEQHLYDLCE ubib_ecoli EERPMILIAG GTGFSYARSI LLTALARNPN RDITIYWGGR EEQHLYDLCE ubib_pholu SANPLLLIAG GTGFSYTRSI LLTALEEQPK RHISMYWGGR ESQHLYDLAE ubib_vibor SERPLLLIAG GTGFSYVRSI LDHCVAQELK NDIHLYWGGR DECQLYAKSE ubib_vibfi SDRSMLLIAG GTGFSYVRSI LDHCISQQIQ KPIYLYWGGR DECQLYAKAE ubib_vibha SERPLLLIAG GTGFSYVRSI LDHCVAQNKT 
NPIYLYWGAR DNCQLYAKEE luxg_vibfi SNNPLLLIAG GTGLSYINSI LTNCLNRNIP QDIYLYWGVK NSSLLYEDEE luxg_vibha SVKPLLLVAG GTGMSYTLSI LKNSLAQGFN QPIYVYWGAK DMENLYVHDE luxg_phole SQSPLLLIAG GTGLSYINSI LSCCISKQLS QPIYLYWGVN NCNLLYADQQ ascd_yerps .NLPIVFLAG GTGFAPVKSM VEALINKNDQ RQVHIYWGMP AGHNFYS.DI dmpp_psesp QAGDLIFIAG GSGLSSPQSM ILDLLERGDT RRITLFQGAR NRAELYNCEL ndor_psepu HTGPMLCVGG GTGLAPVLSI VRGALKSGMT NPILLYFGVR SQQDLYDAER rfbi_salty .DRPIIFLAG GTGFAPVKSM VEHLIQGKCR REIYIYWGMQ YSKDFYS.AL benc_acica .KRPVLMLAG GTGIAPFLSM LQVLEQKGSE HPVRLVFGVT QDCDLVALEQ 151 200 predict_h6310 LEALSLKHPG LQVVPVVEQP EAGWRGRTGT VLTAVLQDHG TLAEHDIYIA ubib_ecoli LEALSLKHPG LQVVPVVEQP EAGWRGRTGT VLTAVLQDHG TLAEHDIYIA ubib_pholu LRLLTERYPN LKVIPVVEQS DNGWCGRTGT VLKAVLEDFG SLANYDIYIA ubib_vibor LEEIAAKHNN VHFVPVVEEA PSEWAGKTGN VLQAVEQDFD SLAEFDIYI. ubib_vibfi LESIAQAHSH ITFVPVVEKS E.GWTGKTGN VLEAVKADFN SLADMDIYIA ubib_vibha LVEIADKFAN VHFVPVVEEA PADWQGKVGN VLQAVSEDFE SLENYDIYIA luxg_vibfi LLELSLNNKN LHYIPVIEDK SEEWIGKKGT VLDAVMEDFT DLAHFDIYVC luxg_vibha LVDIALENKN VSYVPVTEIS TCPQYAKQGK VLECVMSDFR NLSEFDIYLC luxg_phole LKTLAAQYRN INYIPVVENL NTDWQGKIGN VIDAVIEDFS DLSDFDIYVC ascd_yerps ANEWAIKHPN IHYVPVVSGD DSTWTGATGF VHQAVLEDIP DLSLFNVYAC dmpp_psesp FEELAARHPN FSYVPALNQa dPEWQGFKGF VHDAAKAHfg RLFERDIFME ndor_psepu LHKLAADHPQ LTVHTVIATG PINEGQRAGL ITDVIEKDIL SLAGWRAYLC rfbi_salty PQQWSEQHDN VHYIPVVSGD DAEWGGRKGF VHHAVMDDFD SLEFFDIYAC benc_acica LDALQQKLPW FEYRTVVAHA E.SQHERKGY VTGHIEYDWL NGGEVDVYLC 201 232 predict_h6310 GRFEMAKIAR DLFCSERNAR EDRLFGDAFA FI ubib_ecoli GRFEMAKIAR DLFCSERNAR EDRLFGDAFA FI ubib_pholu GRFEMAKIAR ERFCSERDAS ADSMYGDAFE FI ubib_vibor .......... .......... .......... .. ubib_vibfi GRFEMAGAAR EQFTTEKQAK KEQLFGDAFA FI ubib_vibha GRFEMAGAAR EQFTQNKKAK SERMFADAYA FI luxg_vibfi GPFMMAKTAK EKLIEEKKAK SEQMFADAFA YV luxg_vibha GPYKMVEVAR DWFCDKRGAE PEQLYADAFA YL luxg_phole GPFGMSRTAK DILISQKKAN IGKMYSDAFS Y. ascd_yerps GSLAMITAAR NDFINHGLA. ENKFFSDAF. .. 
dmpp_psesp     RFYTAADGAG E...SSRSAL FKRI...... ..
ndor_psepu     GAPAMV.EAL CTVTKHLGIS PEHIYADAF. ..
rfbi_salty     GSPVMIDASK KDFMMKNLS. VEHFYSDAF. ..
benc_acica     GPVPMVEAVR SWLDTQGIQP ANFLFEKFSA ..
________________________________________________________________________________

Result of COILS prediction (Andrei Lupas):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Lupas: Methods in Enzymology, 1996, 266, 513-525.
version 2.2: Rob B. Russell & Andrei N. Lupas, 1999
________________________________________________________________________________

no coiled-coil above probability 0.5
________________________________________________________________________________

Prediction of:
 - secondary structure, by PHDsec
 - solvent accessibility, by PHDacc

PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Author: Burkhard Rost
        EMBL, Heidelberg, FRG
        Meyerhofstrasse 1, 69 117 Heidelberg
        Internet: Predict-Help@EMBL-Heidelberg.DE
All rights reserved.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Secondary structure prediction by PHDsec:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Author: Burkhard Rost
        EMBL, Heidelberg, FRG
        Meyerhofstrasse 1, 69 117 Heidelberg
        Internet: Rost@EMBL-Heidelberg.DE
All rights reserved.

About the network method
~~~~~~~~~~~~~~~~~~~~~~~
The network procedure is described in detail in:
 1) Rost, Burkhard; Sander, Chris:
    Prediction of protein structure at better than 70% accuracy.
    J. Mol. Biol., 1993, 232, 584-599.
A brief description is given in:
    Rost, Burkhard; Sander, Chris:
    Improved prediction of protein secondary structure by use of
    sequence profiles and neural networks.
    Proc. Natl. Acad. Sci. U.S.A., 1993, 90, 7558-7562.
The PHD mail server is described in:
 2) Rost, Burkhard; Sander, Chris; Schneider, Reinhard:
    PHD - an automatic mail server for protein secondary structure
    prediction. CABIOS, 1994, 10, 53-60.
The latest improvement steps (up to 72%) are explained in:
 3) Rost, Burkhard; Sander, Chris:
    Combining evolutionary information and neural networks to predict
    protein secondary structure. Proteins, 1994, 19, 55-72.
To be quoted for publications of PHD output:
    Papers 1-3 for the prediction of secondary structure and the
    prediction server.

About the input to the network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prediction is performed by a system of neural networks. The input
is a multiple sequence alignment. It is taken from an HSSP file
(produced by the program MaxHom: Sander, Chris & Schneider, Reinhard:
Database of Homology-Derived Structures and the Structural Meaning of
Sequence Alignment. Proteins, 1991, 9, 56-68).
For optimal results the alignment should contain sequences with varying
degrees of sequence similarity relative to the input protein.
The following is an ideal situation:

+-----------------+----------------------+
| sequence:       | sequence identity    |
+-----------------+----------------------+
| target sequence | 100 %                |
| aligned seq. 1  |  90 %                |
| aligned seq. 2  |  80 %                |
|       ...       |  ...                 |
| aligned seq. 7  |  30 %                |
+-----------------+----------------------+

Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A careful cross validation test on some 250 protein chains (in total
about 55,000 residues) with less than 25% pairwise sequence identity
gave the following results:

++================++-----------------------------------------+
|| Qtotal = 72.1% || ("overall three state accuracy")        |
++================++-----------------------------------------+

+----------------------------+-----------------------------+
| Qhelix (% of observed)=70% | Qhelix (% of predicted)=77% |
| Qstrand(% of observed)=62% | Qstrand(% of predicted)=64% |
| Qloop  (% of observed)=79% | Qloop  (% of predicted)=72% |
+----------------------------+-----------------------------+
..........................................................................
These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|                     number of correctly predicted residues
|Qtotal =             --------------------------------------- (*100)
|                     number of all residues
|
|                     no of res correctly predicted to be in helix
|Qhelix (% of obs) =  -------------------------------------------- (*100)
|                     no of all res observed to be in helix
|
|
|                     no of res correctly predicted to be in helix
|Qhelix (% of pred)=  -------------------------------------------- (*100)
|                     no of all residues predicted to be in helix
..........................................................................

Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well. Computing first the three state
accuracy for each protein chain, and then averaging over 250 chains
yields the following average:

+-------------------------------====--+
| Qtotal/averaged over chains = 72.2% |
+-------------------------------====--+
| standard deviation          =  9.3% |
+-------------------------------------+
..........................................................................

Further measures of performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matthews correlation coefficient:

+---------------------------------------------+
| Chelix = 0.63, Cstrand = 0.53, Cloop = 0.52 |
+---------------------------------------------+
..........................................................................

Average length of predicted secondary structure segments:

.           +------------+----------+
.           | predicted  | observed |
+-----------+------------+----------+
| Lhelix  = |    10.3    |    9.3   |
| Lstrand = |     5.0    |    5.3   |
| Lloop   = |     7.2    |    5.9   |
+-----------+------------+----------+
..........................................................................

The accuracy matrix in detail:

+---------------------------------------+
| number of residues with H, E, L       |
+---------+------+------+------+--------+
|         |net H |net E |net L |sum obs |
+---------+------+------+------+--------+
| obs H   |12447 | 1255 | 3990 | 17692  |
| obs E   |  949 | 7493 | 3750 | 12192  |
| obs L   | 2604 | 2875 |19962 | 25441  |
+---------+------+------+------+--------+
| sum Net |16000 |11623 |27702 | 55325  |
+---------+------+------+------+--------+

Note: This table is to be read in the following manner:
      Of the 16000 residues predicted to be in helix, 12447 were
      observed to be in helix; 949, however, belong to observed
      strands and 2604 to observed loop regions. The term "observed"
      refers to the DSSP assignment of secondary structure calculated
      from 3D coordinates of experimentally determined structures
      (Dictionary of Secondary Structure of Proteins: Kabsch & Sander
      (1983) Biopolymers, 22, 2577-2637).

Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The network predicts the three secondary structure types using real
numbers from the output units. The prediction is assigned by choosing
the maximal unit ("winner takes all"). However, the real numbers
contain additional information. E.g. the difference between the maximal
and the second largest output unit can be used to derive a "reliability
index". This index is given for each residue along with the prediction.
The index is scaled to have values between 0 (lowest reliability) and
9 (highest).
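
The quantities above can be recomputed from the accuracy matrix, and the
reliability index can be illustrated with a small sketch. The matrix values
are copied from the table; the function names are illustrative only. The
report states only that the index is derived from the difference between the
two largest output units and scaled to 0-9; the exact scaling used below
(truncating 10 times the difference and capping at 9) is an ASSUMPTION, not
taken from this report.

```python
# Rows: observed H, E, L; columns: predicted (network) H, E, L.
# Values copied from "The accuracy matrix in detail" above.
ACCURACY_MATRIX = [
    [12447, 1255, 3990],
    [949, 7493, 3750],
    [2604, 2875, 19962],
]
STATES = "HEL"


def q_total(matrix):
    """Overall three-state accuracy: correct residues / all residues."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


def q_observed(matrix, state):
    """Correctly predicted residues as % of those OBSERVED in `state`."""
    i = STATES.index(state)
    return 100.0 * matrix[i][i] / sum(matrix[i])


def q_predicted(matrix, state):
    """Correctly predicted residues as % of those PREDICTED in `state`."""
    i = STATES.index(state)
    return 100.0 * matrix[i][i] / sum(row[i] for row in matrix)


def predict_with_reliability(outputs):
    """Winner-takes-all prediction plus a 0-9 reliability index.

    `outputs` holds three real numbers (assumed to lie in [0, 1]) for
    H, E and L. ASSUMPTION: index = int(10 * (max - second)), capped at 9.
    The rounding only guards against floating-point noise.
    """
    ranked = sorted(zip(outputs, STATES), reverse=True)
    (best, state), (second, _) = ranked[0], ranked[1]
    index = min(9, int(round(10 * (best - second), 9)))
    return state, index


print(round(q_total(ACCURACY_MATRIX), 1))           # 72.1
print(round(q_observed(ACCURACY_MATRIX, "H"), 1))   # 70.4
print(round(q_predicted(ACCURACY_MATRIX, "H"), 1))  # 77.8
print(predict_with_reliability((0.9, 0.05, 0.05)))  # ('H', 8)
```

The first three printed values agree with Qtotal, H%obs and H%prd at index 0
in the reliability tables of this report.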
The accuracies (Qtot) to be expected for residues with index values at
or above a particular value are given below, together with the fraction
of such residues (%res):

+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index|  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
| %res |100.0| 99.2| 90.4| 80.9| 71.6| 62.5| 52.8| 42.3| 29.8| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|      |     |     |     |     |     |     |     |     |     |     |
| Qtot | 72.1| 72.3| 74.8| 77.7| 80.3| 82.9| 85.7| 88.5| 91.1| 94.2|
|      |     |     |     |     |     |     |     |     |     |     |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 70.4| 70.6| 73.7| 77.1| 80.1| 83.1| 86.0| 89.3| 92.5| 96.4|
| E%obs| 61.5| 61.7| 63.7| 66.6| 69.1| 71.7| 74.6| 77.0| 77.8| 68.1|
|      |     |     |     |     |     |     |     |     |     |     |
| H%prd| 77.8| 78.0| 80.0| 82.6| 84.7| 86.9| 89.2| 91.3| 93.1| 95.4|
| E%prd| 64.5| 64.7| 67.8| 71.0| 74.2| 77.6| 81.4| 85.1| 89.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

The above table gives the cumulative results, e.g. 62.5% of all
residues have a reliability of at least 5. The overall three-state
accuracy for this subset of almost two thirds of all residues is 82.9%.
For this subset, e.g., 83.1% of the observed helices are correctly
predicted, and 86.9% of all residues predicted to be in helix are
correct.
..........................................................................

The following table gives the non-cumulative quantities, i.e. the
values per reliability index range. These numbers answer the question:
how reliable is the prediction for all residues labeled with the
particular index i.

+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| index|  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
| %res |  8.8|  9.5|  9.3|  9.1|  9.7| 10.5| 12.5| 15.7| 14.1|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|      |     |     |     |     |     |     |     |     |     |
| Qtot | 46.6| 50.6| 57.7| 62.6| 67.9| 74.2| 82.2| 88.3| 94.2|
|      |     |     |     |     |     |     |     |     |     |
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| H%obs| 36.8| 42.3| 49.5| 55.2| 61.7| 69.9| 78.8| 87.4| 96.4|
| E%obs| 44.7| 44.5| 52.1| 55.4| 60.9| 68.0| 75.9| 81.0| 68.1|
|      |     |     |     |     |     |     |     |     |     |
| H%prd| 49.9| 52.5| 60.3| 64.2| 69.2| 77.5| 85.4| 89.9| 95.4|
| E%prd| 41.7| 47.1| 53.6| 57.0| 64.0| 71.6| 78.8| 88.8| 93.5|
+------+-----+-----+-----+-----+-----+-----+-----+-----+-----+

For example, for residues with Relindex = 5, 64% of all predicted
beta-strand residues are correctly identified.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Solvent accessibility prediction by PHDacc:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Author: Burkhard Rost
        EMBL, Heidelberg, FRG
        Meyerhofstrasse 1, 69 117 Heidelberg
        Internet: Rost@EMBL-Heidelberg.DE
All rights reserved.

About the network method
~~~~~~~~~~~~~~~~~~~~~~~
The network for prediction of secondary structure is described in
detail in:
    Rost, Burkhard; Sander, Chris:
    Prediction of protein structure at better than 70% accuracy.
    J. Mol. Biol., 1993, 232, 584-599.
The analysis of the prediction of solvent exposure is given in:
    Rost, Burkhard; Sander, Chris:
    Conservation and prediction of solvent accessibility in protein
    families. Proteins, 1994, 20, 216-226.
To be quoted for publications of PHD exposure prediction:
    Both papers quoted above.

Definition of accessibility
~~~~~~~~~~~~~~~~~~~~~~~~~~
For training the residue solvent accessibility the DSSP (Dictionary of
Secondary Structure of Proteins; Kabsch & Sander (1983) Biopolymers,
22, 2577-2637) values of accessible surface area have been used. The
prediction provides values for the relative solvent accessibility.
The normalisation is the following:

|                          ACCESSIBILITY (from DSSP in square Angstrom)
|RELATIVE_ACCESSIBILITY = -------------------------------------------- * 100
|                          MAXIMAL_ACC (amino acid type i)

where MAXIMAL_ACC (i) is the maximal accessibility of amino acid type i.
The maximal values are:

+----+----+----+----+----+----+----+----+----+----+----+----+
|  A |  B |  C |  D |  E |  F |  G |  H |  I |  K |  L |  M |
| 106| 160| 135| 163| 194| 197|  84| 184| 169| 205| 164| 188|
+----+----+----+----+----+----+----+----+----+----+----+----+
|  N |  P |  Q |  R |  S |  T |  V |  W |  X |  Y |  Z |
| 157| 136| 198| 248| 130| 142| 142| 227| 180| 222| 196|
+----+----+----+----+----+----+----+----+----+----+----+

Notation: one letter code for amino acid, B stands for D or N; Z stands
for E or Q; and X stands for undetermined.

The relative solvent accessibility can be used to estimate the number
of water molecules (W) in contact with the residue:

    W = ACCESSIBILITY / 10

The prediction is given in 10 states for relative accessibility, with

    RELATIVE_ACCESSIBILITY = (PREDICTED_ACC * PREDICTED_ACC)

where PREDICTED_ACC = 0 - 9.

Estimated Accuracy of Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A careful cross validation test on some 238 protein chains (in total
about 62,000 residues) with less than 25% pairwise sequence identity
gave the following results:

Correlation
...........
The correlation between observed and predicted solvent accessibility is:

    -----------
    corr = 0.53
    -----------

This value ought to be compared to the worst and best case prediction
scenario: random prediction (corr = 0.0) and homology modelling
(corr = 0.66). (Note: homology modelling yields a relatively accurate
prediction in 3D if, and only if, a significantly identical sequence
has a known 3D structure.)

3-state accuracy
................
Often the relative accessibility is projected onto, e.g., 3 states:
    b = buried       (here defined as < 9% relative accessibility),
    i = intermediate ( 9% <= rel. acc. < 36% ),
    e = exposed      ( rel. acc. >= 36% ).
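
The normalisation, the 10-state encoding and the 3-state projection described
above can be sketched as follows. The table of maximal values and the state
thresholds are copied from this report; the function names are illustrative
only. Mapping a relative accessibility to state floor(sqrt(value)), capped
at 9, is an ASSUMPTION that is merely consistent with "state i corresponds
to i*i %" and is not stated explicitly in the report.

```python
import math

# Maximal accessibility per residue type, copied from the table above.
MAXIMAL_ACC = {
    "A": 106, "B": 160, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164, "M": 188,
    "N": 157, "P": 136, "Q": 198, "R": 248, "S": 130, "T": 142,
    "V": 142, "W": 227, "X": 180, "Y": 222, "Z": 196,
}


def relative_accessibility(accessibility, residue):
    """DSSP accessibility normalised by the residue-type maximum, in %."""
    return 100.0 * accessibility / MAXIMAL_ACC[residue]


def water_molecules(accessibility):
    """Estimated number of contacting waters: W = ACCESSIBILITY / 10."""
    return accessibility / 10.0


def decode_state(predicted_acc):
    """Relative accessibility (%) encoded by a predicted state 0-9."""
    return predicted_acc * predicted_acc


def ten_state(rel_acc):
    """ASSUMED encoding: largest i in 0..9 with i*i <= rel_acc."""
    return min(9, int(math.sqrt(max(rel_acc, 0.0))))


def three_state(rel_acc):
    """Project relative accessibility onto buried/intermediate/exposed."""
    if rel_acc < 9:
        return "b"
    if rel_acc < 36:
        return "i"
    return "e"


rel = relative_accessibility(53, "A")   # 53 of a maximal 106 for alanine
print(rel, ten_state(rel), three_state(rel))   # 50.0 7 e
```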
A projection onto 3 states or 2 states (buried/exposed) enables the
computation of a 3- and 2-state prediction accuracy. PHD reaches an
overall 3-state accuracy of:

    Q3 = 57.5%

(compared to 35% for random prediction and 70% for homology modelling).
In detail:

+-----------------------------------+-------------------------+
| Qburied       (% of observed)=77% | Qb (% of predicted)=60% |
| Qintermediate (% of observed)= 9% | Qi (% of predicted)=44% |
| Qexposed      (% of observed)=78% | Qe (% of predicted)=56% |
+-----------------------------------+-------------------------+

10-state accuracy
.................
The network predicts relative solvent accessibility in 10 states, with
state i (i = 0-9) corresponding to a relative solvent accessibility of
i*i %. The 10-state accuracy of the network is:

    Q10 = 24.5%
..........................................................................

These percentages are defined by:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|                      number of correctly predicted residues
|Q3 =                  --------------------------------------- (*100)
|                      number of all residues
|
|                      no of res. correctly predicted to be buried
|Qburied (% of obs) =  ------------------------------------------- (*100)
|                      no of all res. observed to be buried
|
|
|                      no of res. correctly predicted to be buried
|Qburied (% of pred)=  ------------------------------------------- (*100)
|                      no of all residues predicted to be buried
..........................................................................

Averaging over single chains
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The most reasonable way to compute the overall accuracies is the above
quoted percentage of correctly predicted residues. However, since the
user is mainly interested in the expected performance of the prediction
for a particular protein, the mean value when averaging over protein
chains might be of help as well.
Computing first the correlation between observed and predicted
accessibility for each protein chain, and then averaging over all 238
chains yields the following average:

+-------------------------------====--+
| corr/averaged over chains  =  0.53  |
+-------------------------------====--+
| standard deviation         =  0.11  |
+-------------------------------------+

..........................................................................

Further details of performance accuracy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The accuracy matrix in detail:
..............................

-------+-------------------------------------------------------------+-------------
\ PHD  |     0     1     2     3     4     5     6     7     8     9 |   SUM  %obs
-------+-------------------------------------------------------------+-------------
OBS 0  |  8611   140     8    44    82   169   772   334    27     0 | 10187  16.6
OBS 1  |  4367   164     0    50   106   231   738   346    44     3 |  6049   9.8
OBS 2  |  3194   168     1    68   125   303   951   513    42     7 |  5372   8.7
OBS 3  |  2760   159     8    80   136   327  1246   746    58    19 |  5539   9.0
OBS 4  |  2312   144     2    72   166   396  1615  1245   124    19 |  6095   9.9
OBS 5  |  1873    96     3    84   138   425  1979  1834   187    27 |  6646  10.8
OBS 6  |  1387    67     1    60    80   278  2237  2627   231    51 |  7019  11.4
OBS 7  |  1082    35     0    32    56   225  1871  3107   302    60 |  6770  11.0
OBS 8  |   660    25     0    27    43   136  1206  2374   325    87 |  4883   7.9
OBS 9  |   325    20     2    27    29    74   648  1159   366   214 |  2864   4.7
-------+-------------------------------------------------------------+-------------
SUM    | 26571  1018    25   544   961  2564 13263 14285  1706   487 |
%pred  |  43.3   1.7   0.0   0.9   1.6   4.2  21.6  23.3   2.8   0.8 |
-------+-------------------------------------------------------------+-------------

Note: This table is to be read in the following manner:
      8611 of the residues predicted in state 0 (0% relative
      accessibility) were also observed in state 0. However, 325 of
      the residues predicted in state 0 are observed as completely
      exposed (obs = 9 -> rel. acc. >= 81%).
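The 10-state accuracy matrix above can be collapsed onto the three states
buried / intermediate / exposed, which gives a way of re-deriving the
3-state figures. The following Python sketch is an illustration only: the
function names are hypothetical, the counts are copied from the matrix
above, and the grouping of states (0-2 buried, 3-5 intermediate, 6-9
exposed) follows from the n*n % rule together with the 9% and 36%
thresholds.

```python
# Rows: observed state 0-9, columns: predicted state 0-9 (from the matrix).
MATRIX = [
    [8611, 140, 8, 44, 82, 169, 772, 334, 27, 0],
    [4367, 164, 0, 50, 106, 231, 738, 346, 44, 3],
    [3194, 168, 1, 68, 125, 303, 951, 513, 42, 7],
    [2760, 159, 8, 80, 136, 327, 1246, 746, 58, 19],
    [2312, 144, 2, 72, 166, 396, 1615, 1245, 124, 19],
    [1873, 96, 3, 84, 138, 425, 1979, 1834, 187, 27],
    [1387, 67, 1, 60, 80, 278, 2237, 2627, 231, 51],
    [1082, 35, 0, 32, 56, 225, 1871, 3107, 302, 60],
    [660, 25, 0, 27, 43, 136, 1206, 2374, 325, 87],
    [325, 20, 2, 27, 29, 74, 648, 1159, 366, 214],
]


def bucket(state):
    """0 = buried, 1 = intermediate, 2 = exposed, using n*n % per state."""
    rel = state * state
    if rel < 9:
        return 0
    if rel < 36:
        return 1
    return 2


def project_to_three_states(matrix):
    """Sum the 10x10 counts into a 3x3 observed-by-predicted matrix."""
    small = [[0, 0, 0] for _ in range(3)]
    for obs, row in enumerate(matrix):
        for pred, count in enumerate(row):
            small[bucket(obs)][bucket(pred)] += count
    return small


def q3(small):
    """Percentage of residues on the diagonal of the 3x3 matrix."""
    total = sum(sum(row) for row in small)
    return 100.0 * sum(small[k][k] for k in range(3)) / total


def percent_of_observed(small, k):
    """Correct predictions in state k as a percentage of observed k."""
    return 100.0 * small[k][k] / sum(small[k])
```

With the counts as printed, the projection gives a Q3 of 57.5% to one
decimal place, and about 77% and 78% of the observed buried and exposed
residues, in agreement with the values quoted earlier.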
      The term "observed" refers to the DSSP compilation of area of
      solvent accessibility calculated from 3D coordinates of
      experimentally determined structures (Dictionary of Secondary
      Structure of Proteins: Kabsch & Sander (1983) Biopolymers, 22,
      2577-2637).

Accuracy for each amino acid:
.............................

+---+------------------------------+-----+-------+------+
|AA |   Q3  b%o b%p i%o i%p e%o e%p | Q10 |  corr |    N |
+---+------------------------------+-----+-------+------+
| A | 59.0   87  60   2  38  66  57 |  31 | 0.530 | 5054 |
| C | 62.0   91  67   5  39  25  21 |  34 | 0.244 |  893 |
| D | 56.5   21  45   6  49  94  57 |  20 | 0.321 | 3536 |
| E | 60.8    9  40   3  41  98  61 |  21 | 0.347 | 3743 |
| F | 63.3   94  67   9  46  29  37 |  27 | 0.366 | 2436 |
| G | 52.1   75  51   1  31  67  53 |  22 | 0.405 | 4787 |
| H | 50.9   63  53  23  45  71  50 |  18 | 0.442 | 1366 |
| I | 64.9   95  68   6  41  30  38 |  34 | 0.360 | 3437 |
| K | 66.6    2  11   2  37  98  67 |  23 | 0.267 | 3652 |
| L | 61.6   93  65   8  44  31  40 |  31 | 0.368 | 5016 |
| M | 60.1   92  64   5  39  45  44 |  29 | 0.452 | 1371 |
| N | 55.5   45  45   8  38  87  59 |  17 | 0.410 | 2923 |
| P | 53.0   48  48   9  39  83  56 |  18 | 0.364 | 2920 |
| Q | 54.3   27  44   7  44  92  56 |  20 | 0.344 | 2225 |
| R | 49.9   15  47  36  47  76  51 |  18 | 0.372 | 2765 |
| S | 55.6   69  53   3  51  81  56 |  22 | 0.464 | 3981 |
| T | 51.8   61  51   8  38  78  53 |  21 | 0.432 | 3740 |
| V | 61.1   93  65   5  40  39  42 |  34 | 0.418 | 4156 |
| W | 56.2   85  62  20  49  29  27 |  21 | 0.318 |  891 |
| Y | 49.7   73  52  33  49  36  38 |  19 | 0.359 | 2301 |
+---+------------------------------+-----+-------+------+

Abbreviations:

AA:             amino acid in one-letter code
b%o, i%o, e%o:  = Qburied, Qintermediate, Qexposed (% of observed),
                i.e. percentage of correct prediction in each state,
                see above
b%p, i%p, e%p:  = Qburied, Qintermediate, Qexposed (% of predicted),
                i.e. probability of correct prediction in each state,
                see above
Q10:            percentage of correctly predicted residues in each of
                the 10 states of predicted relative accessibility.
corr:           correlation between predicted and observed rel. acc.
N:              number of residues in data set

Accuracy for different secondary structure:
...........................................

+--------+------------------------------+----+-------+-------+
| type   |   Q3  b%o b%p i%o i%p e%o e%p |Q10 |  corr |     N |
+--------+------------------------------+----+-------+-------+
| helix  | 59.5   79  64   8  44  80  56 | 27 | 0.574 | 20100 |
| strand | 61.3   84  73   9  46  69  37 | 35 | 0.524 | 13356 |
| loop   | 54.4   64  43  11  44  78  61 | 18 | 0.442 | 27968 |
+--------+------------------------------+----+-------+-------+

Abbreviations as before.

Position-specific reliability index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The network predicts the 10 states for relative accessibility using real
numbers from the output units. The prediction is assigned by choosing the
maximal unit ("winner takes all"). However, the real numbers contain
additional information. E.g. the difference between the maximal and the
second largest output unit (with the constraint that the second largest
output is taken from among all units at least 2 positions off the maximal
unit) can be used to derive a "reliability index". This index is given
for each residue along with the prediction. The index is scaled to have
values between 0 (lowest reliability), and 9 (highest). The accuracies
(Q3, corr, etc.)
to be expected for residues with an index at or above a particular value
are given below, together with the fraction of such residues (%res):

+---+------------------------------+----+-------+-------+
|RI |   Q3  b%o b%p i%o i%p e%o e%p |Q10 |  corr |  %res |
+---+------------------------------+----+-------+-------+
| 0 | 57.5   77  60   9  44  78  56 | 24 | 0.535 | 100.0 |
| 1 | 59.1   76  63   9  45  82  57 | 25 | 0.560 |  91.2 |
| 2 | 61.7   79  66   4  47  87  58 | 27 | 0.594 |  77.1 |
| 3 | 66.6   87  70   1  51  89  63 | 30 | 0.650 |  57.1 |
| 4 | 70.0   89  72   0  83  91  67 | 32 | 0.686 |  45.8 |
| 5 | 72.9   92  75   0   0  93  70 | 34 | 0.722 |  35.6 |
| 6 | 76.3   95  77   0   0  93  75 | 36 | 0.769 |  24.7 |
| 7 | 79.0   97  79   0   0  93  78 | 39 | 0.803 |  16.0 |
| 8 | 80.9   98  80   0   0  91  81 | 43 | 0.824 |   9.6 |
| 9 | 81.2   99  80   0   0  88  83 | 45 | 0.828 |   5.9 |
+---+------------------------------+----+-------+-------+

Abbreviations as before.

The above table gives the cumulative results, e.g. 45.8% of all residues
have a reliability of at least 4. The correlation for this most reliably
predicted half of the residues is 0.686, i.e. a value comparable to what
could be expected if homology modelling were possible. For this subset of
45.8% of all residues, 89% of the buried residues are correctly
predicted, and 72% of all residues predicted to be buried are correct.

..........................................................................

The following table gives the non-cumulative quantities, i.e. the values
per reliability index range. These numbers answer the question: how
reliable is the prediction for all residues labeled with the particular
index i.
+---+------------------------------+----+-------+-------+
|RI |   Q3  b%o b%p i%o i%p e%o e%p |Q10 |  corr |  %res |
+---+------------------------------+----+-------+-------+
| 0 | 40.9   79  40  16  41  21  40 | 14 | 0.175 |   8.8 |
| 1 | 45.4   61  46  28  44  48  44 | 17 | 0.278 |  14.1 |
| 2 | 47.4   53  52  10  46  80  44 | 19 | 0.343 |  19.9 |
| 3 | 52.9   75  59   4  50  77  47 | 23 | 0.439 |  11.4 |
| 4 | 60.0   81  63   0  83  84  56 | 25 | 0.547 |  10.1 |
| 5 | 65.2   82  70   0   0  93  62 | 28 | 0.607 |  10.9 |
| 6 | 71.3   90  72   0   0  94  70 | 31 | 0.692 |   8.8 |
| 7 | 76.0   94  76   0   0  95  75 | 34 | 0.762 |   6.3 |
| 8 | 80.5   97  81   0   0  94  79 | 39 | 0.808 |   3.8 |
| 9 | 81.2   99  80   0   0  88  83 | 45 | 0.828 |   5.9 |
+---+------------------------------+----+-------+-------+

For example, for residues with RI = 4, 83% of all predicted intermediate
residues are correctly predicted as such.

The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prediction of:
    secondary structure,               by PHDsec
    solvent accessibility,             by PHDacc
    and helical transmembrane regions, by PHDhtm

Author:
    Burkhard Rost
    EMBL, 69012 Heidelberg, Germany
    Internet: Rost@EMBL-Heidelberg.DE

All rights reserved.

The network systems are described in:

    PHDsec:  B Rost & C Sander: JMB, 1993, 232, 584-599.
             B Rost & C Sander: Proteins, 1994, 19, 55-72.
    PHDacc:  B Rost & C Sander: Proteins, 1994, 20, 216-226.
    PHDhtm:  B Rost et al.: Prot. Science, 1995, 4, 521-533.
Some statistics
~~~~~~~~~~~~~~~

Percentage of amino acids:

+--------------+--------+--------+--------+--------+--------+
| AA:          |    A   |    L   |    R   |    E   |    I   |
| % of AA:     |    9.9 |    9.1 |    8.2 |    7.8 |    7.3 |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    D   |    V   |    G   |    T   |    S   |
| % of AA:     |    7.3 |    6.9 |    6.9 |    5.6 |    4.3 |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    F   |    P   |    Y   |    K   |    H   |
| % of AA:     |    4.3 |    3.9 |    3.0 |    3.0 |    3.0 |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    Q   |    M   |    N   |    W   |    C   |
| % of AA:     |    2.6 |    2.6 |    1.7 |    1.3 |    1.3 |
+--------------+--------+--------+--------+--------+--------+

Percentage of secondary structure predicted:

+--------------+--------+--------+--------+
| SecStr:      |    H   |    E   |    L   |
| % Predicted: |   33.6 |   26.3 |   40.1 |
+--------------+--------+--------+--------+

According to the following classes:

    all-alpha:   %H>45 and %E< 5;
    all-beta :   %H<5  and %E>45
    alpha-beta : %H>30 and %E>20;
    mixed:       rest,

this means that the predicted class is: alpha-beta

PHD output for your protein
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Thu Feb 3 04:30:32 2000
Jury on: 10 different architectures (version 5.94_317 ).
Note: differently trained architectures, i.e., different versions can
result in different predictions.
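The class assignment given under "Some statistics" is a plain threshold
rule on the predicted helix and strand content. A minimal Python sketch,
assuming the rules are tested in the order in which they are listed (the
function name is hypothetical):

```python
def structural_class(pct_helix, pct_strand):
    """Assign the structural class from %H and %E as listed in the report."""
    if pct_helix > 45 and pct_strand < 5:
        return "all-alpha"
    if pct_helix < 5 and pct_strand > 45:
        return "all-beta"
    if pct_helix > 30 and pct_strand > 20:
        return "alpha-beta"
    return "mixed"
```

With the percentages of this report (H = 33.6, E = 26.3) the sketch
returns "alpha-beta", the class stated above.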
About the protein
~~~~~~~~~~~~~~~~~

HEADER     /home/phd/server/work/predict_h6314640.f
COMPND
SOURCE
AUTHOR
SEQLENGTH  232
NCHAIN     1 chain(s) in predict_h6314640 data set
NALIGN     12 (=number of aligned sequences in HSSP file)

Abbreviations: PHDsec
~~~~~~~~~~~~~~~~~~~~~

sequence:
        AA : amino acid sequence
secondary structure:
        HEL: H=helix, E=extended (sheet), blank=other (loop)
        PHD: Profile network prediction HeiDelberg
        Rel: Reliability index of prediction (0-9)
detail:
        prH: 'probability' for assigning helix
        prE: 'probability' for assigning strand
        prL: 'probability' for assigning loop
        note: the 'probabilities' are scaled to the interval 0-9,
              e.g., prH=5 means that the first output node is 0.5-0.6
subset:
        SUB: a subset of the prediction, for all residues with an
             expected average accuracy > 82% (tables in header)
        note: for this subset the following symbols are used:
              L:   is loop (for which above " " is used)
              ".": means that no prediction is made for this residue,
                   as the reliability is: Rel < 5

Abbreviations: PHDacc
~~~~~~~~~~~~~~~~~~~~~

        SS : secondary structure
        HEL: H=helix, E=extended (sheet), blank=other (loop)
solvent accessibility:
        3st: relative solvent accessibility (acc) in 3 states:
             b = 0-9%, i = 9-36%, e = 36-100%.
        PHD: Profile network prediction HeiDelberg
        Rel: Reliability index of prediction (0-9)
        O_3: observed relative acc. in 3 states: B, I, E
        note: for convenience a blank is used for intermediate (i).
        P_3: predicted relative accessibility in 3 states
        10st:relative accessibility in 10 states: = n corresponds to a relative acc.
of n*n % subset: SUB: a subset of the prediction, for all residues with an expected average correlation > 0.69 (tables in header) note: for this subset the following symbols are used: "I": is intermediate (for which above " " is used) ".": means that no prediction is made for this residue, as the reliability is: Rel < 4 protein: predict length 232 ....,....1....,....2....,....3....,....4....,....5....,....6 AA |TTLSCKVTSVEAITDTVYRVRIVPDAAFSFRAGQYLMVVMDERDKRPFSMASTPDEKGFI| PHD sec | EEEEE EEE EEEEEEE EE EEEEEEE EE EE| Rel sec |992226741111322126888834998314266289999837989971221399998289| detail: prH sec |000000001233333321000000000100000000000000000000000000000000| prE sec |003557864433101247888863000346322589999831000015534300000389| prL sec |985442124332555431000136998542577310000168889984455599998510| subset: SUB sec |LL...EE..........EEEEE..LLL....LL.EEEEEE.LLLLLL.....LLLLL.EE| accessibility 3st: P_3 acc |eebebebebbeebeeebbeb beeeeebebebbbbbbbbbeeeee bbbbbb eeeeeeb| 10st: PHD acc |960706060067067700604067766060600000000077777500000047877760| Rel acc |501351522510302042170220202215063239391331322114555001122108| subset: SUB acc |e...b.b..b......b..b.........b.b...b.b.........bbbb........b| ....,....7....,....8....,....9....,....10...,....11...,....12 AA |ELHIGASEINLYAKAVMDRILKDHQIVVDIPHGEAWLRDDEERPMILIAGGTGFSYARSI| PHD sec |EEEE HHHHHHHHHHHH EEEEE EEE EEEEE HHHHHHH| Rel sec |997335544157899999993288179968997311433799996989399527999999| detail: prH sec |000001123467899999995400000000001321000000000000000247899999| prE sec |998531110000000000000000589971000034653100007988600000000000| prL sec |001356666421100000003588410028897644235899992000398752000000| subset: SUB sec |EEE..LL...HHHHHHHHHH..LL.EEEELLLL......LLLLLEEEE.LLL.HHHHHHH| accessibility 3st: P_3 acc |bbbbebbeeeeebebbbeebeeeeebebebeebebbb eeeeebbbbbbbbbbbb b bb| 10st: PHD acc |000060077776070007607777606070760600057777600000000000050500| Rel acc |062720112100210732042311151500013132712113127887843042306076| 
subset: SUB acc |.b.b...........b...b.....b.b........b.......bbbbbb..b...b.bb| ....,....13...,....14...,....15...,....16...,....17...,....18 AA |LLTALARNPNRDITIYWGGREEQHLYDLCELEALSLKHPGLQVVPVVEQPEAGWRGRTGT| PHD sec |HHHHHHH EEEEE HHHHHHHHHHHHHHHH EEEEEE | Rel sec |999999579992799874875478844899999999549955999743799665557653| detail: prH sec |999999710000000000012678866899999999730000000000000122210011| prE sec |000000000004899873000000000000000000000027998863100000111122| prL sec |000000279995100016877311123000000000269972000136889776667765| subset: SUB sec |HHHHHHHLLLL.EEEEE.LLL.HHH..HHHHHHHHHH.LLLEEEEE..LLLLLLLLLLL.| accessibility 3st: P_3 acc |bebbbeeeeeeebbbbbbbeeeeebbbeeebeebbee eebebbbbbeeeeeeeebeebe| 10st: PHD acc |060007776766000000066676000676067007747706000007767977707706| Rel acc |711322410310829243120001601111422231201030454751102020201131| subset: SUB acc |b.....e.....b.b.b.......b.....b...........bbbbb.............| ....,....19...,....20...,....21...,....22...,....23...,....24 AA |VLTAVLQDHGTLAEHDIYIAGRFEMAKIARDLFCSERNAREDRLFGDAFAFI| PHD sec |HHHHHHHHHHHH E EEEE HHHHHHHHHHHHHH HHHHHHHHHH | Rel sec |1479999896303412654155479999999998213367236778887519| detail: prH sec |4579999897543210011001689999999998553321567778888640| prE sec |3210000000000143766421000000000000000000000000000000| prL sec |2110000002346645112476210000000001346678432111001259| subset: SUB sec |..HHHHHHHH......EE..LL.HHHHHHHHHHH....LL..HHHHHHHH.L| accessibility 3st: P_3 acc |bbebbeebbeebbebbbbbbb bebbebbeeebeeeeebeeeebbbebbbbb| 10st: PHD acc |0060067007600600000005060070067606777707776000600000| Rel acc |7207802032141232937550017320713150212202211542085332| subset: SUB acc |b..bb......b....b.bbb...b...b...b..........bb..bb...| _ _______________________________________________________________________________ The resulting prediction of globularity is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ________________________________________________________________________________ --- --- 
GLOBE: prediction of protein globularity
---
--- nexp = 116 (number of predicted exposed residues)
--- nfit = 102 (number of expected exposed residues)
--- diff = 14.00 (difference nexp-nfit)
--- =====> your protein appears as compact, as a globular domain
---
---
--- GLOBE: further explanations preliminarily in:
---        http://www.columbia.edu/~rost/Papers/98globe.html
---
--- END of GLOBE

The resulting prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

---
--- ------------------------------------------------------------
--- TOPITS prediction-based threading
--- ------------------------------------------------------------
---
--- TOPITS ALIGNMENTS HEADER: PARAMETERS
--- str:seq= 50 : structure (sec str, acc)= 50%, sequence= 50%
--- str:seq = 50 : weight structure/sequence, i.e. str= 50%, seq= 50%
--- smin = -1.00 : minimal value of alignment metric
--- smax = 2.00 : maximal value of alignment metric
--- go = 2 : gap open penalty
--- ge = 0.2 : gap elongation penalty
--- len1 = 232 : length of search sequence, i.e., your protein
---
--- TOPITS ALIGNMENTS HEADER: ABBREVIATIONS
--- RANK : rank in alignment list, sorted according to z-score
--- EALI : alignment score
--- LALI : length of alignment
--- IDEL : number of residues inserted
--- NDEL : number of insertions
--- ZALI : alignment z-score; note: hits with z>3 more reliable
--- PIDE : percentage of pairwise sequence identity
--- LEN2 : length of aligned protein structure
--- ID2 : PDB identifier of aligned structure
--- NAME2 : name of aligned protein structure
--- IFIR : position of first residue of search sequence
--- ILAS : position of last residue of search sequence
--- JFIR : PDB position of first residue of remote homologue
--- JLAS : PDB position of last residue of remote homologue
---
--- TOPITS ALIGNMENTS HEADER: ACCURACY
--- : Tested on 80 proteins, TOPITS found the
--- : correct remote homologue in about 30% of
--- : the cases,
detection accuracy was higher
--- : for higher z-scores (ZALI):
--- ZALI>0 : 1st hit correct in 33% of cases
--- ZALI>3 : 1st hit correct in 50% of cases
--- ZALI>3.5 : 1st hit correct in 60% of cases
---
--- TOPITS ALIGNMENTS HEADER: SUMMARY
RANK  EALI LALI IDEL NDEL ZALI PIDE LEN2 ID2    NAME2
   1 82.80  210   69   18 2.31   31  270 1ndh   OCHROME B=5= REDUCTASE (E
   2 76.13  220   85   19 1.97   28  321 2pia   HALATE DIOXYGENASE REDUCT
   3 75.67  229  128   23 1.95   33  525 1der_A OL_ID: 1;
   4 74.87  222   47   13 1.91   27  296 1fnc   REDOXIN:NADP+ OXIDOREDUCT
   5 74.67  207   60   25 1.90   31  295 1uox   _ID: 1;
   6 74.40  220   84   21 1.89   31  698 1aa6   _ID: 1;
   7 74.27  224  112   25 1.88   31  785 1pys_B OL_ID: 1;
   8 73.73  226  100   27 1.85   34  436 4enl   LASE (E.C.4.2.1.11) (2-PH
   9 73.60  217   78   21 1.84   30  619 1sqc   _ID: 1;
  10 73.27  217   78   21 1.83   30  487 1bpo_A MOL_ID: 1;
  11 72.73  208   59   16 1.80   26  378 1ba1   L_ID: 1;
  12 72.67  204   60   20 1.80   30  839 1yge   L_ID: 1;
  13 72.47  217   50   24 1.79   28  500 1ecf_B MOL_ID: 1;
  14 72.47  214   77   26 1.79   33  720 1oac_A MOL_ID: 1;
  15 72.27  210   76   24 1.78   30  316 1onr_A MOL_ID: 1;
  16 72.27  217   87   25 1.78   32  447 1nhp   DH PEROXIDASE (NPX) (E.C.
  17 72.00  226   92   27 1.76   33  452 1pii   (5'PHOSPHORIBOSYL)ANTHRAN
  18 71.93  197   75   23 1.76   32  602 1kfs_A MOL_ID: 1;
  19 71.87  220  148   34 1.76   40  420 1adj_A MOL_ID: 1;
  20 71.73  220   61   23 1.75   30  405 1eft   ONGATION FACTOR TU (EF-TU
---
--- TOPITS ALIGNMENTS HEADER: PDB_POSITIONS FOR ALIGNED PAIR
RANK PIDE IFIR ILAS JFIR JLAS LALI LEN2 ID2
---
--- TOPITS ALIGNMENTS: SYMBOLS AND EXPLANATIONS
--- BLOCK 1 : your protein and its predicted 1D structure,
--- : i.e., secondary structure and solvent accessibility
--- line 1 : amino acid sequence (one-letter-code)
--- line 2 : predicted secondary structure:
--- H : helix
--- E : strand (extended)
--- L : other (no regular secondary structure)
--- line 3 : predicted residue relative solvent accessibility
--- B : buried, i.e., relative accessibility < 15%
--- O : exposed (outside), i.e., relative accessibility >= 15%
--- :
--- BLOCKS 1-20 : 20 best hits of the prediction-based threading
--- ATTENTION : We chose to include the first 20 hits. However,
--- ATTENTION : most of them will not constitute true remote
--- ATTENTION : homologues. Instead, all hits with a z-score
--- ATTENTION : (ZALI) < 3.5 are, at best, rather speculative!
--- : for each aligned protein:
--- line 1 : amino acids conserved between guide (yours) and the
--- : aligned protein (putative homologue)
--- line 2 : sequence of aligned protein
--- line 3 : secondary structure, taken from DSSP (assignment
--- : of secondary structure based on experimental coordinates)
--- line 4 : relative solvent accessibility, taken from DSSP
---
--- TOPITS ALIGNMENTS 1 - 51 ....:....1....:....2....:....3....:....4....:....5 pred TTLSCKVTSVEAITDTVYRVRIVPDAAFSFRAGQYLMVVMDERDKRPFSMA EEEEE EEE EEEEEEE EE EEEEEEE EE OOBOBOBOBBOOBOOOBBOBOBOOOOOBOBOBBBBBBBBBOOOOOOBBBBB 1. 1ndh 82.80 PAItdIKYPLRLipEHILGLPVGqyLSARIDGNLvrPYTPV LLELLLEEEEEEELLLELLLLLLLEEEEEELLEEEEEELLL TTL K S E I P A F AG L V R S 2.
2pia 76.13 TtlRLKIASKEKIARDIWSFELtpQGapPFEAGANLTVAVPNGSRRTYSLC LLEEEEEEEEEEEELLEEEEEEELLLLLLLLLLLEEEEELLLLLEEEEELL TT V IT V AA G V S 3. 1der_A 75.67 TTTA.TVLAQAIITEGLKAV.....AamDLKRGIDKAVTVAVEELKALSVP HHHH.HHHHHHHHHHHHHHH.....HLHHHHHHHHHHHHHHHHHHHHHLEL L K T A T R GQ V D DKR S A 4. 1fnc 74.87 LNTKITGDDAPGET.WHMVFSHEGEIPYREGQSVGVIPDGEDkrLYSIA EEEELLLLLLLLLE.EEEEEELLLLLLLLLLLEEEEELLLELLEEEELL KV E TVY V S A V D A 5. 1uox 74.67 KVHKDEKtvQTVYevLLEGEIETSykADNSVIVATDSIKNTIYITA EEEELLLLLEEEEEEEEEELLHHHHLLLHHHLLLHHHHHHHHHHHH S TD V IV R G V D R A 6. 1aa6 74.40 MSNAINEIDN.TDLVfsHPIVANHVINarNGAKIIV....CDPRKIETA LLLLHHHHHH.LLEEEHLHHHHHHHHHHHLLLEEEE....ELLLLLHHH T S E V R P AAF R Q L VMD D R F 7. 1pys_B 74.27 TPPSHRllRLEelVEEVARIqtIPLaafpYRKEQRLREvmDPEDARRFRL. ELLLLLLLLLHHHHHHHHHHHHLLLLLLHHHHHHHHHHHELLLHHHHLLL. S A V D S Y VV F A 8. 4enl 73.73 VSLAASRAAAAEKNVPLYKHLADLSKS.KTSPYVlvvLNGGShqEFMIA HHHHHHHHHHHHHLLLHHHHHHHHHLL.LLLLEEEEEEELHHHLEEEEE T EA Y D A F Q L V K P 9. 1sqc 73.60 TTIEAYVALKY.IGMSRDeaLRFIQSqwLALVGepWEKVPM... HHHHHHHHHHH.HLLLLLLHHHHHHHLHHHHLLLLHHHLLL... T VT I V V D A S G V M R S A 10. 1bpo_A 73.27 HTMTDDVTFWKWI..SLNTVALVTDNawSM.EGESQPVKMFDRHS...SLa EELLLLLLEEEEE..ELLEEEEELLLEEEL.LLLLLLEEEEELLH...HHL V T Y V F G D P A 11. 1ba1 72.73 AVGIDLGTTykVGV.......FQHGKVEIIANDQGNrtPSYVA LEEEELLLLEEEEE.......EELLEEEELLLLLLLLEELLEE S A T IV AAF D DK P 12. 1yge 72.67 HLKSKDALegTKSLSQIVQPaaFDLKSTPifHSFQDVHdkLPRDVI LLLHHHLHHHHHHHHHLHHHHHHHLLLLLLLLLHHHHHHELLHHHH T S Y A E KR 13. 1ecf_B 72.47 TAGSSSASEAQpyVNSPYGITLAHNGNLT.NAHELRKKLFEE..KRRH.IN ELLELLLLLLLLEELLLLLEEEEEEEEEL.LHHHHHHHHHHH..HLLL.LL LS V TVY IVPD F AG Y M R K S A 14. 1oac_A 72.47 HLSMN.SRVGPMISTvyEGsivpDIGWYFKagDYGMGTLTsrGKDAPSNA EEEEE.LLLEEEEEEEEEEEEELLLLLLLLEHHHLLLLLELLLLLLLLLL TT I YR DAA RA Q D DK 15. 1onr_A 72.27 TTNPSLILNAAQIPE..YR.KLIDDaawnDRAQQ....IVDATDKLAVNIg ELLHHHHHHHLLLHH..HH.HHHHHHHHLLHHHH....HHHHHHHHHHHHH T V V I Y I AF AG V D DK F 16. 
1nhp 72.27 TVDPEVNNVVVI.GSGY.IGIEAAEAFA.KAGKKVTviLDRpdK.EFTDV HLLLLLLEEEEE.LLLH.HHHHHHHHHH.HLLLEEEELLLLLLH.HHHHH L CK S I D RI A A V DE F 17. 1pii 72.00 LECKKASPsvIRDDFDPARI..AAIYKHYASA.ISVLTDEKyqGSFniV EEELLEELLELLLLLLHHHH..HHHHLLLLLE.EEEELLLLLLLLLLHH S I I P AA A YL D D R 18. 1kfs_A 71.93 FDTETDSLDNISANLVGLsiEPgaAYIPVAHDYL....DAPDqrALElp EEEEELLLLLLLLLEEEEEEELLEEEEELLLLLL....LLLLLHHHHHH T K V A TD IV F FR G LMV K P MA 19. 1adj_A 71.87 TQVFEK..GVGAATD......IVRKEMFTfrGGRSlmvyLEHGMkqplWMa HHHHHH..HHLLLLH......HHHHLLLEELLLLEEHHHHHLLHHLLEEEE TTL T V A V I D A RA V E KR S 20. 1eft 71.73 TTLTAALTFVTAAENPNVEVki..DKAPEERARGITihVEYETAKRHYSHV HHHHHHHHHHHHLLLLLLLLLL..LLLHHHHHHLLLLEEEEELLLLEEEEE --- --- TOPITS ALIGNMENTS CONTINUED --- 1. 1ndh 82.80 VSSDDDKGFVDLVIKVYFKDTHPkgKMSQYLESMKIGDTIerGpaIRPDKK LLLLLLLLLLLEEEEELLLLLLLLLHHHHHHHHLLLLLEEEEEEEELLLLL E I I D V P E D 2. 2pia 76.13 CNDSQERnvIAVKRDSnsISF.....IDDTSEGDAVEVSLPRNE.FPLDKR LLLLLLLLEEEEELLLLHHHH.....HHLLLLLLEEEELLLELL.LLLLLL D K I A E L A A MD K I V G L D E 3. 1der_A 75.67 PCS.DSKAIAQvtISAnetkLIAEA.MDKVGKEGVITVEDGTG...LQDee LLL.LHHHHHHHHHHLLHHHHHHHH.HHHLLLLLEEEEELLLL...LLLEL AS D K L A E K V L D P GE D 4. 1fnc 74.87 ASSadAKS.VSLCvdAGET...IKGVCSNFLCdaEVKLTGPVgeMLMPKDP LLLLLLLE.EEEEELLLLE...EELHHHHHHHLLEEEEEEEELLLLLELLL A TP E FIE HI A N V R DIPH RD E 5. 1uox 74.67 AktPPetHFIEkhIHAAHVNI....VCHRWTR.....MDipHPHSFIRDSE HHLLHHHHHHHHLEEEEEEEE....EEELLEE.....EEEEEEEEEELLLL A D I L G E NLY KAV RI V DI A 6. 1aa6 74.40 ARIADMH..IALKNGSneENLYDKavASriVEGyeSVEDItrQAARMYAQA HHHLLEE..ELLLLLLHHLLLLLHHHHHHHHLLLHHHHHHHHHHHHHHHHL D L A E L V R LK D EA LR E 7. 1pys_B 74.27 ....DPPRLLLLNPLAPetHLFPGLV..RVLKEN...LDLDRPeaLlrERE ....LLLLLEELLLLLHHLLLHHHHH..HHHHHH...HHHLLLLEEELLLE A T K F EL IG SE NL K V G A E 8. 4enl 73.73 APT.GAKTFAelRIG.SEvnL..KSLTKKRYGASAGNVGDEGGVaiQTAEE ELL.LLLLHHHHHHH.HHHHH..HHHHHHHHLHHHHLELLLLLELLLLHHH P E F L I E A AVM R L V D P G W D 9. 
1sqc 73.60 ..VPPEIMfmPLNI..YEFGSWARavMSrpLPERARvtDVPpgGGWIFDAL ..LLHHHHHLLLLH..HHLLHHHHHHHHHLLLHHHLLLLLLLLLLHHHHHH A T L I A LY V I HQ E LR 10. 1bpo_A 73.27 arTDAKQKWLLLtiSAQqmQLYsrKVSQPI.EGhqFKMEGNAEESTlrGQA LEELLLLLEEEEEEEEELEEEEELLLEEEE.LLLEEELLLLLLEEEELLLL A T E I AK R D D H D 11. 1ba1 72.73 AFTDTER.LiqVAMNPTNTVFDAKRLIGRRFDDAVVQSDMKHWPFMVVNDA EELLLLE.EELLLLLHHHEELLHHHLLLLLLLLHHHHHHHLLLLLEEEEEL ST I L I LY ILK Q VV AW D E 12. 1yge 72.67 IST.....IIPLPV....ieLY.RTDGQHILKFPqhVVQVSQ.SAWMTDEe HHH.....HLLLLL....HHHL.EELLLLEEELLLHHHLLLL.LHHHLHHH T D I L I ASE N A A R V I HG RD 13. 1ecf_B 72.47 NTTSDSE..ILLNIFASElnIFAaaATNRLIRGAYACVaiGHGMVAFRDPN LLLLHHH..HHHHHHHHHHHHHHHHHHHHHLLEEEEEEELLLEEEEEELLL A E I G EI A AV R K V E R D 14. 1oac_A 72.47 AVLLNET..IADYTgpMEI.PRAIAVFERyyKHQEMgvSTERRELVVrxdh LEEEEEE..EELLLLEEEE.EEEEEEEEEEEEELLLLEEEEEEEEEEEEEE ST E I I LY A DRILK I E 15. 1onr_A 72.27 gsTEVDARltEASIAKAkiKLYNDAGidRIlkLASTWQGIRAAEQLEKEGI HEEELLHHHHHHHHHHHHHHHHHHLLLHHEEEEELLHHHHHHHHHHHHLLL T E I G Y K V D D VV AWL E 16. 1nhp 72.27 VLTEeeANNITIATGET.VERYekVVTDKNAYDADLVvgVRPNTAWLKGte HHHHHHLLLEEEEELLL.EEEEEEEEELLLEEELLEEELEEELLHHHLLLL S P K FI I I LYA A M L D Q H E 17. 1pii 72.00 VsaPQpkDFI...IDPYQIYlyaDalMLSVLDDDQylAAVAHseVSNEEEQ HHLLLLELLL...LLHHHHHHHLLEEELLLLLHHHHHHHHHHHEELLHHHH DEK L G NLY DRIL I G A E 18. 1kfs_A 71.93 pLLEDEK...ALKVGQ...Nly.....DriLANYGIEL...RGIAFDTMLE HHHLLLL...LLEEEL...LHH.....HHHHHLLLLLL...LLEEEEHHHH A KGF ASE NL A AV LK V PH EA L DE 19. 1adj_A 71.87 aaERPQKgfHQVNYEasE.nlDAEAvlYECLKerRLKVKlpHREA.LSEde ELLLLLLLEEEEEEEELL.LHHHHHHHHHHHHHLLLEEEEHHHHH.LLHHL P I IGA L A M IL QIVVD L E 20. 1eft 71.73 VDCPGHADYIKNMigAAQMdlVVSAameHILLARqivvDMVDDPEllVEME EELLLLHHHHHHHHHHLLLLEEEELLLHHHHHHHLEEHHHLLLLLHHHHHH --- --- TOPITS ALIGNMENTS CONTINUED --- 1. 1ndh 82.80 KSSPVimIAGGTGIT...PMliRAIMKDPD.DHtlLFANQTEKDILLRPEL LLEEELEEEEHHHHH...HHHHHHHHHLLL.LLLEEEEEEEHHHLLLHHHH IL AGG G S ARS L L R P D I W QH Y C 2. 
2pia 76.13 RAKSFILVAGGIglSMarSFRLYYLTRDPesDVKifwkSKPAQHVYC.CGP LLLEEEEEEEHHHHHHHLEEEEEEEELLHHLLEEEHHLLLLLEEEEE.ELL EE P IL A S R LL A A N R I G R L D EL 3. 1der_A 75.67 eeSPFILLA.DKKISNIREMllEAVAknTMRGIVKvfGDRRKAMLQDimEL LELLEEEEE.LLEELLLHHHHHHHHLLHHLLLLLLELLHHHHHHHHHHHHH I GTG RS L N G L E 4. 1fnc 74.87 PNATIIMLGTGTGIAPFRSFLWKMFFEKhnGLAWLFLGVPTSSSLLYKEEF LLLEEEEEEEHHHHHHHHHHHHHHHLLLELLEEEEEEEELLHHHLLLHHHH EE G G S LT L N WG R E L DL 5. 1uox 74.67 EEKRNVqvVEGKGIDIKSSLslTVL.KSTNSQ...FWglRDetTlwdltDV LLEEEEEEELLLLEEEEEEEEEEEE.ELLLEL...ELLLLLLLLLLLEEEE IL G T F Y RS LT LA N R C 6. 1aa6 74.40 AKSAAILwmGVTQF.yvRS..LTSLagNLGKPHAGVNPVRGQNNVQGACDM LLLEEEEEHHHHLL.LHHH..HHHHHLLLLLLLLLEEELLLELLHHHHHHL EE L G G A LL ALAR P G E L L 7. 1pys_B 74.27 EETHllLFGEGVGLPWAKERllEAlarhPGVSGRVLVEGEEVGFLGALHPE EEEEEEEEELLEELLLLLLEEHHHHHHEEEEEEEEEELLEEEEEEEEELHH E I IA G G A S L NPN D G LY L 8. 4enl 73.73 EALDLIviaAGhgLDCASSEFFkdLdkNPNSDKSKWLTGPQLADLYhlmdW HHHHHHHHHLLLEEELLHHHHEELLLLLLLLLHHHLELHHHHHHHHHHHLH R R AL R WGG Y L L 9. 1sqc 73.60 LDRALHGYQKLSVHPFRRAAEIRALDWLLERQadGSWGGIQPPWFYALIAL HHHHHHHHHLLLLLLLHHHHHHHHHHHHHHHLLLLLLLLEHHHHHHHHHHH I GT F A A N D HLYDL 10. 1bpo_A 73.27 AGGKLHIIEVGtpfkKAVDVFFPPEAQneKHDVVFLITKYGYIHLYDLETG LLLEEEEEELLLLLLEEEELLLLLLLLLLLLLEEEEEELLLEEEEEELLLL RP G T Y S LT A N T Y Q D 11. 1ba1 72.73 AGRPKVQVegETKSFYpsSMVLteIAEatNAVVTvyFNDSQRQATKD.... LLEEEEEEELEEEEELHHHHHHHHHHHHLEEEEEELLLHHHHHHHHH.... E R MI I G F S L A IT LY EL 12. 1yge 72.67 eaREMivIRGLEEFP.PKSNLDPAIYGDQSSKIT.....ADSLDlyTMDel HHHHHHLLEELLLLL.LLLLLLHHHHLLLLLLLL.....HHHLLLLLHHHE RP LI T A S L L RD IY EE L C 13. 1ecf_B 72.47 NgrPLVlieNRTEYMVAssVALDTLGFDFLRDvaIYI..TEEGQLFtqCAD LLLLLEEELLEEEEEEELLHHHHHLLLEEEEELEEEE..ELLLLEEEELLL E I IAG TG A R T G QH Y L 14. 1oac_A 72.47 hENGTIGiaGATGIEAVKGvmHDETAKDDTRYGTLiiVGTTHQHIYNF.RL ELLLLEEEEEEEELLLEEELLLLLLHHHHLLLEEEEEEEELEEEEEEE.EE L FS A L I Y 15. 1onr_A 72.27 INCNLTLL.....FSFAQafLISPFVGR....ILDWYKANTDKKEYA.... LLEEEEEE.....LLHHHHLEEEEELHH....HHHHHHHLLLLLLLL.... 
E P LIA GT YA ALA N GG D E 16. 1nhp 72.27 eLHPNGLiaVgtLIKyaDTEVNIALATNARKqvKPFPggSSGLAVFdiNEV LELLLLLEELHLLEEEHLEEELLLLHHHHHHHLLLLLLLLEEEEELLLLHH ER L A G RSI L LA T Y RE H L L 17. 1pii 72.00 QERAIALGAKVVGIN.NrsIDlrELAPKLGHNVTvyAQVRELSHFAnlsAL HHHHHHLLLLEEEEE.LEEELLHHHHHHHLLLLEEHHHHHHHLLLLLEHHH E AG S A I A T EE YD L 18. 1kfs_A 71.93 ESYILNSVAGRHDmsLakTITFEEIAGKGKNQLTFNQIALEEAGRydV.TL HHHHHLLLLLLLLHHHHLLLLHHHHHLLHHHLLLHHHLLHHHHHHHHH.HH EE PM A LL L P D G EE HL L EL 19. 1adj_A 71.87 eENPMRILDSKSERDQA...LLKELGVRPMLD....FLGEeeRHLERLseL LLLHHHHLLLLLHHHHH...HHHHHLLLLHHH....HLLHHHHHHHHLLEE E R G R L AL NP G E L L 20. 1eft 71.73 EVRDLlyEFPGDEVPVIRGSALLALekNPKTK.....RGENedKIWEL..L HHHHHHLLLLLLLLLEEELLHHHHHHHLLLLL.....LLLLHHHHHHH..H --- --- TOPITS ALIGNMENTS CONTINUED --- 1. 1ndh 82.80 LEELRNEhaRFKLWYTVDRAPEAWDYSQGFVNEEMIRDHLPPPEEevLMCG HHLLHHHHLLEEEEEEEEELLLLLLLEELLLLLHHHHHHLLLHHHLEELLL AL G VE A R T T L GT A I 2. 2pia 76.13 PQAltVRdtGHWPSGTveSFGanTNARENTPFTVRLSRSGTsaNRSILEVL LHHHHHHHLLLLLLLLEELLLLLLLLLLLLLEEEEELLLLLELLLLHHHHH LE L G VV V E EA GR D L E AG 3. 1der_A 75.67 LEKATLEDLGqrVvgVGE..EAAIQGRVAQIRQQIEedREKLQERVAKLAG HLLLLLLLLEEEEEELLL..LHHHHHHHHHHHHHLLLHHHHHHHHHHHHLL E K P V G T Q L E D Y G 4. 1fnc 74.87 FEKMKEKApnFRLDFAVSREQTNEKGEKMYIQTRMAQYAVELWekdvYMCG HHHHHHHLLLEEEEEEELLLLELLLLLELLHHHHHHLLHHHHHHLLEEEEE A GLQV V A W R T T D Y 5. 1uox 74.67 VDATWqnFSGLqvRSHVPKFDATwtAREVTLKT.FAEDNSASVQATMY... EEEEEELELLHHHHHLHHHHHHHHHHHHHHHHH.HHHLLELLHHHHHH... AL PG Q VP VE AG R G V A L A 6. 1aa6 74.40 MGALPDTYPGYQYvpavESLPagyrAAHGEVRAAYIMGEDPLQTDAELSAV LLLELLEELLLEELHHLLLLLLLLHHHLLLLLEEEEELLLHHHHLLLLHHH A L P P P A R V A LA D Y 7. 1pys_B 74.27 EIAQELELPPVHllPLPDKppAAFRDLAVVvvEALVREaeSLALFDLYQGP HHHHHHLLLLLEELLLLLLLLLEEEEEEEEEHHHHHHHHEEEEEEEEELLL EA S K G Q V V P A L V Q GTLA D AG 8. 4enl 73.73 WEAWsfKTAGIQIVatVTNptAIEKKAADALLLKVNQ.IGTlaAQDSFAAG HHHHHHHHHLLEEEELLLLHHHHHLLLLLEEEELHHH.HLLHHHHHHHHLL L L HPGL VE GW TG A L G A HD AG 9. 
1sqc 73.60 LKILDmqHpgLELYG.VELDYGGwqAstGLAVLA.LRAAGLPADHdlVKAG HHHLLLLLHHHHHHE.EELLLLLELLEHHHHHHH.HHHHLLLLLLHHHHHH V EAG GR G VLT VLQ LA A 10. 1bpo_A 73.27 GTCIYMNRISGETIFVTAPHeaGIIgrKGQVltNVLQN.PDLA...LRMAV LLEEEEEELLLLLEEEEEEELLEEEELLLEEEHHLLLL.HHHH...HHHHH A GL V P A A D AE I G 11. 1ba1 72.73 ..AGTI..AGLNVLRIINEPTA........AAIAYGLDKKVGAERNVLigG ..HHHH..LLLEEEEEEEHHHH........HHHHLLLLLLLLLLEEEEELL L L V Q T T L L GTL A 12. 1yge 72.67 lFMLDYHDIFMPYVRQINQLNSAKTYATRTIL..FLREDGTLKP....VAI EEEEELHHHHHHHHHHHHLLLLLLLLEEEEEE..EELLLLLEEE....EEE S P L P V A GTL E IA 13. 1ecf_B 72.47 DNPVS..NPCLFEYVYFARPDS.FIDKI.SVYSARV.NMGtlGEK...IAR LLLLL..LLEHHHHHLLLLLLL.EELLE.EHHHHHH.HHHHHHHH...HHH L L VPVV AG RT T E D A 14. 1oac_A 72.47 LD.LDVDGENNSLvpVVKPNTAG.GPRTSTMQ...VNQYNIGNEQD..AAQ EE.ELLLLLEEEEEEEEEELLLL.LLLLEEEE...EEEEEELEHHH..HLE A PG VV V EQ E G V A G LA D IA 15. 1onr_A 72.27 .PA...EDPG..VVSVSeqkEHGY...ETVVMGASFRNIgeLAGCdlTIAP .HH...HLHH..HHHHHHHHHLLL...LLEEEEELLLLHHHLLLLLEEELH A L V VVE P A T L A L L I A 16. 1nhp 72.27 VMAQKLGKE.TKAVTVVenPdaWFkpETTQILGAQLMSKADLTanAISLAI HHHHHHLLL.LEEEEEEELLLEEEELLLLEEEEEEEEELLLLLLHHHHHHH L A H V E AG G V A LQ G HD IA 17. 1pii 72.00 LMAHDDLHAAVRRVLLGENkdAgyGgqAQEVMAAalQYVGVFRNHD..IAD HHLLLLHHHHHHHHHHLLLEHHLEEEHHHHHHHHLLEEEEEELLLL..HHH L L LK P LQ VPV E R G VL H E A 18. 1kfs_A 71.93 LQ.LHLkwPDLqlVPVLSRIE.....RNGVKIDpvLHNHS..EELTLRLA. HH.HHHHHHHLLHHHHHHHHH.....HHLELELHHHHHHH..HHHHHHHH. LEA H GL P V P G R L A G E D Y A 19. 1adj_A 71.87 LeaFEVHHegLSElpRV..PGVGfvERVALALEA..EGFGLPEEkdLyvAE ELEEEEELLLHHHHLLL..LEEEEHHHHHHHHHH..LLLLLLLLLLEEHHH L A P V P E GR GTV T G D I G 20. 1eft 71.73 LDAiyIPTPVRDvkPFLMPVEDVftGR.GTVATGRI.ERGKVKVGdvEIVG HHHHHLLLLLLLLLLLEEEEEEEELLL.EEEEEEEL.LELEEELLLEEELL --- --- TOPITS ALIGNMENTS CONTINUED --- 1. 1ndh 82.80 GPPPMIQYapNL...ErgHPKERCF..AF LLLHHHLLLHHH...HHLLLHHHEE..LL R CS D D 2. 2pia 76.13 LRDANVRVpkTALCSGEADHRDMVLRD HHHLLLLLLEEEEEELLEELLLLLLLL G K A E ARED L G A I 3. 
1der_A 75.67 GGVAVIKvaTEVEMKEKKAreDALhgGGVALI LLEEEEELLLLLHHHHHHHHHHHHHLLLHHHH G M K D S A R A 4. 1fnc 74.87 GLKGMEKGIDDIMVSLAAAEgkRQLKKA ELHHHHHHHHHHHHHHHHLLLHHHHHHL MA AR E F 5. 1uox 74.67 ...KMAelARQQLIeeYSLPNKHYFEIDLSW ...HHHHHHHLLLEEEEEEEELLEEELLLLL RFE I D F D 6. 1aa6 74.40 VrfEDLeiVQDIFMTKTASAADVIL HHHHHLLEEEELELLHHHHLLLEEE E K A LF R R A 7. 1pys_B 74.27 PPleGHklAFHlfhPKRTLRDEEV.EEAVS LLLLLEEEEEEEELLLLLLLHHHH.HHHHH G M IA DL SER A L GD FA 8. 4enl 73.73 GWGVMVsiA.DLvrSERLAKLNQllGdvFA LLEEEEEHH.HHHLHHHHHHHHHHHHHEEL G D N G AF F 9. 1sqc 73.60 GEWLLDrvPGDWAVKRPNLKPG...GFAFQF HHHHHHLLLLHHHHLLLLLLLL...LLLLLL R A A LF NA LF 10. 1bpo_A 73.27 VRNNLAG.AEELFARKFNA....LFAQ HHLLLLL.LHHHHHHHHHH....HHHL G F I F A L G F FI 11. 1ba1 72.73 GTFDVstIEDGIFEVKSTAGDTHLGGEDfhFI LLEEEEEEELLEEEEEEEEEELLLLHHHHHHH A DL A E L A 12. 1yge 72.67 IELSLPHSAGDLSAAVSqaKEgwLLAKAYVIV EEEELLLLLLLLLLLLLELLLHHHHHHHHHHH E I E A E R G F 13. 1ecf_B 72.47 REWEDLDIDVVIPIPETsaLeaRILGKPygFV HHLLLLLLLEEEELLLLLHHHHHHHLLLELEE F I R L S N E R G 14. 1oac_A 72.47 QKFDPGTI.RLL..SNPN.KENRM.GNPVSY ELLLLLLE.EEE..EEEE.EELLL.LLEEEE E A I R L E AR R F 15. 1onr_A 72.27 PAleLAeiERKlyTGEVKARPARITESEFLW HHHHHHHLLLLLLLLLLLLLLLLLLHHHHHH AK DL A D F AF 16. 1nhp 72.27 IQ...AKMteDL......AYADFFFQPAF HH...LLLEHHH......HLLLLLLLLLL AK A L E D L A A 17. 1pii 72.00 DVVDKAKvaVQLHGNEEQLYIDTL.REAlaHV HHHHHHHHEEEELLLLLHHHHHHH.HHHLLLL E K A E N LF 18. 1kfs_A 71.93 ...ELEKKAHEIAGEEFNLSSTklF ...HHHHHHHHHLLLLLLLLLLLHL F A R ER A E L G AFAF 19. 1adj_A 71.87 EAFYLAEALRPRLRAerkakeEAlrGAAFafL HHHHHHHHHLLLLLEELLHHHHHHLLLLEEEE G A R R GD 20. 
1eft 71.73 G...LAPETRKTVVTgrKTLQEGIAGDNVGLL L...LLLLLEEEEEEELEEELEEELLLEEEEE
---
--- TOPITS ALIGNMENTS END
---

The alignments from threading in MSF format:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

MSF of: /home/phd/server/work/predict_h6314640.hsspTopits from: 1 to: 232
/home/phd/server/work/predict_h6314640.msfTopits MSF: 232 Type: P 3-Feb-00 04:34:2 Check: 6594 ..

 Name: predict_h6310    Len:   232  Check: 7806  Weight: 1.00
 Name: 1ndh             Len:   232  Check: 3236  Weight: 1.00
 Name: 2pia             Len:   232  Check: 8303  Weight: 1.00
 Name: 1der_A           Len:   232  Check:  255  Weight: 1.00
 Name: 1fnc             Len:   232  Check: 5648  Weight: 1.00
 Name: 1uox             Len:   232  Check: 4613  Weight: 1.00
 Name: 1aa6             Len:   232  Check: 9992  Weight: 1.00
 Name: 1pys_B           Len:   232  Check: 5752  Weight: 1.00
 Name: 4enl             Len:   232  Check: 5130  Weight: 1.00
 Name: 1sqc             Len:   232  Check: 8389  Weight: 1.00
 Name: 1bpo_A           Len:   232  Check: 2196  Weight: 1.00
 Name: 1ba1             Len:   232  Check: 7766  Weight: 1.00
 Name: 1yge             Len:   232  Check: 5639  Weight: 1.00
 Name: 1ecf_B           Len:   232  Check: 9634  Weight: 1.00
 Name: 1oac_A           Len:   232  Check: 4992  Weight: 1.00
 Name: 1onr_A           Len:   232  Check:  737  Weight: 1.00
 Name: 1nhp             Len:   232  Check: 1863  Weight: 1.00
 Name: 1pii             Len:   232  Check: 2824  Weight: 1.00
 Name: 1kfs_A           Len:   232  Check: 5538  Weight: 1.00
 Name: 1adj_A           Len:   232  Check: 4043  Weight: 1.00
 Name: 1eft             Len:   232  Check: 2238  Weight: 1.00

//

                 1                                                   50
predict_h6310    TTLSCKVTSV EAITDTVYRV RIVPDAAFSF RAGQYLMVVM DERDKRPFSM
1ndh             .......... PAItdIKYPL RLipEHILGL PVGqyLSARI DGNLvrPYTP
2pia             TtlRLKIASK EKIARDIWSF ELtpQGapPF EAGANLTVAV PNGSRRTYSL
1der_A           TTTA.TVLAQ AIITEGLKAV .....AamDL KRGIDKAVTV AVEELKALSV
1fnc             ..LNTKITGD DAPGET.WHM VFSHEGEIPY REGQSVGVIP DGEDkrLYSI
1uox             .....KVHKD EKtvQTVYev LLEGEIETSy kADNSVIVAT DSIKNTIYIT
1aa6             ..MSNAINEI DN.TDLVfsH PIVANHVINa rNGAKIIV.. ..CDPRKIET
1pys_B           TPPSHRllRL EelVEEVARI qtIPLaafpY RKEQRLREvm DPEDARRFRL
4enl             ..VSLAASRA AAAEKNVPLY KHLADLSKS. KTSPYVlvvL NGGShqEFMI
1sqc             .......TTI EAYVALKY.I GMSRDeaLRF IQSqwLALVG epWEKVPM..
1bpo_A           HTMTDDVTFW KWI..SLNTV ALVTDNawSM .EGESQPVKM FDRHS...SL
1ba1             ........AV GIDLGTTykV GV.......F QHGKVEIIAN DQGNrtPSYV
1yge             .....HLKSK DALegTKSLS QIVQPaaFDL KSTPifHSFQ DVHdkLPRDV
1ecf_B           TAGSSSASEA QpyVNSPYGI TLAHNGNLT. NAHELRKKLF EE..KRRH.I
1oac_A           .HLSMN.SRV GPMISTvyEG sivpDIGWYF KagDYGMGTL TsrGKDAPSN
1onr_A           TTNPSLILNA AQIPE..YR. KLIDDaawnD RAQQ....IV DATDKLAVNI
1nhp             .TVDPEVNNV VVI.GSGY.I GIEAAEAFA. KAGKKVTviL DRpdK.EFTD
1pii             ..LECKKASP svIRDDFDPA RI..AAIYKH YASA.ISVLT DEKyqGSFni
1kfs_A           ..FDTETDSL DNISANLVGL siEPgaAYIP VAHDYL.... DAPDqrALEl
1adj_A           TQVFEK..GV GAATD..... .IVRKEMFTf rGGRSlmvyL EHGMkqplWM
1eft             TTLTAALTFV TAAENPNVEV ki..DKAPEE RARGITihVE YETAKRHYSH

                 51                                                 100
predict_h6310    ASTPDEKGFI ELHIGASEIN LYAKAVMDRI LKDHQIVVDI PHGEAWLRDD
1ndh             VSSDDDKGFV DLVIKVYFKD THPkgKMSQY LESMKIGDTI erGpaIRPDK
2pia             CNDSQERnvI AVKRDSnsIS F.....IDDT SEGDAVEVSL PRNE.FPLDK
1der_A           PCS.DSKAIA QvtISAnetk LIAEA.MDKV GKEGVITVED GTG...LQDe
1fnc             ASSadAKS.V SLCvdAGET. ..IKGVCSNF LCdaEVKLTG PVgeMLMPKD
1uox             AktPPetHFI EkhIHAAHVN I....VCHRW TR.....MDi pHPHSFIRDS
1aa6             ARIADMH..I ALKNGSneEN LYDKavASri VEGyeSVEDI trQAARMYAQ
1pys_B           ....DPPRLL LLNPLAPetH LFPGLV..RV LKEN...LDL DRPeaLlrER
4enl             APT.GAKTFA elRIG.SEvn L..KSLTKKR YGASAGNVGD EGGVaiQTAE
1sqc             ..VPPEIMfm PLNI..YEFG SWARavMSrp LPERARvtDV PpgGGWIFDA
1bpo_A           arTDAKQKWL LLtiSAQqmQ LYsrKVSQPI .EGhqFKMEG NAEESTlrGQ
1ba1             AFTDTER.Li qVAMNPTNTV FDAKRLIGRR FDDAVVQSDM KHWPFMVVND
1yge             IST.....II PLPV....ie LY.RTDGQHI LKFPqhVVQV SQ.SAWMTDE
1ecf_B           NTTSDSE..I LLNIFASEln IFAaaATNRL IRGAYACVai GHGMVAFRDP
1oac_A           AVLLNET..I ADYTgpMEI. PRAIAVFERy yKHQEMgvST ERRELVVrxd
1onr_A           gsTEVDARlt EASIAKAkiK LYNDAGidRI lkLASTWQGI RAAEQLEKEG
1nhp             VLTEeeANNI TIATGET.VE RYekVVTDKN AYDADLVvgV RPNTAWLKGt
1pii             VsaPQpkDFI ...IDPYQIY lyaDalMLSV LDDDQylAAV AHseVSNEEE
1kfs_A           pLLEDEK... ALKVGQ...N ly.....Dri LANYGIEL.. .RGIAFDTML
1adj_A           aaERPQKgfH QVNYEasE.n lDAEAvlYEC LKerRLKVKl pHREA.LSEd
1eft             VDCPGHADYI KNMigAAQMd lVVSAameHI LLARqivvDM VDDPEllVEM

                 101                                                150
predict_h6310    EERPMILIAG GTGFSYARSI LLTALARNPN RDITIYWGGR EEQHLYDLCE
1ndh             KSSPVimIAG GTGIT...PM liRAIMKDPD .DHtlLFANQ TEKDILLRPE
2pia             RAKSFILVAG GIglSMarSF RLYYLTRDPe sDVKifwkSK PAQHVYC.CG
1der_A           eeSPFILLA. DKKISNIREM llEAVAknTM RGIVKvfGDR RKAMLQDimE
1fnc             PNATIIMLGT GTGIAPFRSF LWKMFFEKhn GLAWLFLGVP TSSSLLYKEE
1uox             EEKRNVqvVE GKGIDIKSSL slTVL.KSTN SQ...FWglR DetTlwdltD
1aa6             AKSAAILwmG VTQF.yvRS. .LTSLagNLG KPHAGVNPVR GQNNVQGACD
1pys_B           EETHllLFGE GVGLPWAKER llEAlarhPG VSGRVLVEGE EVGFLGALHP
4enl             EALDLIviaA GhgLDCASSE FFkdLdkNPN SDKSKWLTGP QLADLYhlmd
1sqc             LDRALHGYQK LSVHPFRRAA EIRALDWLLE RQadGSWGGI QPPWFYALIA
1bpo_A           AGGKLHIIEV GtpfkKAVDV FFPPEAQneK HDVVFLITKY GYIHLYDLET
1ba1             AGRPKVQVeg ETKSFYpsSM VLteIAEatN AVVTvyFNDS QRQATKD...
1yge             eaREMivIRG LEEFP.PKSN LDPAIYGDQS SKIT.....A DSLDlyTMDe
1ecf_B           NgrPLVlieN RTEYMVAssV ALDTLGFDFL RDvaIYI..T EEGQLFtqCA
1oac_A           hENGTIGiaG ATGIEAVKGv mHDETAKDDT RYGTLiiVGT THQHIYNF.R
1onr_A           INCNLTLL.. ...FSFAQaf LISPFVGR.. ..ILDWYKAN TDKKEYA...
1nhp             eLHPNGLiaV gtLIKyaDTE VNIALATNAR KqvKPFPggS SGLAVFdiNE
1pii             QERAIALGAK VVGIN.NrsI DlrELAPKLG HNVTvyAQVR ELSHFAnlsA
1kfs_A           ESYILNSVAG RHDmsLakTI TFEEIAGKGK NQLTFNQIAL EEAGRydV.T
1adj_A           eENPMRILDS KSERDQA... LLKELGVRPM LD....FLGE eeRHLERLse
1eft             EVRDLlyEFP GDEVPVIRGS ALLALekNPK TK.....RGE NedKIWEL..

                 151                                                200
predict_h6310    LEALSLKHPG LQVVPVVEQP EAGWRGRTGT VLTAVLQDHG TLAEHDIYIA
1ndh             LEELRNEhaR FKLWYTVDRA PEAWDYSQGF VNEEMIRDHL PPPEEevLMC
2pia             PQAltVRdtG HWPSGTveSF GanTNARENT PFTVRLSRSG TsaNRSILEV
1der_A           LEKATLEDLG qrVvgVGE.. EAAIQGRVAQ IRQQIEedRE KLQERVAKLA
1fnc             FEKMKEKApn FRLDFAVSRE QTNEKGEKMY IQTRMAQYAV ELWekdvYMC
1uox             VDATWqnFSG LqvRSHVPKF DATwtAREVT LKT.FAEDNS ASVQATMY..
1aa6             MGALPDTYPG YQYvpavESL PagyrAAHGE VRAAYIMGED PLQTDAELSA
1pys_B           EIAQELELPP VHllPLPDKp pAAFRDLAVV vvEALVREae SLALFDLYQG
4enl             WEAWsfKTAG IQIVatVTNp tAIEKKAADA LLLKVNQ.IG TlaAQDSFAA
1sqc             LKILDmqHpg LELYG.VELD YGGwqAstGL AVLA.LRAAG LPADHdlVKA
1bpo_A           GTCIYMNRIS GETIFVTAPH eaGIIgrKGQ VltNVLQN.P DLA...LRMA
1ba1             ..AGTI..AG LNVLRIINEP TA........ AAIAYGLDKK VGAERNVLig
1yge             lFMLDYHDIF MPYVRQINQL NSAKTYATRT IL..FLREDG TLKP....VA
1ecf_B           DNPVS..NPC LFEYVYFARP DS.FIDKI.S VYSARV.NMG tlGEK...IA
1oac_A           LD.LDVDGEN NSLvpVVKPN TAG.GPRTST MQ...VNQYN IGNEQD..AA
1onr_A           .PA...EDPG ..VVSVSeqk EHGY...ETV VMGASFRNIg eLAGCdlTIA
1nhp             VMAQKLGKE. TKAVTVVenP daWFkpETTQ ILGAQLMSKA DLTanAISLA
1pii             LMAHDDLHAA VRRVLLGENk dAgyGgqAQE VMAAalQYVG VFRNHD..IA
1kfs_A           LQ.LHLkwPD LqlVPVLSRI E.....RNGV KIDpvLHNHS ..EELTLRLA
1adj_A           LeaFEVHHeg LSElpRV..P GVGfvERVAL ALEA..EGFG LPEEkdLyvA
1eft             LDAiyIPTPV RDvkPFLMPV EDVftGR.GT VATGRI.ERG KVKVGdvEIV

                 201                            232
predict_h6310    GRFEMAKIAR DLFCSERNAR EDRLFGDAFA FI
1ndh             GPPPMIQYap NL...ErgHP KERCF..AF. ..
2pia             LRDANVRVpk TALCSGEADH RDMVLRD... ..
1der_A           GGVAVIKvaT EVEMKEKKAr eDALhgGGVA LI
1fnc             GLKGMEKGID DIMVSLAAAE gkRQLKKA.. ..
1uox             ...KMAelAR QQLIeeYSLP NKHYFEIDLS W.
1aa6             VrfEDLeiVQ DIFMTKTASA ADVIL..... ..
1pys_B           PPleGHklAF HlfhPKRTLR DEEV.EEAVS ..
4enl             GWGVMVsiA. DLvrSERLAK LNQllGdvFA ..
1sqc             GEWLLDrvPG DWAVKRPNLK PG...GFAFQ F.
1bpo_A           VRNNLAG.AE ELFARKFNA. ...LFAQ... ..
1ba1             GTFDVstIED GIFEVKSTAG DTHLGGEDfh FI
1yge             IELSLPHSAG DLSAAVSqaK EgwLLAKAYV IV
1ecf_B           REWEDLDIDV VIPIPETsaL eaRILGKPyg FV
1oac_A           QKFDPGTI.R LL..SNPN.K ENRM.GNPVS Y.
1onr_A           PAleLAeiER KlyTGEVKAR PARITESEFL W.
1nhp             IQ...AKMte DL......AY ADFFFQPAF. ..
1pii             DVVDKAKvaV QLHGNEEQLY IDTL.REAla HV
1kfs_A           ...ELEKKAH EIAGEEFNLS STklF..... ..
1adj_A           EAFYLAEALR PRLRAerkak eEAlrGAAFa fL
1eft             G...LAPETR KTVVTgrKTL QEGIAGDNVG LL

________________________________________________________________________________

TOPITS (threading) results in STRIP format:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

================================================== MAXHOM-STRIP =====================================================
 test sequence    : /home/phd/server/work/predict_h6314640.phdDssp
 list name        : /home/phd/server/pub/topits/mat/Topits_dssp_99_01.
 last name was    : /data/dssp/2plc.dssp
 seq_length       : 232
 alignments       : 500
 sort-mode        : ZSCORE
 weights 1        : NO
 weights 2        : NO
 smin             : -1.00
 smax             : 2.00
 maplow           : 0.00
 maphigh          : 0.00
 epsilon          : 0.00
 gamma            : 0.00
 gap_open         : 2
 gap_elongation   : 0.2
 INDEL in sec-struc of SEQ 1: YES
 INDEL in sec-struc of SEQ 2: YES
 NBEST alignments : 1
 secondary structure alignment: NO
=================================================== SUMMARY ===========================================================
 IAL    VAL  LEN IDEL NDEL ZSCORE %IDEN STRHOM LEN2   RMS SIGMA NAME
   1  82.80  210   69   18   2.31  0.31   0.58  270 -1.00 0.000 1ndh    CYTOCHROME B=5= REDUCTASE (E.C.1.6.2.2) .
   2  76.13  220   85   19   1.97  0.29   0.54  321 -1.00 0.000 2pia    PHTHALATE DIOXYGENASE REDUCTASE (E.C.1.18.1.) .
   3  75.67  229  128   23   1.95  0.33   0.40  525 -1.00 0.000 1der_A  MOL_ID: 1; .
   4  74.87  222   47   13   1.91  0.27   0.75  296 -1.00 0.000 1fnc    FERREDOXIN:NADP+ OXIDOREDUCTASE (FERREDOXIN REDUCTASE, .
   5  74.67  207   60   25   1.90  0.31   0.27  295 -1.00 0.000 1uox    MOL_ID: 1; .
   6  74.40  220   84   21   1.89  0.31   0.39  698 -1.00 0.000 1aa6    MOL_ID: 1; .
   7  74.27  224  112   25   1.88  0.31   0.44  785 -1.00 0.000 1pys_B  MOL_ID: 1; .
   8  73.73  226  100   27   1.85  0.34   0.44  436 -1.00 0.000 4enl    ENOLASE (E.C.4.2.1.11) (2-PHOSPHO-*D-GLYCERATE HYDROLASE) .
   9  73.60  217   78   21   1.84  0.30   0.40  619 -1.00 0.000 1sqc    MOL_ID: 1; .
  10  73.27  217   78   21   1.83  0.30   0.44  487 -1.00 0.000 1bpo_A  MOL_ID: 1; .
  11  72.73  208   59   16   1.80  0.26   0.46  378 -1.00 0.000 1ba1    MOL_ID: 1; .
  12  72.67  204   60   20   1.80  0.30   0.23  839 -1.00 0.000 1yge    MOL_ID: 1; .
  13  72.47  217   50   24   1.79  0.29   0.38  500 -1.00 0.000 1ecf_B  MOL_ID: 1; .
  14  72.47  214   77   26   1.79  0.33   0.31  720 -1.00 0.000 1oac_A  MOL_ID: 1; .
  15  72.27  210   76   24   1.78  0.30   0.26  316 -1.00 0.000 1onr_A  MOL_ID: 1; .
  16  72.27  217   87   25   1.78  0.32   0.34  447 -1.00 0.000 1nhp    NADH PEROXIDASE (NPX) (E.C.1.11.1.1) MUTANT WITH CYS 42 .
  17  72.00  226   92   27   1.76  0.33   0.33  452 -1.00 0.000 1pii    N-(5'PHOSPHORIBOSYL)ANTHRANILATE ISOMERASE (E.C.5.3.1.24) .
  18  71.93  197   75   23   1.76  0.32   0.45  602 -1.00 0.000 1kfs_A  MOL_ID: 1; .
  19  71.87  220  148   34   1.76  0.40   0.43  420 -1.00 0.000 1adj_A  MOL_ID: 1; .
  20  71.73  220   61   23   1.75  0.30   0.35  405 -1.00 0.000 1eft    ELONGATION FACTOR TU (EF-TU) COMPLEXED WITH .
==================================== ALIGNMENTS ===================================
1 - 51 ....:....1....:....2....:....3....:....4....:....5 pred TTLSCKVTSVEAITDTVYRVRIVPDAAFSFRAGQYLMVVMDERDKRPFSMA EEEEE EEE EEEEEEE EE EEEEEEE EE OOBOBOBOBBOOBOOOBBOBOBOOOOOBOBOBBBBBBBBBOOOOOOBBBBB LLLLLLLLLHHHHLLHHHHHLLLLLLHHHHLLLLHHHHHHLLLLLLLLLLL AITD Y R P GQYL D RP 1. 1ndh 82.80 PAItdIKYPLRLipEHILGLPVGqyLSARIDGNLvrPYTPV LLELLLEEEEEEELLLELLLLLLLEEEEEELLEEEEEELLL TTL K S E I P A F AG L V R S 2. 2pia 76.13 TtlRLKIASKEKIARDIWSFELtpQGapPFEAGANLTVAVPNGSRRTYSLC LLEEEEEEEEEEEELLEEEEEEELLLLLLLLLLLEEEEELLLLLEEEEELL TT V IT V AA G V S 3. 1der_A 75.67 TTTA.TVLAQAIITEGLKAV.....AamDLKRGIDKAVTVAVEELKALSVP HHHH.HHHHHHHHHHHHHHH.....HLHHHHHHHHHHHHHHHHHHHHHLEL L K T A T R GQ V D DKR S A 4. 1fnc 74.87 LNTKITGDDAPGET.WHMVFSHEGEIPYREGQSVGVIPDGEDkrLYSIA EEEELLLLLLLLLE.EEEEEELLLLLLLLLLLEEEEELLLELLEEEELL KV E TVY V S A V D A 5. 1uox 74.67 KVHKDEKtvQTVYevLLEGEIETSykADNSVIVATDSIKNTIYITA EEEELLLLLEEEEEEEEEELLHHHHLLLHHHLLLHHHHHHHHHHHH S TD V IV R G V D R A 6. 1aa6 74.40 MSNAINEIDN.TDLVfsHPIVANHVINarNGAKIIV....CDPRKIETA LLLLHHHHHH.LLEEEHLHHHHHHHHHHHLLLEEEE....ELLLLLHHH T S E V R P AAF R Q L VMD D R F 7.
1pys_B 74.27 TPPSHRllRLEelVEEVARIqtIPLaafpYRKEQRLREvmDPEDARRFRL. ELLLLLLLLLHHHHHHHHHHHHLLLLLLHHHHHHHHHHHELLLHHHHLLL. S A V D S Y VV F A 8. 4enl 73.73 VSLAASRAAAAEKNVPLYKHLADLSKS.KTSPYVlvvLNGGShqEFMIA HHHHHHHHHHHHHLLLHHHHHHHHHLL.LLLLEEEEEEELHHHLEEEEE T EA Y D A F Q L V K P 9. 1sqc 73.60 TTIEAYVALKY.IGMSRDeaLRFIQSqwLALVGepWEKVPM... HHHHHHHHHHH.HLLLLLLHHHHHHHLHHHHLLLLHHHLLL... T VT I V V D A S G V M R S A 10. 1bpo_A 73.27 HTMTDDVTFWKWI..SLNTVALVTDNawSM.EGESQPVKMFDRHS...SLa EELLLLLLEEEEE..ELLEEEEELLLEEEL.LLLLLLEEEEELLH...HHL V T Y V F G D P A 11. 1ba1 72.73 AVGIDLGTTykVGV.......FQHGKVEIIANDQGNrtPSYVA LEEEELLLLEEEEE.......EELLEEEELLLLLLLLEELLEE S A T IV AAF D DK P 12. 1yge 72.67 HLKSKDALegTKSLSQIVQPaaFDLKSTPifHSFQDVHdkLPRDVI LLLHHHLHHHHHHHHHLHHHHHHHLLLLLLLLLHHHHHHELLHHHH T S Y A E KR 13. 1ecf_B 72.47 TAGSSSASEAQpyVNSPYGITLAHNGNLT.NAHELRKKLFEE..KRRH.IN ELLELLLLLLLLEELLLLLEEEEEEEEEL.LHHHHHHHHHHH..HLLL.LL LS V TVY IVPD F AG Y M R K S A 14. 1oac_A 72.47 HLSMN.SRVGPMISTvyEGsivpDIGWYFKagDYGMGTLTsrGKDAPSNA EEEEE.LLLEEEEEEEEEEEEELLLLLLLLEHHHLLLLLELLLLLLLLLL TT I YR DAA RA Q D DK 15. 1onr_A 72.27 TTNPSLILNAAQIPE..YR.KLIDDaawnDRAQQ....IVDATDKLAVNIg ELLHHHHHHHLLLHH..HH.HHHHHHHHLLHHHH....HHHHHHHHHHHHH T V V I Y I AF AG V D DK F 16. 1nhp 72.27 TVDPEVNNVVVI.GSGY.IGIEAAEAFA.KAGKKVTviLDRpdK.EFTDV HLLLLLLEEEEE.LLLH.HHHHHHHHHH.HLLLEEEELLLLLLH.HHHHH L CK S I D RI A A V DE F 17. 1pii 72.00 LECKKASPsvIRDDFDPARI..AAIYKHYASA.ISVLTDEKyqGSFniV EEELLEELLELLLLLLHHHH..HHHHLLLLLE.EEEELLLLLLLLLLHH S I I P AA A YL D D R 18. 1kfs_A 71.93 FDTETDSLDNISANLVGLsiEPgaAYIPVAHDYL....DAPDqrALElp EEEEELLLLLLLLLEEEEEEELLEEEEELLLLLL....LLLLLHHHHHH T K V A TD IV F FR G LMV K P MA 19. 1adj_A 71.87 TQVFEK..GVGAATD......IVRKEMFTfrGGRSlmvyLEHGMkqplWMa HHHHHH..HHLLLLH......HHHHLLLEELLLLEEHHHHHLLHHLLEEEE TTL T V A V I D A RA V E KR S 20. 
1eft 71.73 TTLTAALTFVTAAENPNVEVki..DKAPEERARGITihVEYETAKRHYSHV HHHHHHHHHHHHLLLLLLLLLL..LLLHHHHHHLLLLEEEEELLLLEEEEE ================================== ALIGNMENTS ================================== 51 - 101 ....:....1....:....2....:....3....:....4....:....5 pred ASTPDEKGFIELHIGASEINLYAKAVMDRILKDHQIVVDIPHGEAWLRDDE EEEEEE HHHHHHHHHHHH EEEEE EEE BBOOOOOOOBBBBBOBBOOOOOBOBBBOOBOOOOOBOBOBOOBOBBBOOOO LLLLLLLLHHHHHLLLLHHHHHHHHHHHHHHLLLLEEELLLLLLHHHLLLL S D KGF L I K M L I I G A D 1. 1ndh 82.80 VSSDDDKGFVDLVIKVYFKDTHPkgKMSQYLESMKIGDTIerGpaIRPDKK LLLLLLLLLLLEEEEELLLLLLLLLHHHHHHHHLLLLLEEEEEEEELLLLL E I I D V P E D 2. 2pia 76.13 CNDSQERnvIAVKRDSnsISF.....IDDTSEGDAVEVSLPRNE.FPLDKR LLLLLLLLEEEEELLLLHHHH.....HHLLLLLLEEEELLLELL.LLLLLL D K I A E L A A MD K I V G L D E 3. 1der_A 75.67 PCS.DSKAIAQvtISAnetkLIAEA.MDKVGKEGVITVEDGTG...LQDee LLL.LHHHHHHHHHHLLHHHHHHHH.HHHLLLLLEEEEELLLL...LLLEL AS D K L A E K V L D P GE D 4. 1fnc 74.87 ASSadAKS.VSLCvdAGET...IKGVCSNFLCdaEVKLTGPVgeMLMPKDP LLLLLLLE.EEEEELLLLE...EELHHHHHHHLLEEEEEEEELLLLLELLL A TP E FIE HI A N V R DIPH RD E 5. 1uox 74.67 AktPPetHFIEkhIHAAHVNI....VCHRWTR.....MDipHPHSFIRDSE HHLLHHHHHHHHLEEEEEEEE....EEELLEE.....EEEEEEEEEELLLL A D I L G E NLY KAV RI V DI A 6. 1aa6 74.40 ARIADMH..IALKNGSneENLYDKavASriVEGyeSVEDItrQAARMYAQA HHHLLEE..ELLLLLLHHLLLLLHHHHHHHHLLLHHHHHHHHHHHHHHHHL D L A E L V R LK D EA LR E 7. 1pys_B 74.27 ....DPPRLLLLNPLAPetHLFPGLV..RVLKEN...LDLDRPeaLlrERE ....LLLLLEELLLLLHHLLLHHHHH..HHHHHH...HHHLLLLEEELLLE A T K F EL IG SE NL K V G A E 8. 4enl 73.73 APT.GAKTFAelRIG.SEvnL..KSLTKKRYGASAGNVGDEGGVaiQTAEE ELL.LLLLHHHHHHH.HHHHH..HHHHHHHHLHHHHLELLLLLELLLLHHH P E F L I E A AVM R L V D P G W D 9. 1sqc 73.60 ..VPPEIMfmPLNI..YEFGSWARavMSrpLPERARvtDVPpgGGWIFDAL ..LLHHHHHLLLLH..HHLLHHHHHHHHHLLLHHHLLLLLLLLLLHHHHHH A T L I A LY V I HQ E LR 10. 1bpo_A 73.27 arTDAKQKWLLLtiSAQqmQLYsrKVSQPI.EGhqFKMEGNAEESTlrGQA LEELLLLLEEEEEEEEELEEEEELLLEEEE.LLLEEELLLLLLEEEELLLL A T E I AK R D D H D 11. 
1ba1 72.73 AFTDTER.LiqVAMNPTNTVFDAKRLIGRRFDDAVVQSDMKHWPFMVVNDA EELLLLE.EELLLLLHHHEELLHHHLLLLLLLLHHHHHHHLLLLLEEEEEL ST I L I LY ILK Q VV AW D E 12. 1yge 72.67 IST.....IIPLPV....ieLY.RTDGQHILKFPqhVVQVSQ.SAWMTDEe HHH.....HLLLLL....HHHL.EELLLLEEELLLHHHLLLL.LHHHLHHH T D I L I ASE N A A R V I HG RD 13. 1ecf_B 72.47 NTTSDSE..ILLNIFASElnIFAaaATNRLIRGAYACVaiGHGMVAFRDPN LLLLHHH..HHHHHHHHHHHHHHHHHHHHHLLEEEEEEELLLEEEEEELLL A E I G EI A AV R K V E R D 14. 1oac_A 72.47 AVLLNET..IADYTgpMEI.PRAIAVFERyyKHQEMgvSTERRELVVrxdh LEEEEEE..EELLLLEEEE.EEEEEEEEEEEEELLLLEEEEEEEEEEEEEE ST E I I LY A DRILK I E 15. 1onr_A 72.27 gsTEVDARltEASIAKAkiKLYNDAGidRIlkLASTWQGIRAAEQLEKEGI HEEELLHHHHHHHHHHHHHHHHHHLLLHHEEEEELLHHHHHHHHHHHHLLL T E I G Y K V D D VV AWL E 16. 1nhp 72.27 VLTEeeANNITIATGET.VERYekVVTDKNAYDADLVvgVRPNTAWLKGte HHHHHHLLLEEEEELLL.EEEEEEEEELLLEEELLEEELEEELLHHHLLLL S P K FI I I LYA A M L D Q H E 17. 1pii 72.00 VsaPQpkDFI...IDPYQIYlyaDalMLSVLDDDQylAAVAHseVSNEEEQ HHLLLLELLL...LLHHHHHHHLLEEELLLLLHHHHHHHHHHHEELLHHHH DEK L G NLY DRIL I G A E 18. 1kfs_A 71.93 pLLEDEK...ALKVGQ...Nly.....DriLANYGIEL...RGIAFDTMLE HHHLLLL...LLEEEL...LHH.....HHHHHLLLLLL...LLEEEEHHHH A KGF ASE NL A AV LK V PH EA L DE 19. 1adj_A 71.87 aaERPQKgfHQVNYEasE.nlDAEAvlYECLKerRLKVKlpHREA.LSEde ELLLLLLLEEEEEEEELL.LHHHHHHHHHHHHHLLLEEEEHHHHH.LLHHL P I IGA L A M IL QIVVD L E 20. 1eft 71.73 VDCPGHADYIKNMigAAQMdlVVSAameHILLARqivvDMVDDPEllVEME EELLLLHHHHHHHHHHLLLLEEEELLLHHHHHHHLEEHHHLLLLLHHHHHH ================================== ALIGNMENTS ================================== 101 - 151 ....:....1....:....2....:....3....:....4....:....5 pred EERPMILIAGGTGFSYARSILLTALARNPNRDITIYWGGREEQHLYDLCEL EEEEE HHHHHHHHHHHHHH EEEEE HHHHHHHHHH OOOBBBBBBBBBBBBOBOBBBOBBBOOOOOOOBBBBBBBOOOOOBBBOOOB LLLLHHHHLLLLLLLHHHHHHHHHHHLLLLLLLEEELLLLLHHHHHHHHHH P I IAGGTG L A P D T E EL 1. 1ndh 82.80 KSSPVimIAGGTGIT...PMliRAIMKDPD.DHtlLFANQTEKDILLRPEL LLEEELEEEEHHHHH...HHHHHHHHHLLL.LLLEEEEEEEHHHLLLHHHH IL AGG G S ARS L L R P D I W QH Y C 2. 
2pia 76.13 RAKSFILVAGGIglSMarSFRLYYLTRDPesDVKifwkSKPAQHVYC.CGP LLLEEEEEEEHHHHHHHLEEEEEEEELLHHLLEEEHHLLLLLEEEEE.ELL EE P IL A S R LL A A N R I G R L D EL 3. 1der_A 75.67 eeSPFILLA.DKKISNIREMllEAVAknTMRGIVKvfGDRRKAMLQDimEL LELLEEEEE.LLEELLLHHHHHHHHLLHHLLLLLLELLHHHHHHHHHHHHH I GTG RS L N G L E 4. 1fnc 74.87 PNATIIMLGTGTGIAPFRSFLWKMFFEKhnGLAWLFLGVPTSSSLLYKEEF LLLEEEEEEEHHHHHHHHHHHHHHHLLLELLEEEEEEEELLHHHLLLHHHH EE G G S LT L N WG R E L DL 5. 1uox 74.67 EEKRNVqvVEGKGIDIKSSLslTVL.KSTNSQ...FWglRDetTlwdltDV LLEEEEEEELLLLEEEEEEEEEEEE.ELLLEL...ELLLLLLLLLLLEEEE IL G T F Y RS LT LA N R C 6. 1aa6 74.40 AKSAAILwmGVTQF.yvRS..LTSLagNLGKPHAGVNPVRGQNNVQGACDM LLLEEEEEHHHHLL.LHHH..HHHHHLLLLLLLLLEEELLLELLHHHHHHL EE L G G A LL ALAR P G E L L 7. 1pys_B 74.27 EETHllLFGEGVGLPWAKERllEAlarhPGVSGRVLVEGEEVGFLGALHPE EEEEEEEEELLEELLLLLLEEHHHHHHEEEEEEEEEELLEEEEEEEEELHH E I IA G G A S L NPN D G LY L 8. 4enl 73.73 EALDLIviaAGhgLDCASSEFFkdLdkNPNSDKSKWLTGPQLADLYhlmdW HHHHHHHHHLLLEEELLHHHHEELLLLLLLLLHHHLELHHHHHHHHHHHLH R R AL R WGG Y L L 9. 1sqc 73.60 LDRALHGYQKLSVHPFRRAAEIRALDWLLERQadGSWGGIQPPWFYALIAL HHHHHHHHHLLLLLLLHHHHHHHHHHHHHHHLLLLLLLLEHHHHHHHHHHH I GT F A A N D HLYDL 10. 1bpo_A 73.27 AGGKLHIIEVGtpfkKAVDVFFPPEAQneKHDVVFLITKYGYIHLYDLETG LLLEEEEEELLLLLLEEEELLLLLLLLLLLLLEEEEEELLLEEEEEELLLL RP G T Y S LT A N T Y Q D 11. 1ba1 72.73 AGRPKVQVegETKSFYpsSMVLteIAEatNAVVTvyFNDSQRQATKD.... LLEEEEEEELEEEEELHHHHHHHHHHHHLEEEEEELLLHHHHHHHHH.... E R MI I G F S L A IT LY EL 12. 1yge 72.67 eaREMivIRGLEEFP.PKSNLDPAIYGDQSSKIT.....ADSLDlyTMDel HHHHHHLLEELLLLL.LLLLLLHHHHLLLLLLLL.....HHHLLLLLHHHE RP LI T A S L L RD IY EE L C 13. 1ecf_B 72.47 NgrPLVlieNRTEYMVAssVALDTLGFDFLRDvaIYI..TEEGQLFtqCAD LLLLLEEELLEEEEEEELLHHHHHLLLEEEEELEEEE..ELLLLEEEELLL E I IAG TG A R T G QH Y L 14. 1oac_A 72.47 hENGTIGiaGATGIEAVKGvmHDETAKDDTRYGTLiiVGTTHQHIYNF.RL ELLLLEEEEEEEELLLEEELLLLLLHHHHLLLEEEEEEEELEEEEEEE.EE L FS A L I Y 15. 1onr_A 72.27 INCNLTLL.....FSFAQafLISPFVGR....ILDWYKANTDKKEYA.... LLEEEEEE.....LLHHHHLEEEEELHH....HHHHHHHLLLLLLLL.... 
E P LIA GT YA ALA N GG D E 16. 1nhp 72.27 eLHPNGLiaVgtLIKyaDTEVNIALATNARKqvKPFPggSSGLAVFdiNEV LELLLLLEELHLLEEEHLEEELLLLHHHHHHHLLLLLLLLEEEEELLLLHH ER L A G RSI L LA T Y RE H L L 17. 1pii 72.00 QERAIALGAKVVGIN.NrsIDlrELAPKLGHNVTvyAQVRELSHFAnlsAL HHHHHHLLLLEEEEE.LEEELLHHHHHHHLLLLEEHHHHHHHLLLLLEHHH E AG S A I A T EE YD L 18. 1kfs_A 71.93 ESYILNSVAGRHDmsLakTITFEEIAGKGKNQLTFNQIALEEAGRydV.TL HHHHHLLLLLLLLHHHHLLLLHHHHHLLHHHLLLHHHLLHHHHHHHHH.HH EE PM A LL L P D G EE HL L EL 19. 1adj_A 71.87 eENPMRILDSKSERDQA...LLKELGVRPMLD....FLGEeeRHLERLseL LLLHHHHLLLLLHHHHH...HHHHHLLLLHHH....HLLHHHHHHHHLLEE E R G R L AL NP G E L L 20. 1eft 71.73 EVRDLlyEFPGDEVPVIRGSALLALekNPKTK.....RGENedKIWEL..L HHHHHHLLLLLLLLLEEELLHHHHHHHLLLLL.....LLLLHHHHHHH..H ================================== ALIGNMENTS ================================== 151 - 201 ....:....1....:....2....:....3....:....4....:....5 pred LEALSLKHPGLQVVPVVEQPEAGWRGRTGTVLTAVLQDHGTLAEHDIYIAG HHHHHHH EEEEEE HHHHHHHHHHHH E EEEE BOOBBOOOOOBOBBBBBOOOOOOOOBOOBOBBOBBOOBBOOBBOBBBBBBB HHHHHHLLLLLLEEHELLLLLLLLLLLLLHHHHHHHHLLLLHHLLLHHHHL LE L H V W G V DH E G 1. 1ndh 82.80 LEELRNEhaRFKLWYTVDRAPEAWDYSQGFVNEEMIRDHLPPPEEevLMCG HHLLHHHHLLEEEEEEEEELLLLLLLEELLLLLHHHHHHLLLHHHLEELLL AL G VE A R T T L GT A I 2. 2pia 76.13 PQAltVRdtGHWPSGTveSFGanTNARENTPFTVRLSRSGTsaNRSILEVL LHHHHHHHLLLLLLLLEELLLLLLLLLLLLLEEEEELLLLLELLLLHHHHH LE L G VV V E EA GR D L E AG 3. 1der_A 75.67 LEKATLEDLGqrVvgVGE..EAAIQGRVAQIRQQIEedREKLQERVAKLAG HLLLLLLLLEEEEEELLL..LHHHHHHHHHHHHHLLLHHHHHHHHHHHHLL E K P V G T Q L E D Y G 4. 1fnc 74.87 FEKMKEKApnFRLDFAVSREQTNEKGEKMYIQTRMAQYAVELWekdvYMCG HHHHHHHLLLEEEEEEELLLLELLLLLELLHHHHHHLLHHHHHHLLEEEEE A GLQV V A W R T T D Y 5. 1uox 74.67 VDATWqnFSGLqvRSHVPKFDATwtAREVTLKT.FAEDNSASVQATMY... EEEEEELELLHHHHHLHHHHHHHHHHHHHHHHH.HHHLLELLHHHHHH... AL PG Q VP VE AG R G V A L A 6. 1aa6 74.40 MGALPDTYPGYQYvpavESLPagyrAAHGEVRAAYIMGEDPLQTDAELSAV LLLELLEELLLEELHHLLLLLLLLHHHLLLLLEEEEELLLHHHHLLLLHHH A L P P P A R V A LA D Y 7. 
1pys_B 74.27 EIAQELELPPVHllPLPDKppAAFRDLAVVvvEALVREaeSLALFDLYQGP HHHHHHLLLLLEELLLLLLLLLEEEEEEEEEHHHHHHHHEEEEEEEEELLL EA S K G Q V V P A L V Q GTLA D AG 8. 4enl 73.73 WEAWsfKTAGIQIVatVTNptAIEKKAADALLLKVNQ.IGTlaAQDSFAAG HHHHHHHHHLLEEEELLLLHHHHHLLLLLEEEELHHH.HLLHHHHHHHHLL L L HPGL VE GW TG A L G A HD AG 9. 1sqc 73.60 LKILDmqHpgLELYG.VELDYGGwqAstGLAVLA.LRAAGLPADHdlVKAG HHHLLLLLHHHHHHE.EELLLLLELLEHHHHHHH.HHHHLLLLLLHHHHHH V EAG GR G VLT VLQ LA A 10. 1bpo_A 73.27 GTCIYMNRISGETIFVTAPHeaGIIgrKGQVltNVLQN.PDLA...LRMAV LLEEEEEELLLLLEEEEEEELLEEEELLLEEEHHLLLL.HHHH...HHHHH A GL V P A A D AE I G 11. 1ba1 72.73 ..AGTI..AGLNVLRIINEPTA........AAIAYGLDKKVGAERNVLigG ..HHHH..LLLEEEEEEEHHHH........HHHHLLLLLLLLLLEEEEELL L L V Q T T L L GTL A 12. 1yge 72.67 lFMLDYHDIFMPYVRQINQLNSAKTYATRTIL..FLREDGTLKP....VAI EEEEELHHHHHHHHHHHHLLLLLLLLEEEEEE..EELLLLLEEE....EEE S P L P V A GTL E IA 13. 1ecf_B 72.47 DNPVS..NPCLFEYVYFARPDS.FIDKI.SVYSARV.NMGtlGEK...IAR LLLLL..LLEHHHHHLLLLLLL.EELLE.EHHHHHH.HHHHHHHH...HHH L L VPVV AG RT T E D A 14. 1oac_A 72.47 LD.LDVDGENNSLvpVVKPNTAG.GPRTSTMQ...VNQYNIGNEQD..AAQ EE.ELLLLLEEEEEEEEEELLLL.LLLLEEEE...EEEEEELEHHH..HLE A PG VV V EQ E G V A G LA D IA 15. 1onr_A 72.27 .PA...EDPG..VVSVSeqkEHGY...ETVVMGASFRNIgeLAGCdlTIAP .HH...HLHH..HHHHHHHHHLLL...LLEEEEELLLLHHHLLLLLEEELH A L V VVE P A T L A L L I A 16. 1nhp 72.27 VMAQKLGKE.TKAVTVVenPdaWFkpETTQILGAQLMSKADLTanAISLAI HHHHHHLLL.LEEEEEEELLLEEEELLLLEEEEEEEEELLLLLLHHHHHHH L A H V E AG G V A LQ G HD IA 17. 1pii 72.00 LMAHDDLHAAVRRVLLGENkdAgyGgqAQEVMAAalQYVGVFRNHD..IAD HHLLLLHHHHHHHHHHLLLEHHLEEEHHHHHHHHLLEEEEEELLLL..HHH L L LK P LQ VPV E R G VL H E A 18. 1kfs_A 71.93 LQ.LHLkwPDLqlVPVLSRIE.....RNGVKIDpvLHNHS..EELTLRLA. HH.HHHHHHHLLHHHHHHHHH.....HHLELELHHHHHHH..HHHHHHHH. LEA H GL P V P G R L A G E D Y A 19. 1adj_A 71.87 LeaFEVHHegLSElpRV..PGVGfvERVALALEA..EGFGLPEEkdLyvAE ELEEEEELLLHHHHLLL..LEEEEHHHHHHHHHH..LLLLLLLLLLEEHHH L A P V P E GR GTV T G D I G 20. 
1eft 71.73 LDAiyIPTPVRDvkPFLMPVEDVftGR.GTVATGRI.ERGKVKVGdvEIVG HHHHHLLLLLLLLLLLEEEEEEEELLL.EEEEEEEL.LELEEELLLEEELL ================================== ALIGNMENTS ================================== 201 - 232 ....:....1....:....2....:....3....:....4....:....5 pred GRFEMAKIARDLFCSERNAREDRLFGDAFAFI HHHHHHHHHHHHHH HHHHHHHHHH BOBOBBOBBOOOBOOOOOBOOOOBBBOBBBBB LLHHHHHHLHHHHLLLLLLLLLHHHLLHHHHH G M A L ER R F AF 1. 1ndh 82.80 GPPPMIQYapNL...ErgHPKERCF..AF LLLHHHLLLHHH...HHLLLHHHEE..LL R CS D D 2. 2pia 76.13 LRDANVRVpkTALCSGEADHRDMVLRD HHHLLLLLLEEEEEELLEELLLLLLLL G K A E ARED L G A I 3. 1der_A 75.67 GGVAVIKvaTEVEMKEKKAreDALhgGGVALI LLEEEEELLLLLHHHHHHHHHHHHHLLLHHHH G M K D S A R A 4. 1fnc 74.87 GLKGMEKGIDDIMVSLAAAEgkRQLKKA ELHHHHHHHHHHHHHHHHLLLHHHHHHL MA AR E F 5. 1uox 74.67 ...KMAelARQQLIeeYSLPNKHYFEIDLSW ...HHHHHHHLLLEEEEEEEELLEEELLLLL RFE I D F D 6. 1aa6 74.40 VrfEDLeiVQDIFMTKTASAADVIL HHHHHLLEEEELELLHHHHLLLEEE E K A LF R R A 7. 1pys_B 74.27 PPleGHklAFHlfhPKRTLRDEEV.EEAVS LLLLLEEEEEEEELLLLLLLHHHH.HHHHH G M IA DL SER A L GD FA 8. 4enl 73.73 GWGVMVsiA.DLvrSERLAKLNQllGdvFA LLEEEEEHH.HHHLHHHHHHHHHHHHHEEL G D N G AF F 9. 1sqc 73.60 GEWLLDrvPGDWAVKRPNLKPG...GFAFQF HHHHHHLLLLHHHHLLLLLLLL...LLLLLL R A A LF NA LF 10. 1bpo_A 73.27 VRNNLAG.AEELFARKFNA....LFAQ HHLLLLL.LHHHHHHHHHH....HHHL G F I F A L G F FI 11. 1ba1 72.73 GTFDVstIEDGIFEVKSTAGDTHLGGEDfhFI LLEEEEEEELLEEEEEEEEEELLLLHHHHHHH A DL A E L A 12. 1yge 72.67 IELSLPHSAGDLSAAVSqaKEgwLLAKAYVIV EEEELLLLLLLLLLLLLELLLHHHHHHHHHHH E I E A E R G F 13. 1ecf_B 72.47 REWEDLDIDVVIPIPETsaLeaRILGKPygFV HHLLLLLLLEEEELLLLLHHHHHHHLLLELEE F I R L S N E R G 14. 1oac_A 72.47 QKFDPGTI.RLL..SNPN.KENRM.GNPVSY ELLLLLLE.EEE..EEEE.EELLL.LLEEEE E A I R L E AR R F 15. 1onr_A 72.27 PAleLAeiERKlyTGEVKARPARITESEFLW HHHHHHHLLLLLLLLLLLLLLLLLLHHHHHH AK DL A D F AF 16. 1nhp 72.27 IQ...AKMteDL......AYADFFFQPAF HH...LLLEHHH......HLLLLLLLLLL AK A L E D L A A 17. 1pii 72.00 DVVDKAKvaVQLHGNEEQLYIDTL.REAlaHV HHHHHHHHEEEELLLLLHHHHHHH.HHHLLLL E K A E N LF 18. 
1kfs_A 71.93 ...ELEKKAHEIAGEEFNLSSTklF ...HHHHHHHHHLLLLLLLLLLLHL F A R ER A E L G AFAF 19. 1adj_A 71.87 EAFYLAEALRPRLRAerkakeEAlrGAAFafL HHHHHHHHHLLLLLEELLHHHHHHLLLLEEEE G A R R GD 20. 1eft 71.73 G...LAPETRKTVVTgrKTLQEGIAGDNVGLL L...LLLLLEEEEEEELEEELEEELLLEEEEE 9 119 104
________________________________________________________________________________

-----------------------------------------------------------------------------
- PredictProtein (PP): News 1999
-----------------------------------------------------------------------------
-
- PP home:
-    New York             http://cubic.bioc.columbia.edu/predictprotein
-
- PP mirrors:
-    Australia (ANGIS)    http://molmod.angis.org.au//predictprotein
-    England (EBI)        http://www.ebi.ac.uk/~rost/predictprotein
-    Germany (EMBL)       http://www.embl-heidelberg.de/predictprotein
-    India (CDFC)         http://iris.cdfd.org.in/~www/pp/predictprotein
-    India (Pune)         http://202.41.70.33/predictprotein
-    Israel (Beer-Sheva)  http://www.cs.bgu.ac.il/~dfischer/predictprotein
-    Italy (Rome)         http://obelix.bio.uniroma2.it/www/predictprotein
-    Singapore (BIC)      http://embl.bic.nus.edu.sg/predictprotein
-    Spain (CNB)          http://www.es.embnet.org/Services/MolBio/PredictProtein
-    Switzerland (Glaxo)  http://www.gwer.ch/tools/predictprotein
-
- Tools to post-process PP results:
-
- Generate a PostScript (or GIF, or TIFF):
-    ESPript (New York)   http://cubic.bioc.columbia.edu/cgi/pp/ESPript
-    ESPript (Toulouse)   http://www-pgm1.ipbs.fr:8080/ESPript
-
-----------------------------------------------------------------------------