The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

reference predict_h23854 (Feb 16, 2004 11:39:46)
reference pred_h23854 (Feb 16, 2004 11:40:11)
PPhdr from: arojas@cnb.uam.es
PPhdr resp: MAIL
PPhdr orig: HTML
PPhdr want: ASCII
PPhdr password(###)
prediction of: - secondary structure (PHDsec)- return no alignment
ret phd casp
ret concise res
# default: single protein sequence description=TARGET1
MLKNTSSNLDAPVARSCDFAMKKMDLRKAFTLIEPGPVTLVTTSAGGTNNVMTISWTMAV
DFTPKLAITTGPWNFSYKALTKSRECVIAIPTVDLLDKVVGVGTCSGKDTDKFDTFRLTP
IKGKYVEAPLIKECVANIECKVVDIIKKHDIVVLEGVAAYFDTSRKEKRTLHAVGDGTFV
VDGRTLDRKKQMRSKLLGIF
________________________________________________________________________________

Result of PROSITE search (Amos Bairoch):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

please quote: A Bairoch, P Bucher & K Hofmann: The PROSITE database,
its status in 1997. Nucl. Acids Res., 1997, 25, 217-221.
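The patterns reported in the PROSITE section are valid regular expressions as printed, so the hits can be re-derived with a few lines of Python. The sketch below is only an illustration, not the scanner the server runs. It assumes that the reported number is the 1-based start of the match and that matches of one pattern do not overlap, which is what the listing appears to reflect. The names `SEQUENCE`, `PATTERNS` and `scan` are made up for this example.

```python
import re

# Query sequence copied from the submission echoed at the top of this report.
SEQUENCE = (
    "MLKNTSSNLDAPVARSCDFAMKKMDLRKAFTLIEPGPVTLVTTSAGGTNNVMTISWTMAV"
    "DFTPKLAITTGPWNFSYKALTKSRECVIAIPTVDLLDKVVGVGTCSGKDTDKFDTFRLTP"
    "IKGKYVEAPLIKECVANIECKVVDIIKKHDIVVLEGVAAYFDTSRKEKRTLHAVGDGTFV"
    "VDGRTLDRKKQMRSKLLGIF"
)

# Patterns exactly as printed in the PROSITE section of this report.
PATTERNS = {
    "ASN_GLYCOSYLATION": "N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": "[ST].[RK]",
    "CK2_PHOSPHO_SITE": "[ST].{2}[DE]",
    "MYRISTYL": "G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan(sequence, pattern):
    """Return (1-based start, matched text) for each non-overlapping match.

    Assumption: scanning resumes after the end of a match, as re.finditer does.
    """
    return [(m.start() + 1, m.group(0)) for m in re.finditer(pattern, sequence)]


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name, pattern)
        for start, text in scan(SEQUENCE, pattern):
            print(f"{start:6d}  {text}")
```

For the query sequence above, `scan(SEQUENCE, PATTERNS["ASN_GLYCOSYLATION"])` returns `[(4, 'NTSS'), (74, 'NFSY')]`.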
________________________________________________________________________________

--------------------------------------------------------

Pattern-ID: ASN_GLYCOSYLATION PS00001 PDOC00001
Pattern-DE: N-glycosylation site
Pattern:    N[^P][ST][^P]
     4      NTSS
    74      NFSY

Pattern-ID: PKC_PHOSPHO_SITE PS00005 PDOC00005
Pattern-DE: Protein kinase C phosphorylation site
Pattern:    [ST].[RK]
    63      TPK
    76      SYK
   106      SGK
   110      TDK
   115      TFR
   163      TSR

Pattern-ID: CK2_PHOSPHO_SITE PS00006 PDOC00006
Pattern-DE: Casein kinase II phosphorylation site
Pattern:    [ST].{2}[DE]
     7      SNLD
    31      TLIE
   106      SGKD
   164      SRKE

Pattern-ID: MYRISTYL PS00008 PDOC00008
Pattern-DE: N-myristoylation site
Pattern:    G[^EDRKHPFYW].{2}[STAGCN][^P]
    46      GGTNNV
   101      GVGTCS
________________________________________________________________________________

Result of SEG low-complexity search (JC Wootton & S Federhen):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

please quote: J C Wootton & S Federhen: Analysis of compositionally
biased regions in sequence databases. Meth. in Enzymol. 1996, 266, 554-571.

NOTE 1: regions of low-complexity ('simple sequence' or 'composition
        biased regions') are marked by the letter 'x' in the
        following output.
NOTE 2: The dynamic programming algorithm (MaxHom) does NOT take
        the SEG information into account, nor do the PHD predictions!

!!! --> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <-- !!!
!!! --> WE STRONGLY suggest that you resubmit the regions NOT marked by <-- !!!
!!! --> 'x' separately!!                                                <-- !!!
!!! --> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <-- !!!
________________________________________________________________________________

prot (#) default: single protein sequence description=target1 /home/ppuser/server/work/predict_h23854
from: 1 to: 200
prot (#) default: single protein sequence description=target1 /home/ppuser/server/work/predict_h23854
/home/ppuser/server/work/predict_h23854.segNormGcg Length: 200 11-Jul-99 Check: 2818 ..
1 MLKNTSSNLD APVARSCDFA MKKMDLRKAF TLIEPGPVTL VTTSAGGTNN 51 VMTISWTMAV DFTPKLAITT GPWNFSYKAL TKSRECVIAI PTVDLLDKVV 101 GVGTCSGKDT DKFDTFRLTP IKGKYVEAPL IKECVANIEC xxxxxxxxxx 151 xxxLEGVAAY FDTSRKEKRT LHAVGDGTFV VDGRTLDRKK QMRSKLLGIF ________________________________________________________________________________ Result of ProDom domain search (Sonnhammer; Corpet, Gouzy, Kahn): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - please quote: ELL Sonnhammer & D Kahn, Prot. Sci., 1994, 3, 482-492 ________________________________________________________________________________ --- ------------------------------------------------------------ --- Results from running BLAST against PRODOM domains --- --- PLEASE quote: --- F Corpet, J Gouzy, D Kahn (1998). The ProDom database --- of protein domain families. Nucleic Ac Res 26:323-326. --- --- BEGIN of BLASTP output BLASTP 2.0a19MP-WashU [14-Jul-1998] [Build linux-x86 18:51:39 30-Jul-1998] Reference: Gish, Warren (1994-1997). unpublished. Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers, and David J. Lipman (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-10. Query= prot (#) default: single protein sequence description=target1 /home/ppuser/server/work/predict_h23854 (200 letters) Database: prodom_00_1 174,952 sequences; 19,895,393 total letters. Searching....10....20....30....40....50....60....70....80....90....100% done Smallest Sum High Probability Sequences producing High-scoring Segment Pairs: Score P(N) N PD002643 p2000.1 (34) // PROTEIN MONOOXYGENASE OXIDOREDUC... 
118 9.2e-07 1 PD110067 p2000.1 (1) P96682_BACSU // YDFE PROTEIN 105 0.00029 1 >>PD002643 p2000.1 (34) // PROTEIN MONOOXYGENASE OXIDOREDUCTASE COMPONENT FLAVOPROTEIN PUTATIVE COUPLING 4-HYDROXYPHENYLACETATE 3-MONOOXYGENASE CONSERVED Length = 167 Score = 118 (41.5 bits), Expect = 9.2e-07, P = 9.2e-07 Identities = 40/146 (27%), Positives = 70/146 (47%) Query: 35 PGPVTLVTTSA-GGTNNVMTISWTMAVDFTPKLAITTGPWNFSYKALTK-SRECVIAIPT 92 P VT+VTT GG + MT SW +V F P L + + S +L K S + + + + Sbjct: 16 PTGVTVVTTEEDGGRPHGMTASWFTSVSFEPPLVMVCINKSSSTHSLIKESGKFAVNVLS 75 Query: 93 VDLLDKVVGVGTCSGKDTDKFDTFRLTPIKGKYVEAPLIKE-CVANIECKVVDIIKK--H 149 + ++V + + K+ DKF + K AP+++E +A +EC+V +++ H Sbjct: 76 AEQQEEVAKFFSMTRKEGDKFFGMSWWQVTSKKTGAPVLEEDALAWLECRVESVVEAGDH 135 Query: 150 DIVVLEGVAAYFDTSRKEKRTLHAVG 175 I + E V+ + K L+ G Sbjct: 136 TIFIGEVVSVSVEEEGKPAPLLYRRG 161 >>PD110067 p2000.1 (1) P96682_BACSU // YDFE PROTEIN Length = 207 Score = 105 (37.0 bits), Expect = 0.00029, P = 0.00029 Identities = 38/137 (27%), Positives = 65/137 (47%) Query: 40 LVTTSAGGTNNVMTISWTMAVDFTPKLAITTGPWNFSYKALTKSRECVIAIPTVDLLDKV 99 L T + GT N+ +S + A+ L + G + L + +ECVI +P DL + V Sbjct: 28 LTTLNEDGTTNISPMSSSWALGHYIILGVGLG--GKAIDNLERHKECVINLPGPDLWENV 85 Query: 100 VGVGTCSGKDT--------------DKFDTFRLTPIKGKYVEAPLIKECVANIECKVVDI 145 + + SGK + +K++ LTP++ K V IKEC IE +V I Sbjct: 86 ERISSYSGKKSIPPLKKQIGFTYKKEKYEAAGLTPLQSKTVSPTRIKECPIQIEAEVKHI 145 Query: 146 -IKKHD--IVVLEGVAAYF 161 + +++ ++E A +F Sbjct: 146 RLPEYESSFAIVETQALHF 164 Parameters: E=0.001 B=500 ctxfactor=1.00 Query ----- As Used ----- ----- Computed ---- Frame MatID Matrix name Lambda K H Lambda K H +0 0 BLOSUM62 0.320 0.136 0.399 same same same Q=9,R=2 0.244 0.0300 0.180 n/a n/a n/a Query Frame MatID Length Eff.Length E S W T X E2 S2 +0 0 200 200 0.0010 103 3 11 22 0.23 32 31 0.23 35 Statistics: Database: /home/ppuser/server/pub/prodom/mat/prodom_00_1 Title: prodom_00_1 Release date: unknown Posted date: 5:56 PM EDT Jun 21, 2000 Format: BLAST # of letters in database: 
19,895,393
# of sequences in database: 174,952
# of database sequences satisfying E: 2
No. of states in DFA: 551 (54 KB)
Total size of DFA: 101 KB (128 KB)
Time to generate neighborhood: 0.00u 0.00s 0.00t Elapsed: 00:00:00
No. of threads or processors used: 2
Search cpu time: 2.54u 0.01s 2.55t Elapsed: 00:00:02
Total cpu time: 2.57u 0.02s 2.59t Elapsed: 00:00:02
Start: Mon Feb 16 11:42:03 2004 End: Mon Feb 16 11:42:05 2004

--- END of BLASTP output
--- ------------------------------------------------------------
---
--- Again: these results were obtained based on the domain database
--- collected by Daniel Kahn and his coworkers in Toulouse.
---
--- PLEASE quote:
--- F Corpet, J Gouzy, D Kahn (1998). The ProDom database
--- of protein domain families. Nucleic Ac Res 26:323-326.
---
--- The general WWW page is on:
---- ---------------------------------------
--- http://www.toulouse.inra.fr/prodom.html
---- ---------------------------------------
---
--- For WWW graphic interfaces to PRODOM, in particular for your
--- protein family, follow the following links (each line is ONE
--- single link for your protein!!):
---
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD002643
==> multiple alignment, consensus, PDB and PROSITE links of domain PD002643
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD002643
==> graphical output of all proteins having domain PD002643
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD110067
==> multiple alignment, consensus, PDB and PROSITE links of domain PD110067
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD110067
==> graphical output of all proteins having domain PD110067
---
--- NOTE: if you want to use the link, make sure the entire line
--- is pasted as URL into your browser!
---
--- END of PRODOM
--- ------------------------------------------------------------
________________________________________________________________________________

Because you set the keyword "return no alignment" in the header,
no HSSP alignment is appended.

Result of COILS prediction (Andrei Lupas):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A Lupas: Methods in Enzymology, 1996, 266, 513-525.
version 2.2: Rob B. Russell & Andrei N. Lupas, 1999
________________________________________________________________________________

no coiled-coil above probability 0.5
________________________________________________________________________________

Result of CYSPRED prediction (Piero Fariselli):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fariselli P, Riccobelli P, Casadio R PROTEINS(1999) 36:340-346
________________________________________________________________________________

PREDICTION OF BONDING STATE OF CYSTEINES

Network N. 1 Conservation+ Entropy
N.cys  Prob.SS   Prob.SH
  17   0.063152  0.936849
  86   0.172651  0.827348
 105   0.139590  0.860410
 134   0.013818  0.986182
 140   0.589704  0.410297
###################
Network N. 2 Conservation+ Entropy + Charges
N.cys  Prob.SS   Prob.SH
  17   0.056250  0.603368
  86   0.140452  0.364905
 105   0.084989  0.615075
 134   0.035703  0.775042
 140   0.569967  0.097433
###################
Network N. 3 Charge
N.cys  Prob.SS   Prob.SH
  17   0.182775  0.817209
  86   0.077382  0.922617
 105   0.031596  0.968402
 134   0.041876  0.958124
###################
Network N. 4 Conservation+ Entropy+Hydrophobicity
N.cys  Prob.SS   Prob.SH
  17   0.032125  0.967874
  86   0.095396  0.904606
 105   0.060646  0.939354
 134   0.021265  0.978734
 140   0.554396  0.445600
###################
Network N.
5 Conservation+ Entropy + Charges + Hydrophobicity
N.cys  Prob.SS   Prob.SH
  17   0.056526  0.943477
  86   0.113916  0.886084
 105   0.075339  0.924661
 134   0.037543  0.962456
 140   0.609607  0.390399
###################

JURY AMONG THE DIFFERENT NETWORKS
       Prob.SS   Prob.SH
N.cys  BONDED    NON-BONDED  DISULFIDE
  17   0.078     0.854       NO
  86   0.120     0.781       NO
 105   0.078     0.862       NO
 134   0.030     0.932       NO
 140   0.465     0.269       YES
#==============================================#
________________________________________________________________________________

PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prediction of:
    secondary structure,               by PHDsec
    solvent accessibility,             by PHDacc
    and helical transmembrane regions, by PHDhtm

Author:
    Burkhard Rost
    EMBL, 69012 Heidelberg, Germany
    Internet: Rost@EMBL-Heidelberg.DE

All rights reserved.

The network systems are described in:

PHDsec: B Rost & C Sander: JMB, 1993, 232, 584-599.
        B Rost & C Sander: Proteins, 1994, 19, 55-72.
PHDacc: B Rost & C Sander: Proteins, 1994, 20, 216-226.
PHDhtm: B Rost et al.: Prot. Science, 1995, 4, 521-533.

The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

Publication to reference in reporting results:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rost, Burkhard; Sander, Chris: Prediction of protein structure at
better than 70% accuracy. J. Mol. Biol., 1993, 232, 584-599.

Rost, Burkhard; Sander, Chris: Combining evolutionary information and
neural networks to predict protein secondary structure.
Proteins, 1994, 19, 55-72.
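Looking back at the CYSPRED tables earlier in this report: the output does not say how the jury values are obtained. The printed numbers are consistent with a plain mean over the five networks, where a network that lists no value for a residue (network 3 for Cys 140) contributes zero, and with DISULFIDE = YES whenever the averaged Prob.SS exceeds the averaged Prob.SH. That rule is inferred from the numbers only and is not a documented CYSPRED procedure. The sketch below checks the inference; `NETWORKS` and `jury` are names invented for this example.

```python
# Per-network (Prob.SS, Prob.SH) values copied from the CYSPRED tables in this
# report. Network 3 ("Charge") lists no value for Cys 140.
NETWORKS = [
    {17: (0.063152, 0.936849), 86: (0.172651, 0.827348), 105: (0.139590, 0.860410),
     134: (0.013818, 0.986182), 140: (0.589704, 0.410297)},
    {17: (0.056250, 0.603368), 86: (0.140452, 0.364905), 105: (0.084989, 0.615075),
     134: (0.035703, 0.775042), 140: (0.569967, 0.097433)},
    {17: (0.182775, 0.817209), 86: (0.077382, 0.922617), 105: (0.031596, 0.968402),
     134: (0.041876, 0.958124)},
    {17: (0.032125, 0.967874), 86: (0.095396, 0.904606), 105: (0.060646, 0.939354),
     134: (0.021265, 0.978734), 140: (0.554396, 0.445600)},
    {17: (0.056526, 0.943477), 86: (0.113916, 0.886084), 105: (0.075339, 0.924661),
     134: (0.037543, 0.962456), 140: (0.609607, 0.390399)},
]


def jury(networks):
    """Average Prob.SS and Prob.SH over all networks (assumed rule, see text).

    A residue that a network does not list contributes 0.0 to both sums.
    Returns {residue: (mean Prob.SS, mean Prob.SH, predicted bonded?)}.
    """
    n = len(networks)
    residues = sorted({res for net in networks for res in net})
    table = {}
    for res in residues:
        prob_ss = sum(net.get(res, (0.0, 0.0))[0] for net in networks) / n
        prob_sh = sum(net.get(res, (0.0, 0.0))[1] for net in networks) / n
        table[res] = (round(prob_ss, 3), round(prob_sh, 3), prob_ss > prob_sh)
    return table


if __name__ == "__main__":
    for res, (ss, sh, bonded) in jury(NETWORKS).items():
        print(f"{res:5d}  {ss:.3f}  {sh:.3f}  {'YES' if bonded else 'NO'}")
```

Under these assumptions the sketch reproduces every row of the jury table, including the single predicted disulfide-bonded cysteine at position 140.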
Some statistics:
~~~~~~~~~~~~~~~~

Percentage of amino acids:
+--------------+--------+--------+--------+--------+--------+
| AA:          |    T   |    K   |    V   |    L   |    D   |
| % of AA:     |  10.5  |  10.5  |  10.0  |   8.0  |   7.5  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    A   |    G   |    I   |    S   |    R   |
| % of AA:     |   7.0  |   6.5  |   6.0  |   5.0  |   4.5  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    F   |    P   |    E   |    N   |    M   |
| % of AA:     |   4.5  |   4.0  |   3.5  |   3.0  |   3.0  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    C   |    Y   |    W   |    H   |    Q   |
| % of AA:     |   2.5  |   1.5  |   1.0  |   1.0  |   0.5  |
+--------------+--------+--------+--------+--------+--------+

Percentage of secondary structure predicted:
+--------------+--------+--------+--------+
| SecStr:      |    H   |    E   |    L   |
| % Predicted: |  22.0  |  28.5  |  49.5  |
+--------------+--------+--------+--------+

According to the following classes:
all-alpha:   %H>45 and %E< 5; all-beta : %H<5 and %E>45
alpha-beta : %H>30 and %E>20; mixed:     rest,
this means that the predicted class is: mixed class

PHD output for your protein:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mon Feb 16 11:42:13 2004
Jury on: 10 different architectures (version 5.94_317 ).
Note: differently trained architectures, i.e., different versions can
result in different predictions.

About the protein:
------------------

HEADER
COMPND
SOURCE
AUTHOR
SEQLENGTH 200
NCHAIN 1 chain(s) in query data set
NALIGN 189 (=number of aligned sequences in HSSP file)

Abbreviations:
--------------

secondary structure : H=helical trans-membrane regions, L=rest (loop)
AA:     amino acid sequence
PHD:    Profile network prediction HeiDelberg
Rel:    Reliability index of prediction (0-9)
detail:
prH:    'probability' for assigning helix
prL:    'probability' for assigning loop
note:   the 'probabilities' are scaled to the interval 0-9, i.e.
        prH=5 means that the signal at the first output node is 0.5-0.6.
subset: SUB: a subset of the prediction, for all residues with an expected accuracy > 95% (see tables in header) note 1: for this subset the following symbols are used: L: is loop (for which above " " is used) ".": means that no prediction is made for this residue, as the reliability is Rel < 5 protein: query length 200 ....,....1....,....2....,....3....,....4....,....5....,....6 AA |MLKNTSSNLDAPVARSCDFAMKKMDLRKAFTLIEPGPVTLVTTSAGGTNNVMTISWTMAV| PHD | HHHHHHHH HHHH HHHHHHHH EEEEEE EEEEEEEE | Rel |986722231457764204521132267887522699649998368997762246443321| detail: prH-|001134434677776546654434577887642100000000000001000000000000 prE-|000000000000011100000000000000122100268998610000113567665554 prL-|987755554321112342234565321111234799730001378887775432223345 subset: SUB |LLLL......HHHH....H......HHHHHH..LLLL.EEEE.LLLLLLL...E......| ....,....7....,....8....,....9....,....10...,....11...,....1 AA |DFTPKLAITTGPWNFSYKALTKSRECVIAIPTVDLLDKVVGVGTCSGKDTDKFDTFRLTP| PHD | EEEEEE HHHHHHH EEEEEE HHHHHHHHHHH HHH | Rel |398669999826752356664157178862552358999997443797511221145221| detail: prH-|000000000000124667765420010001125568999987633101244544321111 prE-|301179898842100001111111478875100000000000000001000011112334 prL-|698720000157775321112467410013664321000001366887644333466444 subset: SUB |.LLLEEEEEE.LLL..HHHH..LL.EEEE.LL..HHHHHHHH...LLLL.......L...| 2...,....13...,....14...,....15...,....16...,....17...,....1 AA |IKGKYVEAPLIKECVANIECKVVDIIKKHDIVVLEGVAAYFDTSRKEKRTLHAVGDGTFV| PHD | HHH EEEEEEEEEEE EEEEEEEE EEEEEE EEE| Rel |366468974221133378999866522697369999862022478995354663377488| detail: prH-|111210012344432110000000000000000000001100000001000000000000 prE-|210000001211112578888877654201379999875443310002566775311688 prL-|567678985434455210000112245798510000113455678986322223587311 subset: SUB |.LL.LLLL........EEEEEEEEE..LLL.EEEEEEE.....LLLLL.E.EE..LL.EE| 8...,....19...,....20...,....21...,....22...,....23...,....2 AA |VDGRTLDRKKQMRSKLLGIF| PHD |E EE | Rel 
|71761014334342303429 detail: prH-|00000012222325542110 prE-|84114442221001001230 prL-|15774445455563345559 subset: SUB |E.LL...............L| END ________________________________________________________________________________ Since you did set the keyword "return phd casp2" in the header, here a file will be appended that can be used as input for CASP2 PHD predictions in CASP2 format: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ________________________________________________________________________________ PFRMAT SS TARGET /home/ppuser/server/work/predict_h23854.phdRdb AUTHOR your_name and your_email REMARK Automatic usage of PHDsec REMARK CAFASP METHOD SERVERNAME: PredictProtein/PHD METHOD PROGRAM: PHD secondary structure and solvent accessibility prediction METHOD PARAMETERS: DEFAULT METHOD SERVER URL: http://cubic.bioc.columbia.edu/predictprotein MODEL 1 M C 1.00 L C 0.90 K C 0.70 N C 0.80 T C 0.30 S C 0.30 S C 0.30 N C 0.40 L H 0.20 D H 0.50 A H 0.60 P H 0.80 V H 0.80 A H 0.70 R H 0.50 S H 0.30 C C 0.10 D H 0.50 F H 0.60 A H 0.30 M H 0.20 K C 0.20 K C 0.40 M C 0.30 D H 0.30 L H 0.70 R H 0.80 K H 0.90 A H 0.90 F H 0.80 T H 0.60 L H 0.30 I C 0.30 E C 0.70 P C 1.00 G C 1.00 P C 0.70 V E 0.50 T E 1.00 L E 1.00 V E 1.00 T E 0.90 T E 0.40 S C 0.70 A C 0.90 G C 1.00 G C 1.00 T C 0.80 N C 0.80 N C 0.70 V C 0.30 M E 0.30 T E 0.50 I E 0.70 S E 0.50 W E 0.50 T E 0.40 M E 0.40 A E 0.30 V C 0.20 D C 0.40 F C 1.00 T C 0.90 P C 0.70 K E 0.70 L E 1.00 A E 1.00 I E 1.00 T E 1.00 T E 0.90 G C 0.30 P C 0.70 W C 0.80 N C 0.60 F C 0.30 S H 0.40 Y H 0.60 K H 0.70 A H 0.70 L H 0.70 T H 0.50 K H 0.20 S C 0.60 R C 0.80 E E 0.20 C E 0.80 V E 0.90 I E 0.90 A E 0.70 I E 0.30 P C 0.60 T C 0.60 V H 0.30 D H 0.40 L H 0.60 L H 0.90 D H 1.00 K H 1.00 V H 1.00 V H 1.00 G H 1.00 V H 0.80 G H 0.50 T C 0.50 C C 0.40 S C 0.80 G C 1.00 K C 0.80 D C 0.60 T C 0.20 D C 0.20 K H 0.30 F H 0.30 D H 0.20 T C 0.20 F C 0.50 R C 0.60 L C 0.30 T C 0.30 P C 0.20 I C 0.40 K C 0.70 G C 0.70 K C 0.50 Y C 0.70 V C 0.90 E C 
1.00 A C 0.80 P C 0.50 L C 0.30 I H 0.30 K H 0.20 E H 0.20 C C 0.40 V C 0.40 A E 0.40 N E 0.80 I E 0.90 E E 1.00 C E 1.00 K E 1.00 V E 0.90 V E 0.70 D E 0.70 I E 0.60 I E 0.30 K C 0.30 K C 0.70 H C 1.00 D C 0.80 I C 0.40 V E 0.70 V E 1.00 L E 1.00 E E 1.00 G E 1.00 V E 0.90 A E 0.70 A E 0.30 Y C 0.10 F C 0.30 D C 0.30 T C 0.50 S C 0.80 R C 0.90 K C 1.00 E C 1.00 K C 0.60 R E 0.40 T E 0.60 L E 0.50 H E 0.70 A E 0.70 V E 0.40 G C 0.40 D C 0.80 G C 0.80 T E 0.50 F E 0.90 V E 0.90 V E 0.80 D C 0.20 G C 0.80 R C 0.70 T E 0.20 L E 0.10 D C 0.20 R C 0.50 K C 0.40 K C 0.40 Q C 0.50 M C 0.40 R C 0.50 S C 0.30 K C 0.40 L C 0.10 L C 0.40 G C 0.50 I C 0.30 F C 1.00 END Result of ASP prediction(Malin Young, Kent Kirshenbaum, Stefan Highsmith) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Kirshenbaum K, Young M and Highsmith S. Prot. Sci.(1999) 8:1806-1815. Young M, Kirshenbaum K, Dill KA and Highsmith S. Prot. Sci.(1999) 8:1752-1764. ________________________________________________________________________________ Ambivalent Sequence Predictor (ASP v1.0) mmy ________________________________________________________________________________ The resulting network (PROF) prediction is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ________________________________________________________________________________ # # # ==================================================================== # # PROF predictions for predict_h23854 # # ==================================================================== # # # # -------------------------------------------------------------------- # # SYNOPSIS of prediction # # -------------------------------------------------------------------- # # # # PROFsec summary # # # # overall your protein can be classified as: # # # # >>> mixed <<< # # # # given the following classes: # # 'all-alpha': %H > 45% AND %E < 5% # # 'all-beta': %H < 5% AND %E > 45% # # 'alpha-beta': %H > 30% AND %E > 20% # # 'mixed': all others # # # # Predicted secondary structure composition: # 
# +-----------------+-------+-------+-------+ # # | sec str type | H | E | L | # # | % in protein | 16.0 | 31.5 | 52.5 | # # +-----------------+-------+-------+-------+ # # # # # # Predicted solvent accessibility composition (core/surface ratio): # # Classes used: # # e: residues exposed with more than 16% of their surface # # b: all other residues. # # +--------------+-------+-------+ # # | acc type | b | e | # # | % in protein | 41.0 | 59.0 | # # +--------------+-------+-------+ # # # # -------------------------------------------------------------------- # # HEADER information # # -------------------------------------------------------------------- # # # # ................... # # About your protein: # # ................... # # # # prot_id : query # # prot_nres : 200 # # prot_nali : 189 # # prot_nchn : 1 # # prot_nfar : 189 # # # # ......................... # # About the alignment used: # # ......................... # # # # ali_orig : /home/ppuser/server/work/predict_h23854.hsspPsiFil # # # # ..................................... # # Residue composition for your protein: # # ..................................... # # # # +-----------------+-------+-------+-------+-------+-------+ # # | amino acid type | A | C | D | E | F | # # | % in protein | 7.0 | 2.5 | 7.5 | 3.5 | 4.5 | # # +-----------------+-------+-------+-------+-------+-------+ # # | amino acid type | G | H | I | K | L | # # | % in protein | 6.5 | 1.0 | 6.0 | 10.5 | 8.0 | # # +-----------------+-------+-------+-------+-------+-------+ # # | amino acid type | M | N | P | Q | R | # # | % in protein | 3.0 | 3.0 | 4.0 | 0.5 | 4.5 | # # +-----------------+-------+-------+-------+-------+-------+ # # | amino acid type | S | T | V | W | Y | # # | % in protein | 5.0 | 10.5 | 10.0 | 1.0 | 1.5 | # # +-----------------+-------+-------+-------+-------+-------+ # # # # ............................. # # About the PROF methods used: # # ............................. 
#
#                                                                      #
# prof_fpar : acc=/home/ppuser/server/pub/prof/net/PROFboth_best.par   #
# prof_nnet : acc=6                                                    #
#                                                                      #
# ....................                                                 #
# Copyright & Contact:                                                 #
# ....................                                                 #
#                                                                      #
# -> Copyright: Burkhard Rost, CUBIC NYC / LION Heidelberg             #
# -> Email: rost@columbia.edu                                          #
# -> WWW: http://cubic.bioc.columbia.edu                               #
# -> Fax: +1-212-305 3773                                              #
#                                                                      #
# .............                                                        #
# Please quote:                                                        #
# .............                                                        #
#                                                                      #
# -> PROF: B Rost (1996) Methods in Enzymology, 266:525-539            #
# -> PROFsec: B Rost & C Sander (1993) J Mol Biol, 232:584-599         #
# -> PROFacc: B Rost & C Sander (1994) Proteins, 20:216-226            #
#                                                                      #
#                                                                      #
# -------------------------------------------------------------------- #
# ABBREVIATIONS used:                                                  #
# -------------------------------------------------------------------- #
#                                                                      #
# AA       : amino acid sequence                                       #
# OBS_sec  : observed secondary structure: H=helix, E=extended         #
#            (sheet), blank=other (loop)                               #
# PROF_sec : PROF predicted secondary structure: H=helix, E=extended   #
#            (sheet), blank=other (loop)                               #
#            PROF = PROF: Profile network prediction HeiDelberg        #
# Rel_sec  : reliability index for PROFsec prediction (0=low           #
#            to 9=high)                                                #
#            Note: for the brief presentation strong predictions       #
#            marked by '*'                                             #
# SUB_sec  : subset of the PROFsec prediction, for all residues        #
#            with an expected average accuracy > 82% (tables           #
#            in header)                                                #
#            NOTE: for this subset the following symbols are used:     #
#            L: is loop (for which above ' ' is used)                  #
#            .: means that no prediction is made for this              #
#            residue, as the reliability is: Rel < 5                   #
# pH_sec   : 'probability' for assigning helix (1=high, 0=low)         #
# pE_sec   : 'probability' for assigning strand (1=high, 0=low)        #
# pL_sec   : 'probability' for assigning neither helix, nor            #
#            strand (1=high, 0=low)                                    #
# O_2_acc  : observed relative solvent accessibility (acc)             #
#            in 2 states: b = 0-16%, e = 16-100%.                      #
# P_2_acc  : PROF predicted relative solvent accessibility             #
#            (acc) in 2 states: b = 0-16%, e = 16-100%.
#
# O_3_acc  : observed relative solvent accessibility (acc)             #
#            in 3 states: b = 0-9%, i = 9-36%, e = 36-100%.            #
# P_3_acc  : PROF predicted relative solvent accessibility             #
#            (acc) in 3 states: b = 0-9%, i = 9-36%, e = 36-100%.      #
# OBS_acc  : observed relative solvent accessibility (acc)             #
#            in 10 states: a value of n (=0-9) corresponds             #
#            to a relative acc. of between n*n % and (n+1)*(n+1)       #
#            % (e.g. for n=4: 16-25%).                                 #
# PROF_acc : PROF predicted relative solvent accessibility             #
#            (acc) in 10 states: a value of n (=0-9) corresponds       #
#            to a relative acc. of between n*n % and (n+1)*(n+1)       #
#            % (e.g. for n=4: 16-25%).                                 #
# Rel_acc  : reliability index for PROFacc prediction (0=low           #
#            to 9=high)                                                #
#            Note: for the brief presentation strong predictions       #
#            marked by '*'                                             #
# SUB_acc  : subset of the PROFacc prediction, for all residues        #
#            with an expected average correlation > 0.69 (tables       #
#            in header)                                                #
#            NOTE: for this subset the following symbols are used:     #
#            I: is intermediate (for which above ' ' is used)          #
#            .: means that no prediction is made for this              #
#            residue, as the reliability is: Rel < 4                   #
#                                                                      #
# prot_id   : identifier of protein [w]                                #
# prot_nres : number of residues [d]                                   #
# prot_nali : number of proteins aligned in family [d]                 #
# prot_nchn : number of chains (if PDB protein) [d]                    #
# prot_nfar : number of distant relatives [d]                          #
# ali_orig  : input file                                               #
# prof_fpar : name of parameter file, used [w]                         #
# prof_nnet : number of networks used for prediction [d]               #
# prof_skip : note: sequence stretches with less than 9 are            #
#             not predicted, the symbol '*' is used!
# # # # ==================================================================== # # PROF_BODY with predictions for predict_h23854 # # ==================================================================== # # # # --------------------- # PROF results (normal) # --------------------- # ....,....1....,....2....,....3....,....4....,....5....,....6 AA |MLKNTSSNLDAPVARSCDFAMKKMDLRKAFTLIEPGPVTLVTTSAGGTNNVMTISWTMAV| OBS_sec | | PROF_sec | HHHHHHH HHHHHHH EEEEEE EEEEEEEEE | Rel_sec |965553443402331000113424533533212688648886156775311355444322| subset: SUB_sec |LLLLL...................L..H.....LLLL.EEEE.LLLLL....EE......| 3st: O_3_acc |bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb| P_3_acc |eieeieeebeebbeibbeieieeiebeebieibieebbbbbbbieeeeeibbbbbbbbib| Rel_acc |612201323330122303110452205342322122145986220210112334321020| subset: SUB_acc |e....................ee...e.b........bbbbb...........b......| ....,....7....,....8....,....9....,....10.1.,....11.1.,....12.1 AA |DFTPKLAITTGPWNFSYKALTKSRECVIAIPTVDLLDKVVGVGTCSGKDTDKFDTFRLTP| OBS_sec | | PROF_sec | EEEEEE HHHHH EEEEE HHHHHHHHHH | Rel_sec |565416787345442212332058078873252267788763256787633000011200| subset: SUB_sec |LLL..EEEE..L..........LL.EEEE..L..HHHHHHH..LLLLLL...........| 3st: O_3_acc |bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb| P_3_acc |eiebibbbbbieeeibieibeeeiibbbbbbiiebbeibiebbeieeeeieebeeiebei| Rel_acc |211126485311022015262311247946222402437130131016403331323311| subset: SUB_acc |.....bbbb........e.b.....bbbbb...e..e.b........ee...........| ....,....13.1.,....14.1.,....15.1.,....16.1.,....17.1.,....18.1 AA |IKGKYVEAPLIKECVANIECKVVDIIKKHDIVVLEGVAAYFDTSRKEKRTLHAVGDGTFV| OBS_sec | | PROF_sec | HHH EEEEEEEEEEEE EEEEEEEEEEE EEEEEE EEE| Rel_sec |136445653011110356757764410476168888765320467651355630244167| subset: SUB_sec |..L..LLL........EEEEEEE.....LL.EEEEEEEE....LLLL..EEE......EE| 3st: O_3_acc |bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb| P_3_acc 
|ieieeieibibeebbbibibibiiibeeiebbbbbbbbiiieieeeeiibbibieeiebi|
Rel_acc |230421200041110335174332312302116806300301012432222100200313|
subset: SUB_acc |...e......b......b.bi...........bb.b.........e..............|

....,....19.1.,....20.1
AA |VDGRTLDRKKQMRSKLLGIF|
OBS_sec | |
PROF_sec |EE EEE |
Rel_sec |62426401213340003668|
subset: SUB_sec |E...E............LLL|
3st: O_3_acc |bbbbbbbbbbbbbbbbbbbb|
P_3_acc |bibiebeieeeiieebieie|
Rel_acc |32123112021042210406|
subset: SUB_acc |............i....e.e|
# --------------------------------------------------------------------
________________________________________________________________________________

The resulting prediction of globularity is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

---
--- GLOBE: prediction of protein globularity
---
--- nexp = 118 (number of predicted exposed residues)
--- nfit = 90 (number of expected exposed residues)
--- diff = 28.00 (difference nexp-nfit)
--- =====> your protein appears as compact, as a globular domain
---
---
--- GLOBE: further explanations preliminarily in:
--- http://www.columbia.edu/~rost/Papers/98globe.html
---
--- END of GLOBE
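
Two of the summary values in this report follow from simple rules that the output itself states: the structural class assigned from the predicted %H and %E (PHD and PROF summaries), and the GLOBE difference nexp - nfit. The sketch below only restates those printed rules in Python so that the numbers can be checked. The function names are invented for this example, and the criterion GLOBE uses to turn `diff` into the "compact" verdict is not given in the output, so it is not modelled here.

```python
def structure_class(pct_h, pct_e):
    """Classify by predicted helix (%H) and strand (%E) content.

    Thresholds are the ones printed in this report:
      all-alpha : %H > 45 and %E < 5
      all-beta  : %H < 5  and %E > 45
      alpha-beta: %H > 30 and %E > 20
      mixed     : everything else
    """
    if pct_h > 45 and pct_e < 5:
        return "all-alpha"
    if pct_h < 5 and pct_e > 45:
        return "all-beta"
    if pct_h > 30 and pct_e > 20:
        return "alpha-beta"
    return "mixed"


def globe_diff(nexp, nfit):
    """Difference between predicted and expected numbers of exposed residues."""
    return nexp - nfit


if __name__ == "__main__":
    print("PHD  class:", structure_class(22.0, 28.5))   # values from the PHD summary
    print("PROF class:", structure_class(16.0, 31.5))   # values from the PROF summary
    print("GLOBE diff:", globe_diff(118, 90))           # nexp and nfit as reported
```

Both sets of percentages fall into the "mixed" class, and the difference evaluates to 28, in agreement with the values printed by the server.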