ExPASy Home page Site Map Search ExPASy Contact us Swiss-Prot Proteomics tools
Hosted by NCSC US. Mirror sites: Canada, China, Korea, Switzerland, Taiwan

The PeptIdent Tool


I. Function


PeptIdent is a tool that allows the identification of proteins using peptide mass fingerprinting data.

II. Description

Peptide mass fingerprinting involves the digestion of an unknown protein with a proteinase of known cleavage specificity and the measurement of the resulting peptides by mass spectrometry.
Those experimentally measured peptide masses are then compared with the theoretical peptides calculated for all proteins in a protein sequence database.

The PeptIdent tool calculates the theoretical peptides of all proteins in the Swiss-Prot/TrEMBL databases by "cutting" them with the enzyme of choice, and calculating the theoretical masses of generated fragments. These peptides and their masses are stored in a precomputed index. PeptIdent matches the masses of your experimentally observed peptides with all peptide masses in this index. Best-matching database proteins are ranked by the number of hits they have with the observed experimental peptides.
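
The matching and ranking step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not PeptIdent's actual implementation: the index layout (a dictionary from accession to a list of theoretical peptide masses), the function name rank_by_hits, the fixed tolerance and the toy masses are all hypothetical.

```python
def rank_by_hits(observed, index, tol_da=0.5):
    """Rank proteins by how many observed masses match a theoretical peptide.

    index is a hypothetical precomputed index: {accession: [peptide masses]}.
    """
    ranking = []
    for accession, theoretical in index.items():
        # An observed mass counts as one hit if any theoretical peptide
        # of this protein lies within the tolerance.
        hits = sum(
            1 for m in observed
            if any(abs(m - t) <= tol_da for t in theoretical)
        )
        ranking.append((accession, hits))
    # Most hits first; ties broken by accession for a stable order.
    ranking.sort(key=lambda item: (-item[1], item[0]))
    return ranking


toy_index = {  # invented masses, for illustration only
    "PROT_A": [833.3, 872.4, 904.4, 1200.6],
    "PROT_B": [833.3, 950.1],
}
ranked = rank_by_hits([833.319, 872.402, 904.366], toy_index)
```

In this toy example PROT_A matches all three observed masses and PROT_B only one, so PROT_A is ranked first.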

Isoelectric point, molecular weight and a species (or group of species) can be specified in order to restrict the number of candidate proteins and reduce false positive matches.

PeptIdent makes extensive use of the annotations in Swiss-Prot/TrEMBL and, unlike other peptide mass fingerprinting identification programs, takes into account post-translational modifications as documented in Swiss-Prot. PeptIdent therefore removes signal sequences and/or propeptides, as documented in the Swiss-Prot feature table (FT lines), before computing pI, Mw and peptide masses for each of the resulting mature forms. The program returns not only a list of likely protein identifications, but also any hits with peptides that are known to carry one of more than 20 different types of discrete post-translational modifications. The program thus offers a degree of protein characterization as part of the identification procedure.
The mass effects of several chemical protein modifications, such as oxidation of methionine or acrylamide adducts of cysteine residues, or desired alkylation products of cysteine residues can also be considered by the program.


DOC Note! The PeptIdent tool does not consider any protein glycosylation apart from O-GlcNAc and C-mannosylation on tryptophan. N-linked and larger O-linked sugar structures are generally of unpredictable mass.

DOC Note! PeptIdent does not do any de novo prediction of post-translational modifications on proteins. All modified peptides shown in the results will be the verification of an event documented in Swiss-Prot. However, PeptIdent can match peptides whose modifications are documented in Swiss-Prot as «potential» or «by similarity», and thus allows predicted post-translational modifications to be validated. (See the document Swiss-Prot annotation: how is biochemical information assigned to sequence entries for the use of the terms «potential», «by similarity» or «probable» in the annotations.)
The FindMod, GlycoMod and FindPept tools can be used after protein identification with PeptIdent.

PeptIdent results are displayed on-line or can be sent to you by e-mail, in the form of an HTML table. The result file contains direct links to FindMod, GlycoMod and FindPept, which further characterize matching proteins by predicting potential post-translational modifications and finding potential single amino acid substitutions, as well as links to PeptideMass and to BioGraph for the graphical representation of the theoretical spectrum.




III. How to use the tool


1. Name of the unknown protein
In the « Name of the unknown protein:» text box, supply a name or a code number for the query protein. If you submit several queries, give each unknown protein a different name. This is helpful if you want to archive your query and identify it later.

2. Select database
Under «Database:», select the database(s) to use for the search. You have the choice of:
Consult the Swiss-Prot/TrEMBL page, the SPTR README file or the Swiss-Prot/TrEMBL references for more information about these databases.
When searching Swiss-Prot, all alternative splice isoforms described in Swiss-Prot feature tables are included in the search (e.g. Isoform 12S of O43184). For splice isoforms, no processing or post-translational modifications documented in the Swiss-Prot feature table are taken into account, as they are usually documented for the primary isoform described in the original entry.

DOC Note! Peptides with masses >6000 Dalton are not indexed and therefore not considered in the search.

DOC Note! Annotation in the TrEMBL database is done automatically; it is therefore incomplete and not always correct. Where available, TrEMBL annotation is used in the same way as Swiss-Prot annotation to process the proteins into mature chains or peptides. TrEMBL results should therefore be interpreted with care.
DOC Note! Some Swiss-Prot/TrEMBL entries contain ambiguous residues (X = any amino acid, B = Asx = Asp (D) or Asn (N), Z = Glx = Gln (Q) or Glu (E)). Examples of such entries are P19341 and O77721. As substitution of D by N or of Q by E induces mass differences of about 1 Dalton, it is not possible to compute exact masses for peptides containing one or more residues B, Z or X. Those peptides are therefore not included in the index.


3. pI and pI range
In the « pI:» box, specify the pI of the protein of interest, if known. This should be estimated from a 2-D gel. You can also specify the confidence you have in your pI estimation by selecting the appropriate number under « pI range: ».
If no number is specified, an open-ended pI range starting at 0 is assumed. This covers all pI values and returns all proteins within the specified Mw region, regardless of their pI.


DOC Note! For bacterial proteins separated in IPG gradients, a range of ± 0.25 around the estimate is usually sufficient. For eukaryotic proteins, increase this range to ± 0.5 units if the proteins are thought to be unmodified. If there is a high probability that the eukaryotic protein carries charge-modifying modifications, such as sialic acid, the range should be changed to ± 1.

DOC Note! If you only have a vague idea of the protein pI, use a very large range. Even using a pI with a large range can increase the power of your search.


DOC Note! pI cannot be determined for fragments, i.e. proteins for which the sequence information in Swiss-Prot/TrEMBL is incomplete. If you specify a pI estimate for your query, the result will include all fragments from the database which correspond to your other search criteria (cf. the corresponding note for Mw of fragments.)



4. Molecular weight and Molecular weight range
In the « Mw» box you should specify the estimated mass of your protein, in Dalton. This can be estimated from a 2-D gel, or from mass spectrometry (MS) of the entire protein. You should also specify the confidence that you have in your Mw estimate, in percent terms, by selecting the appropriate number in the « within Mw range (in percent): » box. This allows you to limit the search to proteins within the specified molecular weight range. If no number is specified, an open-ended molecular weight range starting at 0 is assumed, which means that the program searches the whole database, including proteins of more than 2'000'000 Dalton (human Titin, heart isoform, Q10466).
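
As a worked illustration of the percent range, the following Python sketch converts an Mw estimate and a percentage into a search window. It assumes a symmetric window with inclusive bounds; the function names mw_window and within_window are hypothetical and do not come from PeptIdent.

```python
def mw_window(mw_estimate, percent):
    """Return (low, high) bounds in Dalton for an estimate within ±percent."""
    delta = mw_estimate * percent / 100.0
    return mw_estimate - delta, mw_estimate + delta


def within_window(theoretical_mw, mw_estimate, percent):
    """True if a theoretical Mw falls inside the window (bounds inclusive)."""
    low, high = mw_window(mw_estimate, percent)
    return low <= theoretical_mw <= high


# A 30'000 Dalton estimate with a 20% range covers 24'000 to 36'000 Dalton.
low, high = mw_window(30000.0, 20)
```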



DOC Note! For bacterial proteins larger than 20 kDa, a range of ± 20% around the Mw estimate is usually sufficient. For small proteins, allow a ± 40% range. For cytoplasmic eukaryotic proteins these ranges are also usually sufficient, but secreted eukaryotic proteins often carry post-translational modifications that require ranges of ± 40% and ± 100% or more, respectively, to be inclusive. If masses have been determined with MS, the ranges used can be much smaller. However, note that if MS has been used to determine the size of a glycoprotein or other heavily modified protein, the measured mass may be considerably larger than the mass of the unmodified polypeptide predicted in the database.


DOC Note! If you only have a vague idea of the protein Mw, you can use a large range. However, as the proteins are ranked by the number of matching peptide masses, very large proteins are likely to obtain a high score and appear at the top of the list. Eliminating proteins with high molecular weight can reduce random matches. Whenever you have an idea about the Mw range, it is highly recommended to use this information in the identification to speed up searches and to reduce «false positives».

DOC Note! Mw cannot be determined for fragments, i.e. proteins for which the sequence information in Swiss-Prot/TrEMBL is incomplete. For these proteins, we consider that the Mw of the complete protein is greater than or equal to the molecular weight of the known sequence portion. Hence, if you specify a Mw estimate for your query, the result will include all fragments from the database except those for which the mass of the known sequence portion exceeds the upper boundary of your Mw range.






5. Species
From the pull-down menu « Species to be searched: », select a species or taxonomic range to limit the search to proteins from the specified organism(s). In this case the peptide mass data is matched only against proteins from the specified organism(s), thus eliminating many irrelevant proteins from unrelated organisms.

To match your peptides against peptides from all species in the database, select "ALL". This option is not recommended without good reason, as it unnecessarily increases the search space and causes a significant number of unrelated false positive matches to appear.

DOC Note! We speak of "single species matching" when, for example, you have proteins from E. coli and match them only against the E. coli proteins in the database. This is a good approach when the organism you are working with is molecularly well defined or, ideally, the subject of a genome project.
DOC Note! If the source of your proteins is not molecularly well defined, it is best to do "cross-species matching". Thus, for example, if you are working with proteins from Candida albicans, you may wish to either match your proteins against all proteins from fungi or against the fully sequenced yeast Saccharomyces cerevisiae.
DOC Note that when cross-species matching, protein pI is frequently poorly conserved [ref], but protein mass is generally very well conserved. You should take this into consideration when setting your pI and Mw ranges.
DOC Note! Peptide masses are not well conserved across species boundaries. The poor conservation of peptide mass data is expected, as a single amino acid substitution in any peptide can drastically change its mass [1].
DOC Note! Apart from a number of model organisms (e.g. human, bovine, rat, mouse, E. coli, etc.), the pull-down menu also contains groups of species. This is useful if working on, for example, cats, as one can match against all proteins in the database described for mammals. If you are in doubt about the taxonomic classification, you can consult the NEWT Taxonomy Browser.

DOC Note! Proteins that are 100% conserved between different species are merged into a single Swiss-Prot entry, e.g. UBIQ_HUMAN, CALM_HUMAN. In such entries, each source organism is listed in the OS (Organism Species) lines, e.g. actin, P03996:
OS   Homo sapiens (Human), Mus musculus (Mouse), Rattus norvegicus (Rat),
OS   Bos taurus (Bovine), and Oryctolagus cuniculus (Rabbit).

However, the OC (Organism Classification) lines will only contain the taxonomy of the first listed species:
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
In such cases, a PeptIdent query with RODENTIA will miss the P03996 actin entry which describes the mouse actin sequence, but contains the organism classification for Homo sapiens.

6. Peptide mass data of the unknown protein

Enter a list of peptide masses
Enter the experimentally measured peptide masses generated from the unknown protein in the « Enter a list of peptide masses... » text field, and separate them by spaces, tabs or new lines.


DOC Note! You can copy a list of peptides from Excel or other applications and paste them directly into the text field.
DOC Note! Avoid using peptide masses known to be from autodigestion of an enzyme (e.g. trypsin!), or other artefactual peaks (e.g. matrix peaks).


Upload a .pkm, .dta or text file
If the peptide mass fingerprinting data is stored in a file of one of the formats listed below, you can also upload the file directly from your computer:
(1) click on the « Browse...» button,
(2) select the file containing the relevant peptide mass data, and
(3) click on the « Open» button.
The peptide masses will then be extracted automatically from this file.



Supported formats:
.pkm format, produced by the Voyager software of PerSeptive Biosystems or the GRAMS software
Example:
OP=0
Center X Peak Y Left X Right X Time X Mass Difference Name
STD.Misc Height Left Y Right Y %Height,Width,%Area,%Quan,H/A
833.319 2189 833.260 833.378 0.016 0 0
C 0.? 0 762 762
854.843 5078 854.769 854.917 0.001 0 0
C 0.? 0 3453 3453
863.419 5108 863.064 863.775 0.001 0 0
C 0.? 0 3567 3567
872.402 12519 872.347 872.456 0.002 0 0
C 0.? 0 11417 11417
874.395 6730 874.331 874.460 0.002 0 0
C 0.? 0 3559 3559
887.786 5903 887.540 888.031 0.003 0 0
C 0.? 0 4131 4131
898.475 3329 898.416 898.534 0.006 0 0
C 0.? 0 1377 1377
904.366 7432 904.199 904.533 0.001 0 0
C 0.? 0 5596 5596
955.300 2598 955.229 955.371 0.011 0 0
C 0.? 0 1089 1089
973.845 16689 973.749 973.941 0.001 0 0


All lines before the line containing ‘H/A’ are ignored. After that, only lines which do not contain any capital letters in the first 20 characters are retained. From the retained lines, the first column is interpreted as the mass value, and the second column (if present) as the peak intensity. The intensity is only important if the BioGraph tool is used subsequently to the identification.
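
The extraction rules above can be sketched as a small Python parser. This is an illustration of the stated rules only, not the code used by PeptIdent; the function name parse_pkm is hypothetical, and skipping blank lines and the 'H/A' line itself are assumptions.

```python
def parse_pkm(text):
    """Extract (mass, intensity) pairs from .pkm-style text."""
    peaks = []
    header_seen = False
    for line in text.splitlines():
        if not header_seen:
            # Everything up to and including the 'H/A' line is ignored.
            if "H/A" in line:
                header_seen = True
            continue
        if not line.strip():
            continue  # assumption: blank lines carry no data
        # Drop lines with a capital letter in the first 20 characters.
        if any(ch.isupper() for ch in line[:20]):
            continue
        columns = line.split()
        mass = float(columns[0])
        intensity = float(columns[1]) if len(columns) > 1 else None
        peaks.append((mass, intensity))
    return peaks


sample = """OP=0
Center X Peak Y Left X Right X Time X Mass Difference Name
STD.Misc Height Left Y Right Y %Height,Width,%Area,%Quan,H/A
833.319 2189 833.260 833.378 0.016 0 0
C 0.? 0 762 762
854.843 5078 854.769 854.917 0.001 0 0
C 0.? 0 3453 3453
"""
peaks = parse_pkm(sample)
```

For the sample above, only the two lines starting with a mass value are retained; the lines starting with "C" are dropped by the capital-letter rule.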

Sequest format
Example:
1.00 1
833.319 2189
844.333 0.0
854.843 5078
863.419 5108
872.402 12519
874.395 6730
887.786 5903
898.475 3329
899.555 0.0
904.366 7432
955.300 2598
973.845 16689

The first line is considered as a comment and is ignored. All subsequent lines are interpreted to contain a mass and an intensity (if any), and mass values are taken into account if the corresponding intensity is > 0.
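
A minimal Python sketch of these rules follows. The function name parse_sequest is hypothetical, and keeping masses that have no intensity column is an assumption, since the text only says that an intensity is read "if any".

```python
def parse_sequest(text):
    """Extract (mass, intensity) pairs from Sequest-style text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    peaks = []
    for line in lines[1:]:  # the first line is treated as a comment
        columns = line.split()
        mass = float(columns[0])
        if len(columns) > 1:
            intensity = float(columns[1])
            if intensity <= 0:
                continue  # masses with zero intensity are not used
        else:
            intensity = None  # assumption: keep masses without intensity
        peaks.append((mass, intensity))
    return peaks


sample = """1.00 1
833.319 2189
844.333 0.0
854.843 5078
"""
peaks = parse_sequest(sample)
```

Here the mass 844.333 is discarded because its intensity is 0.0.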

Any user-created files can be uploaded if they correspond to the following rules:
DOC Note! The upload option only works if you see a 'browse' button next to the text entry field. This should be the case for most recent web browser versions, e.g. Netscape 3.0 or higher, MS Internet Explorer 4.0 or higher.


7. Charge state (ion mode)
You can enter the masses of your peptides as [M] (molecular mass data), [M-H]- (negative ion mode, deprotonated molecular ions) or [M+H]+ (positive ion mode, protonated molecular ions); in each case, you must select the corresponding button.
If you select the [M+H]+ button, all peptide masses calculated from the database will have one proton (mass of 1 unit) added before matching with user-specified peptides, thus giving values for singly charged peptides as found in electrospray and MALDI-TOF mass spectrometers.
With [M-H]- selected, all theoretical peptide masses will have one proton (one mass unit) removed before matching.
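
The adjustment of theoretical masses for the three ion modes can be sketched in Python as follows. The text describes the proton as one mass unit; the sketch uses the more precise value of about 1.00728 Da, which is an assumption about what an implementation would use, and the function name theoretical_ion_mass is hypothetical.

```python
PROTON_MASS = 1.00728  # Da, approximate mass of one proton


def theoretical_ion_mass(neutral_mass, ion_mode):
    """Shift a neutral theoretical peptide mass to the selected ion mode."""
    if ion_mode == "[M]":
        return neutral_mass
    if ion_mode == "[M+H]+":
        return neutral_mass + PROTON_MASS  # protonated, positive ion mode
    if ion_mode == "[M-H]-":
        return neutral_mass - PROTON_MASS  # deprotonated, negative ion mode
    raise ValueError(f"unknown ion mode: {ion_mode!r}")
```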


8. Mass type, isotopic resolution
Specify whether the experimental mass values are « average» values or « monoisotopic» values by ticking the appropriate box. The theoretical masses of the peptides in the database will be calculated accordingly.



9. Modification of cysteines
If the unknown protein has been reduced and alkylated, you should specify the reagent used for the alkylation under the " with cysteines treated with: " menu. This can be either iodoacetamide (forms carboxyamidomethyl cysteine, Cys_CAM), iodoacetic acid (forms carboxymethyl cysteine, Cys_CM) or 4-vinylpyridine (forms pyridyl-ethyl cysteine, Cys_PE).

If the protein has not been treated in this manner, you should select the option " nothing (in reduced form) " (default setting).



DOC Note! Proteins are usually subject to reduction and alkylation before they are used to generate peptides. This ensures that all disulfide bonds in a protein are broken.


DOC Note! The program will modify the theoretical masses of Cys-containing peptides accordingly, before matching with the experimental peptide masses.


10. Acrylamide adducts on cysteines
It is common for proteins separated by polyacrylamide gel electrophoresis that reduced cysteines react with free acrylamide monomers. PeptIdent can therefore be used with the option «acrylamide adducts on cysteines». With this option selected, the program considers all cysteine residues in a peptide as potentially modified and forming propionamide cysteine, Cys_PAM.

11. Oxidized methionines
You can request that all methionines in theoretical peptides be oxidised to form methionine sulfoxide (MSO). If this option is selected, the program will modify the theoretical masses of all Met-containing peptides accordingly, before matching with user-specified peptides.

DOC Note! Proteins separated by gel electrophoresis often show this modification.


General note about artefactual cysteine and methionine modifications :
If more than one cysteine or methionine residue is present in a peptide, the masses for every possible number of modified residues will be calculated. For example, if there are three methionine residues in a peptide, the masses of peptides having zero, one, two or three oxidized methionines will be calculated. The program can also account for post-translational modifications in conjunction with artefactual modifications in a peptide. This can be very useful; however, one should be aware that computing all combinations of possible artefactual or post-translational modifications adds a considerable amount of «noise» to the database of peptide fragments. If you think your peptides are not likely to carry artefactually modified cysteines or methionines, it is recommended not to select any of these modifications, in order not to artificially increase the search space.
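
The methionine example can be illustrated with a short Python sketch. It assumes monoisotopic masses, for which one oxidation adds the mass of one oxygen atom (about 15.9949 Da); the function name oxidation_variants, the peptide and its unmodified mass are hypothetical.

```python
OXIDATION = 15.9949  # Da, monoisotopic mass of one oxygen atom


def oxidation_variants(peptide, unmodified_mass):
    """Return (number of oxidized Met, mass) for 0..n oxidized methionines."""
    n_met = peptide.count("M")
    return [(k, unmodified_mass + k * OXIDATION) for k in range(n_met + 1)]


# A made-up peptide with three methionines and an invented mass of 600.0 Da.
variants = oxidation_variants("MAMKM", 600.0)
```

A peptide with three methionines yields four candidate masses instead of one, which shows how quickly such options enlarge the search space.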

12. Mass tolerance
In the « Mass tolerance: ±... » text box, enter the mass tolerance to be used around your peptides during matching. Select whether the mass tolerance is expressed as an absolute value (in Dalton) or as a relative value (in parts per million; ppm).


DOC Note! If you have a peptide mass of 934.3 Da and specify a tolerance of ± 0.5 Da, a hit will be registered with a protein if one of its peptides in the database has a mass between 933.8 Da and 934.8 Da.

DOC Note! If you have a peptide mass of 934.3 Da and specify a tolerance of ± 100 ppm, a hit will be registered with a protein if one of its peptides in the database has a mass between 934.207 Da and 934.393 Da.
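
The two examples above can be reproduced with a short Python sketch. The function name tolerance_window and the unit strings "Da" and "ppm" are hypothetical; only the arithmetic follows the notes.

```python
def tolerance_window(mass, tolerance, unit):
    """Return the (low, high) matching window around an experimental mass."""
    if unit == "Da":
        delta = tolerance  # absolute tolerance
    elif unit == "ppm":
        delta = mass * tolerance / 1_000_000  # relative tolerance
    else:
        raise ValueError(f"unknown tolerance unit: {unit!r}")
    return mass - delta, mass + delta


da_low, da_high = tolerance_window(934.3, 0.5, "Da")
ppm_low, ppm_high = tolerance_window(934.3, 100, "ppm")
```

For a mass of 934.3 Da, 100 ppm corresponds to about 0.093 Da, so the ppm window is roughly five times narrower than the ± 0.5 Da window.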

DOC Note! The mass tolerance should reflect the known accuracy of your mass spectrometer (MS). Both MALDI and ES machines are now capable of achieving mass resolution to a single decimal place; however, this may depend on the care that has been taken in machine calibration and the use of internal standards. We recommend using a tolerance of 0.2 Da or 200 ppm, or better, whenever possible.
ESI-TOF mass spectrometers or MALDI-TOF instruments equipped with delayed extraction and ion reflectors are ideal for this, since most can deliver monoisotopic masses within ± 40 ppm when two-point internal calibration is used.
Less accurate peptide mass data will require a larger mass tolerance and will result in a lower accuracy of your search.


DOC Note! Mass spectrometers typically have a mass dependent error associated with mass measurements, which cannot be uniformly expressed in Dalton. The use of ppm can therefore be more accurate.


13. Cleavage agent
Under «Enzyme:» you can specify the enzyme that you used to generate your peptides. See here for the cleavage rules.
DOC Note! The current version of PeptIdent only supports tryptic cleavage.

14. Missed cleavage sites
In order to take into account partial cleavages, you can specify a maximum number (0, 1 or 2) of missed cleavage sites to be allowed.

If the maximum number of missed cleavages entered is 1, all concatenations of two adjoining peptides are also added to the list of theoretical peptides under consideration.

DOC Note! If you are confident that your digest was complete, with no partial fragments present, choose the setting 0 (default setting). This will give maximum discrimination and keep the number of random matches low.

DOC Note! If experience shows that your digest usually includes some peptides with missed cleavage sites, you should specify a setting of 1, rarely 2. However, keep in mind that each additional level of missed cleavages increases the number of calculated peptide masses to be matched against the experimental data and the number of random matches.
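
The effect of the missed cleavage setting can be sketched in Python. The cleavage rule used here (cut after K or R, except when the next residue is P) is a commonly used simplification for trypsin and is an assumption; it is not taken from the PeptIdent cleavage rules. The function name tryptic_peptides and the example sequence are hypothetical.

```python
def tryptic_peptides(sequence, max_missed=0):
    """List peptides of a digest allowing up to max_missed missed cleavages."""
    # Assumed rule: cut after K or R unless the next residue is P.
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal peptide
    peptides = []
    for missed in range(max_missed + 1):
        # Concatenate (missed + 1) adjoining fragments.
        for j in range(len(fragments) - missed):
            peptides.append("".join(fragments[j:j + missed + 1]))
    return peptides


complete = tryptic_peptides("AAKCCRDDKPEE")
with_one = tryptic_peptides("AAKCCRDDKPEE", max_missed=1)
```

With no missed cleavages the example sequence gives three peptides; allowing one missed cleavage adds the two concatenations of adjoining peptides, which illustrates how each level increases the number of theoretical masses.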

15. Minimal number of peptide matches required
Under « Report only proteins with... » specify the minimum number of peptide mass hits you require a matching protein to show for it to be included in the result list. The default value is 4.

16. Maximum number
Limit the number of matching proteins displayed in the result report by selecting the appropriate value in the « Display a maximum of... » menu. The default value is 20.

17. Print sequence information
You can specify if you would like the result to include, for each high-scoring protein, information about the sequence portion covered by the matching peptides. If this is selected, the protein sequence will be displayed, and all matching peptides will be highlighted in color and upper case letters.


18. Send result by e-mail
PeptIdent results are displayed on-line in your browser window or can be sent by e-mail.
If the results should be sent back to you by e-mail, tick the « Send the result by e-mail » box. In the « Your e-mail:» text field, enter the e-mail address (e.g. name@unknown.ch) to which the results should be sent. The e-mail option is recommended, in particular for queries with a high number of peptide masses or for searches against large sections of the database. This avoids timeouts («document contains no data»), which can occur with the on-line option: the browser interrupts the connection with the program if the search has not finished after a certain time (usually about 3 minutes).

DOC Note! If you select the e-mail option, your information will be sent to the ExPASy computer, which then undertakes the matching and returns the results to you by e-mail. It operates in batch mode, which means several searches can be sent successively, without having to wait for the result of the preceding query.
DOC Note! In batch mode, only a very limited number of requests can be treated simultaneously, and your query will be queued for processing. Usually, results are sent back within a few minutes. However, if the batch queue already contains a number of requests, it will take longer (even up to a few hours in the worst case) for your query to be returned. Please allow for a certain time and do not unnecessarily resubmit your request. If you do not receive any results, it is possible that you made an error when you specified your email address. If you think there is a problem with the server, contact the server administrator and specify the time of submission, details about your search parameters and whether you got any error messages. Do not forget to specify which of the ExPASy mirror sites you were using.


19. Start PeptIdent
Once you have filled in the form, click on the « Start PeptIdent » button to start the program.
If you have made a mistake and would like all fields to be reset to their default values, press the « Reset» button.


IV. Result Output


1. Summary of search parameters


The top part of the page provides a summary of all user-specified search parameters, as well as the date of the query, the database release number and current number of entries, and a button to perform a new PeptIdent search.

2. List of matching database entries

The second part of the result page contains a summary of the best-matching proteins from the database, each with a «quick jump» link to detailed peptide information provided further down in the same page (see the following section). This summary provides the following information for each of the high-scoring proteins:

Score: The score or hit-rate for peptide mass fingerprinting is simply the number of peptides that match the theoretical peptides from a database entry divided by the total number of peptide masses specified for the search.

# peptide matches: Number of peptides that matched those from a database entry

AC: Swiss-Prot/TrEMBL accession number (AC)

ID: Swiss-Prot/TrEMBL entry name (ID)

Description: Name of the matching protein (Swiss-Prot/TrEMBL DE line). If the matching sequence results from cleavage of a larger precursor molecule, the name of the chain or peptide (Swiss-Prot/TrEMBL FT CHAIN or PEPTIDE line) will be displayed.

pI: theoretical isoelectric point of the matching protein.

Mw: theoretical molecular weight of the matching protein in Dalton.
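
As a small worked example of the score defined above, the following Python sketch divides the number of matching peptides by the number of submitted masses. The function name hit_score is hypothetical.

```python
def hit_score(n_matching, n_submitted):
    """Fraction of submitted peptide masses that matched a database entry."""
    if n_submitted <= 0:
        raise ValueError("at least one peptide mass must be submitted")
    return n_matching / n_submitted


# 6 matching peptides out of 12 submitted masses give a score of 0.5.
score = hit_score(6, 12)
```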


3. Details of each match


user mass: Experimentally measured molecular mass of the peptide as provided by the user.

matching mass: Theoretical molecular mass of the matching peptide as calculated from the database entry. This mass may already include a modification (post-translational or artefactual); in this case the type of modification is detailed in the «modification» column.

Delta mass: Difference between experimentally measured and theoretical peptide masses. This mass difference is given in Da or ppm, depending on the mass tolerance unit specified by the user.

#MC: number of missed cleavage sites considered for the calculation of the theoretical molecular mass of the peptide.

modification: modification(s) of the peptide that were considered for the calculation of the peptide mass.
The following format is used to describe these modifications:
The abbreviations used for the different types of modifications are listed here.


position: sequence position of the matching peptide in the database entry. If the protein under consideration is the result of post-translational processing into a mature chain or peptide, the position information used corresponds to the numbering in the underlying Swiss-Prot/TrEMBL entry, i.e. to the numbering relative to the precursor sequence.
example:
The position given for the N-terminal tryptic peptide ANSFLEELRPGNVER of the mature chain ‘Protein C light chain (40-194)’ of PRTC_BOVIN (P00745) is 40-54, and not 1-15.


peptide: amino acid sequence of the matching peptide

Delta pI: difference between the user-specified (estimated) pI and the theoretical pI of the matching protein in the database.

Delta Mw: difference between the user-specified (estimated) Mw and the theoretical Mw of the matching protein in the database.

% of sequence covered: Percentage of the protein sequence covered by the matching peptides. To calculate this percentage, the number of amino acids contained in at least one of the matching peptides is divided by the length of the protein / mature peptide.

Sequence: The sequence of the matching protein, printed in lower case letters. The regions of the sequence that are matching the query peptides are highlighted in red and printed in capital letters. The sequences can start from positions higher than position 1. This reflects the removal of propeptides and signal sequences.
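
The coverage calculation and the sequence display can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and positions are assumed to be 1-based, inclusive and relative to the sequence passed in (the mature chain), not to the precursor numbering used in the «position» column.

```python
def sequence_coverage(protein_length, matched_ranges):
    """Percentage of residues contained in at least one matching peptide."""
    covered = set()
    for start, end in matched_ranges:  # 1-based, inclusive positions
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / protein_length


def highlight(sequence, matched_ranges):
    """Lower-case sequence with the matched regions in capital letters."""
    chars = list(sequence.lower())
    for start, end in matched_ranges:
        for i in range(start - 1, end):
            chars[i] = chars[i].upper()
    return "".join(chars)


# Two overlapping peptides (1-5 and 4-10) cover 10 of 20 residues.
coverage = sequence_coverage(20, [(1, 5), (4, 10)])
shown = highlight("ANSFLEELRPGNVER", [(1, 3)])
```

Overlapping peptides are counted only once per residue, which is why the two ranges in the example give 50% and not 60% coverage.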


V. Links


The PeptIdent output contains links to a number of related programs on ExPASy. These links are available both from the on-line result page and from the html file returned by email. Relevant input data and/or information about the matching database entry are automatically transferred to those programs.

PeptIdent
A new PeptIdent search can be launched directly from the result output.


DOC Note for users who wish to use the BioGraph visualization tool after a second search submitted via the «New Search» button: If peptide masses were initially specified through the upload of a file that also contained peak intensities (e.g. pkm or dta format), the peptide masses from the original file will appear pasted into the mass window of the new search form, where they can be modified if desired (e.g. peaks believed to be «noise» or caused by a contaminant or by autodigestion of trypsin can be removed). Peak intensities, however, as read from the pkm or dta file, are transferred from the first to the second PeptIdent query as hidden parameters: they are taken over from the original query and cannot be modified for a second search. It is therefore not recommended to modify, remove or add a mass before resubmitting data from an earlier query if you intend to use the BioGraph link to visualize the theoretical spectrum. Such a change would affect only the mass values, whereas the peak intensities from the original spectrum would remain unchanged; after insertion or deletion of a mass, the intensities would be shifted and no longer be associated with their original mass values. For the BioGraph tool to work correctly, it is advisable to apply modifications directly to the mass/intensity file and to upload the file again.

Furthermore, direct links to a number of characterization / visualization tools are available for each matching candidate protein:

FindMod
The FindMod tool can be used to predict post-translational modifications or single amino acid substitutions. This is done by comparing experimental peptide masses that did not match with the protein against those calculated from the assigned protein sequence, seeking mass differences that may be due to post-translational modifications. A number of rules are applied, trying to determine whether a post-translational modification suggested by mass difference is likely to occur in the peptide under consideration.

GlycoMod
The GlycoMod tool can be used to predict the possible oligosaccharide structures that occur on proteins from their experimentally determined masses. The program can be used for free or derivatized oligosaccharides and for glycopeptides.

FindPept
The FindPept tool can be used to identify peptides that result from unspecific cleavage of proteins from their experimental masses, taking into account artefactual chemical modifications, post-translational modifications (PTM) and protease autolytic cleavage.

PeptideMass
The PeptideMass tool can be used to simulate a theoretical digest of the matching database protein.

BioGraph
The BioGraph tool allows a graphical and interactive representation of the PeptIdent results. A theoretical spectrum is displayed, in which peaks corresponding to matching and non-matching peptides are shown in different colors.


VI. Comments


The output of a peptide query contains a list of proteins ranked by the number of peptides shared with the unknown protein, where the correct identification for an unknown protein is likely to be that with the largest number of peptide «hits».


Confidence in identification is achieved by looking for a significant difference in the number of matching peptides between the top and second ranked protein, and a good sequence coverage of the top ranked protein with the experimentally determined peptides.

Peptide mass data can represent a starting point for the examination of protein modifications or processing: Peptides from a protein that do not match those from a database may carry post-translational or artefactual modifications, or may have undergone amino acid substitution or truncation.

Peptide mass fingerprinting will rarely find matches for all peaks in a spectrum. This is because some peptides, especially those that are large and/or very hydrophobic, are either not extracted or not quantitatively extracted from a gel or blot, and others are not ionised efficiently during mass spectrometry.




VII. References


[1] Wilkins M.R. and Williams K.L. (1997) Cross-species identification using amino acid composition: a theoretical evaluation. J. Theor. Biol. 186, 7-15.

Last modified 14/Jun/2002 by ELG