Patterns and Profiles

Tools & databases

Searching for patterns and profiles in sequences (data: sequence; result: patterns)
  • ScanProsite - Scans a sequence against PROSITE or a pattern against SWISS-PROT and TrEMBL
  • PPSEARCH - Scans a sequence against PROSITE (allows a graphical output); at EBI
  • PROSITE scan - Scans a sequence against PROSITE (allows mismatches); at PBIL
  • ProfileScan - Scans a sequence against protein profile databases (including PROSITE)
  • Frame-ProfileScan - Scans a short DNA sequence against protein profile databases (including PROSITE) 
  • PSI-BLAST - Position-Specific Iterated BLAST. The BLAST algorithm generalised to use an arbitrary position-specific score matrix in place of a query sequence and its associated substitution matrix.
  • PRATT - Interactively generates conserved patterns from a series of unaligned proteins
  • SMART - Simple Modular Architecture Research Tool; at EMBL
  • MOTIF - Searches motif libraries for motifs in a query sequence (PROSITE Pattern, PROSITE Profile, BLOCKS, ProDom, PRINTS, Pfam or user-defined profile library)
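To make the idea of a position-specific score matrix concrete, here is a minimal Python sketch. It is not the PSI-BLAST implementation: it assumes ungapped, equal-length example sequences, a uniform background frequency and a simple pseudocount, and the names build_pssm and scan are hypothetical helpers chosen for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build a log-odds matrix from ungapped, equal-length sequences.

    Assumption: a uniform background frequency (1/20 per residue),
    which is a simplification of what real tools use.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Start every residue at the pseudocount so no frequency is zero.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in aligned:
            if seq[col] in counts:  # ignore non-standard residues
                counts[seq[col]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def scan(pssm, sequence):
    """Score every window of the sequence; return (best_score, start).

    The start is 0-based. Returns None if the sequence is shorter
    than the matrix. Unknown residues contribute a score of 0.
    """
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pssm[i].get(sequence[start + i], 0.0) for i in range(width))
        if best is None or score > best[0]:
            best = (score, start)
    return best


if __name__ == "__main__":
    matrix = build_pssm(["GKSGT", "GKTGS", "GKSGS"])
    print(scan(matrix, "MAAGKSGTLLA"))
```

Unlike a single query sequence scored with one substitution matrix, each column here carries its own residue scores, so conserved positions weigh more heavily than variable ones.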
Searching for sequences that have a given pattern or profile (data: pattern; result: sequences)
  • ScanProsite - Scans a sequence against PROSITE or a pattern against SWISS-PROT and TrEMBL
  • PATTINPROT - Scans a protein sequence or a protein database for one or several pattern(s); at PBIL
  • PatScan - Searches protein or DNA sequence archives for instances of a pattern. You provide the pattern and indicate which protein or DNA sequences to scan.
  • Pmotif - Search for a given protein motif in a requested protein or nucleotide database.
  • MOTIFS - Searches protein sequence databases for a given motif (HMM or PROSITE)
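As an illustration of what these pattern scanners do, the following Python sketch translates a PROSITE-style pattern into a regular expression and reports every match in a sequence. It is not the code used by any of the tools listed above. The function names prosite_to_regex and find_matches are hypothetical, and the sketch covers only the common syntax elements (x, [..], {..}, repeat counts, and the < and > anchors at the ends of a pattern).

```python
import re

# One pattern element: x, a single residue, [allowed] or {forbidden},
# optionally followed by a repeat count such as (3) or (2,4).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored to the C-terminus
            suffix, element = "$", element[:-1]
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core, low, high = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # any residue except these
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (start, matched_text) for every match, overlaps included.

    Start positions are 0-based. The lookahead lets matches overlap.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # P-loop (ATP/GTP-binding site motif A) pattern.
    print(prosite_to_regex("[AG]-x(4)-G-K-[ST]"))
    print(find_matches("[AG]-x(4)-G-K-[ST]", "MTEYKLVVVGAGGVGKSALT"))
```

Scanning one pattern against a whole database amounts to calling find_matches on each entry in turn; the reverse direction (one sequence against PROSITE) loops over the patterns instead.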
Tools that make life easier

 
 
Databases, protein sequence information
  • PROSITE: Dictionary of protein sites and patterns. PROSITE is a method for determining the function of uncharacterized proteins translated from genomic or cDNA sequences. It consists of a database of biologically significant sites, patterns and profiles that help to reliably identify which known protein family (if any) a new sequence belongs to. PROSITE references and references on profiles from PROSITE
  • Pfam: Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains. Version 5.0 of Pfam (January 2000) contains alignments and models for 2008 protein families, based on the SWISS-PROT 38 and SP-TrEMBL 11 protein sequence databases.
  • BLOCKS:  Blocks are multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins. Block Searcher, Get Blocks and Block Maker are aids to detection and verification of protein sequence homology. They compare a protein or DNA sequence to a database of protein blocks, retrieve blocks, and create new blocks, respectively.
  • PRINTS: PRINTS is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL composite. Usually the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D-space. Fingerprints can encode protein folds and functionalities more flexibly and powerfully than can single motifs, full diagnostic potency deriving from the mutual context provided by motif neighbours. References
  • MOTIFS: A set of motif libraries and search programs (Kyoto Univ., Japan) for retrieval and analysis of protein sequence and structural motifs. The programs currently available
  • ProDom: The ProDom protein domain database consists of an automatic compilation of homologous domains detected in the SWISS-PROT database by the DOMAINER algorithm (Sonnhammer, E.L.L. & Kahn, D., 1994, Protein Sci. 3:482-492). It was devised to assist with the analysis of the domain arrangement of proteins. The latest release of ProDom families was generated automatically using PSI-BLAST with a profile built from the seed alignments of Pfam-A 3.4 families.
  • GeneFIND: GeneFIND (Gene Family Identification Network Design) is an integrated database search system that combines several search/alignment tools and the ProClass database to provide rapid and accurate gene family classification with enriched family information. The objectives are to improve speed and sensitivity, differentiate global and motif similarities, and provide collective information in an integrated platform that alleviates human annotation effort. It was used to identify several thousand new PROSITE members, which have been incorporated into our ProClass_Motif sub-database.
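The fingerprint idea described for PRINTS (several separated, non-overlapping motifs that are diagnostic only in combination) can be sketched in Python as follows. This is a deliberately simplified illustration, not the PRINTS method: it assumes each motif is a fixed-width regular expression rather than a scored matrix refined by iterative database scanning, and both the function name match_fingerprint and the example motifs are hypothetical.

```python
import re


def match_fingerprint(motifs, sequence):
    """Check that all motifs occur in the given order without overlapping.

    Assumption: each motif is a fixed-width regular expression, so taking
    the leftmost match of each motif in turn is sufficient.
    Returns the 0-based start of each motif, or None if any is missing.
    """
    positions = []
    cursor = 0
    for motif in motifs:
        match = re.compile(motif).search(sequence, cursor)
        if match is None:
            return None  # one missing motif means no fingerprint match
        positions.append(match.start())
        cursor = match.end()  # the next motif must start after this one
    return positions


if __name__ == "__main__":
    fingerprint = ["GKS", "DTAG", "NKCD"]  # made-up three-motif fingerprint
    print(match_fingerprint(fingerprint, "AAGKSAAADTAGAANKCDAA"))
    print(match_fingerprint(fingerprint, "AANKCDAAADTAGAAGKSAA"))
```

The second call fails even though every motif is present, because the motifs appear in the wrong order; this is the mutual context that a single-motif search cannot capture.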

 
REFERENCES
Every web page has its own reference list. You can also consult: