Site Map | Search ExPASy | Contact us |
Hosted by NCSC US | Mirror sites: | Canada | China | Korea | Switzerland | Taiwan |
HTML - BLAST native output format with hyperlinks and some formatting.
NiceBlast - View with full descriptions and organism sources.
Plain Text - Text format with no links.
Programs available on ExPASy |
|
blastp | compares a protein query sequence against a protein sequence database. |
tblastn | compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. |
Programs available elsewhere |
|
blastn | compares a nucleotide query sequence against a nucleotide sequence database. |
blastx | compares a nucleotide query sequence translated in all reading frames against a protein sequence database. Available at EMBnet Switzerland |
tblastx | compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. |
PSI-BLAST | Position Specific Iterative BLAST detects weak homologs by building a profile from a multiple alignment of the highest scoring hits in an initial BLAST search. Available at NCBI |
PHI-BLAST | Pattern-Hit Initiated BLAST combines matching of regular expressions with local alignments surrounding the match. |
xblast | a totally unrelated game. Available at xblast center |
SWISS-PROT | Manually annotated protein sequence database (over 100000 entries). Includes weekly updates and splice variants. |
SWISS-PROT, TrEMBL and TrEMBL-new | TrEMBL is a computer-annotated supplement to SWISS-PROT with some redundancy (over 600000 entries). TrEMBL-new contains the translations of the newest submissions to the EMBL database. Contains all consolidated proteins and ORFs, with weekly updates and annotated splice variants. |
complete microbial proteomes | Non-redundant sets of all the proteins from complete genome sequencing projects, compiled from SWISS-PROT and TrEMBL. |
Translated EST | Protein sequences derived from EST sequencing data (human, mouse, rat, zebrafish, drosophila, bovine, arabidopsis). This database contains many potential errors because of the low quality of the data. |
All databases are subdivided into taxonomic sections, selectable from the Taxonomic groups drop-down list.
All EMBL + GSS | All entries from the EMBL database (equivalent to GenBank and DDBJ). |
HTG | Unverified data from high-throughput genomic sequencing. Usually in the form of cosmids. |
dbEST | Expressed sequence tag database from the NCBI. |
EST contigs | Database of contigs based on EST clusters from Unigene (human, mouse, rat, bovine, zebrafish) and SwissClusters (Drosophila melanogaster, Arabidopsis thaliana). |
Unigene EST | Database of EST clusters (lists of ESTs known to match the same cDNA) from the NCBI (updated occasionally). This database also contains useful information such as STS matches, tissue distribution, and transcript maps. |
Complete genomes | Genomes released in the form of a complete, assembled sequence. |
For blastp, you may enter either a numeric NCBI TaxID (e.g. 10090), or a taxon (e.g. Bacteria), or a species name either in Latin or in English. For the list of known species names and synonyms, see SWISS-PROT species list. As the hits will be filtered in a post-processing stage, this may result in a significant delay.
A display of the BLAST hits as a taxonomic tree is also available from the
result page, by clicking on the "Taxonomic view of BLAST hits" button.
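As a rough illustration of the post-processing step described above, the sketch below tells a numeric TaxID apart from a name and then filters a list of hits. The hit structure (`lineage_taxids`, `lineage_names`) and both function names are assumptions made for this example; the actual ExPASy implementation is not documented here.

```python
def parse_taxon_filter(text):
    """Classify the user's input as a numeric TaxID or a (lower-cased) name."""
    value = text.strip()
    if value.isdigit():
        return ("taxid", int(value))
    return ("name", value.lower())


def filter_hits(hits, text):
    """Keep only hits whose lineage contains the requested taxon.

    Each hit is a hypothetical dict carrying the full lineage of its source
    organism, both as TaxIDs and as lower-cased names.
    """
    kind, value = parse_taxon_filter(text)
    key = "lineage_taxids" if kind == "taxid" else "lineage_names"
    return [hit for hit in hits if value in hit[key]]


# Illustrative data only: two made-up hits with abbreviated lineages.
hits = [
    {"id": "hit_mouse", "lineage_taxids": {2759, 10090},
     "lineage_names": {"eukaryota", "mus musculus", "mouse"}},
    {"id": "hit_ecoli", "lineage_taxids": {2, 562},
     "lineage_names": {"bacteria", "escherichia coli"}},
]

print([h["id"] for h in filter_hits(hits, "10090")])
print([h["id"] for h in filter_hits(hits, "Bacteria")])
```

Because a filter of this kind runs only after the full search has finished, it cannot shorten the search itself, which is consistent with the delay mentioned above.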
E-mail address
Enter your e-mail address to receive the results by e-mail. Otherwise, they will
be displayed interactively in your browser. The e-mail option is recommended for tblastn
searches on big databases such as EMBL. If your interactive search takes too long,
you will receive an error message asking you to resubmit via e-mail.
Options
Comparison matrix
The comparison matrix assigns a score to each pair of aligned residues at every
position in an alignment. In the BLOSUM matrices, this score is based on the
frequency with which that substitution is known to occur among consensus blocks
within related proteins. BLOSUM62 is among the best of the available matrices for
detecting weak protein similarities. The PAM set of matrices is also available.
If the "Auto-select" option is selected (default), the matrix will be selected
depending on the query sequence length, based on the following (empirically
constructed) table:
Query length | Substitution matrix |
---|---|
<35 | PAM-30 |
35-50 | PAM-70 |
50-85 | BLOSUM-80 |
>85 | BLOSUM-62 |
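The table above can be read as a simple lookup on query length. The following Python sketch shows one way to express it. The function name is hypothetical, and because the table lists 50 in two rows, assigning a length of exactly 50 to PAM-70 is an assumption rather than documented behaviour.

```python
def auto_select_matrix(query_length):
    """Pick a substitution matrix from the query length (in residues)."""
    if query_length < 35:
        return "PAM-30"
    if query_length <= 50:   # assumption: 50 falls in the 35-50 row
        return "PAM-70"
    if query_length <= 85:   # ">85" selects BLOSUM-62, so 85 stays here
        return "BLOSUM-80"
    return "BLOSUM-62"


for length in (20, 40, 70, 300):
    print(length, auto_select_matrix(length))
```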
The expectation value (E) threshold is a statistical measure of the number of matches expected by chance in a random database. The lower the E-value, the more likely the match is to be significant. E-values between 0.1 and 10 are generally dubious, and E-values over 10 are unlikely to have biological significance. In all cases, those matches need to be verified manually. You may need to increase the E threshold in the following cases:
BLAST Frequently Asked questions at NCBI (includes error messages)
The Statistics of Sequence Similarity Scores by Altschul