Practical: Alignments, Motifs and Profiles
(An example: a DOF ZINC FINGER PROTEIN from ARABIDOPSIS THALIANA)
Step 1 Extract the Sequence of any Arabidopsis DOF gene using SRS.
- Open SRS (EMBL v. 5) and click START.
- Select the SPTREMBL databank and click Continue.
- Type DOF in any AllText box; in another box, select Organism and type Arabidopsis. Click Do Query.
- Click on any entry (e.g. SPTREMBL:O82155).
- Save the result as text:
ID   O82155      PRELIMINARY;      PRT;   194 AA.
AC   O82155;
DT   01-NOV-1998 (TrEMBLrel. 08, Created)
DT   01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT   01-NOV-1998 (TrEMBLrel. 08, Last annotation update)
DE   DOF ZINC FINGER PROTEIN.
GN   ADOF1.
OS   Arabidopsis thaliana (Mouse-ear cress).
OC   Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC   Magnoliophyta; eudicotyledons; core eudicots; Rosidae; eurosids II;
OC   Brassicales; Brassicaceae; Arabidopsis.
OX   NCBI_TaxID=3702;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=COLUMBIA; TISSUE=SEEDLING;
RA   Itagaki T., Kisu Y., Esaka M.;
RT   "cDNA cloning and gene expression of Dof zinc finger protein in
RT   Arabidopsis thaliana.";
RL   Submitted (SEP-1998) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; AB017564; BAA33196.1; -.
SQ   SEQUENCE   194 AA;  21682 MW;  960CB9CF86C63BEA CRC64;
     MQDLTSAAAY YHQSMMMTTA KQNQPELPEQ EQLKCPRCDS PNTKFCYYNN YNLSQPRHFC
     KNCRRYWTKG GALRNIPVGG GTRKSNKRSG SSPSSNLKNQ TVAEKPDHHG SGSEEKEERV
     SGQEMNPTRM LYGLPVGDPN GASFSSLLAS NMQMGGLVYE SGSRWLPGMD LGLGSVRRSD
     DTWTDLAMNR MEKN
//
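The BLAST and ClustalW forms used in the next steps accept plain sequences, so it can help to pull the sequence out of the saved entry. The following is a minimal Python sketch, not part of SRS: it assumes the usual flat-file layout (a two-letter line code at the start of each annotation line, sequence lines between the SQ line and the closing //). The names parse_entry and to_fasta are hypothetical, and the entry is abbreviated to a few annotation lines.

```python
# Abbreviated copy of the O82155 entry saved in Step 1.
ENTRY = """\
ID   O82155      PRELIMINARY;      PRT;   194 AA.
AC   O82155;
DE   DOF ZINC FINGER PROTEIN.
GN   ADOF1.
OS   Arabidopsis thaliana (Mouse-ear cress).
SQ   SEQUENCE   194 AA;  21682 MW;  960CB9CF86C63BEA CRC64;
     MQDLTSAAAY YHQSMMMTTA KQNQPELPEQ EQLKCPRCDS PNTKFCYYNN YNLSQPRHFC
     KNCRRYWTKG GALRNIPVGG GTRKSNKRSG SSPSSNLKNQ TVAEKPDHHG SGSEEKEERV
     SGQEMNPTRM LYGLPVGDPN GASFSSLLAS NMQMGGLVYE SGSRWLPGMD LGLGSVRRSD
     DTWTDLAMNR MEKN
//
"""


def parse_entry(text):
    """Return (fields, sequence) for one flat-file entry.

    fields maps each two-letter line code to the list of its line contents.
    """
    fields = {}
    seq_parts = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-entry marker
            break
        if in_sequence:
            seq_parts.append(line.replace(" ", ""))
        elif line.startswith("SQ"):        # sequence block starts on the next line
            in_sequence = True
        elif len(line) > 2:
            fields.setdefault(line[:2], []).append(line[2:].strip())
    return fields, "".join(seq_parts)


def to_fasta(fields, sequence, width=60):
    """Format the entry as FASTA, using the AC and DE lines for the header."""
    accession = fields["AC"][0].rstrip(";")
    description = fields["DE"][0].rstrip(".")
    lines = [f">{accession} {description}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines)


fields, sequence = parse_entry(ENTRY)
print(to_fasta(fields, sequence))
```

Comparing len(sequence) with the length given on the ID line (194 AA) is a quick check that nothing was lost when the entry was copied.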
Step 2 BLAST search using the Dof sequence.
- Open the Advanced WU-BLAST2 Search web page (EMBL).
- Enter the Dof sequence (copy and paste).
- Select the nrdb database.
- Change the Descriptions and Alignments values to 100. [Look at this FIGURE]
- Now you can launch the search (click Submit Query); to save time, the results page is also available HERE.
- Results analysis.
Take a look at the sequences (they are linked to SRS) and at the different fields of the BLAST results.
The "lines" graphic is very informative. What does it mean?
The query sequence has been filtered (some residues have been replaced by X). Why?
- Click on Get the selected sequences to obtain the sequences homologous to Dof.
Save these sequences as text. (Again, the results are HERE.)
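As a hint for the filtering question in the results analysis: stretches dominated by one or two residue types tend to produce high-scoring matches that say little about homology, so search programs usually mask them before searching. The sketch below is a simplified stand-in for that idea, not the filter used by the server (dedicated programs such as SEG are normally used for proteins). The window size, the entropy threshold and the toy sequence are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mask_low_complexity(seq, window=12, threshold=2.2):
    """Replace every window whose entropy is below threshold with X.

    window and threshold are illustrative values, not BLAST defaults.
    """
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            masked[start:start + window] = "X" * window
    return "".join(masked)


# Hypothetical toy sequence: a serine run flanked by diverse residues.
TOY = "ACDEFGHIKLMNPQRSTVWY" + "S" * 14 + "ACDEFGHIKLMNPQRSTVWY"
print(TOY)
print(mask_low_complexity(TOY))
```

Running the same function on the Dof query and comparing the output with the X positions in the BLAST report shows how closely this simple measure follows the real filter.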
Step 3 Alignment of the Dof sequences using ClustalW.
- Open the ClustalW Server (at EBI).
Look at the different Parameters and Help links.
- Enter the sequences obtained previously. [Look at this FIGURE]
- As before, you can launch the program (click RUN CLUSTALW), but the results are HERE.
- Alignment analysis.
What can we say about the Dof proteins?
Is this result compatible with the "lines" graphic from the BLAST search?
- Visualizing the alignment with a good program makes the analysis easier. As an example, take a look at:
So far, we have found the clear homologues of the Dof protein and defined a conserved region in the alignment of the family sequences.
Another question to resolve could be: are there other protein families distantly related to Dof?
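To make the idea of a conserved region concrete, the sketch below marks fully conserved columns with an asterisk, in the way the ClustalW output does. It is only an illustration under stated assumptions: the two rows are residues 33-82 of O82155 and the first 50 residues of the AA line shown in Step 5, and treating them as already aligned column by column is an assumption (neither has gaps over this stretch). The function name conservation_line is hypothetical.

```python
def conservation_line(rows, gap_chars="-."):
    """Return a string with '*' under columns where all rows share one residue."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        residues = set(column)
        # A column counts as conserved only if it holds a single, non-gap symbol.
        conserved = len(residues) == 1 and not residues & set(gap_chars)
        marks.append("*" if conserved else " ")
    return "".join(marks)


ADOF1 = "LKCPRCDSPNTKFCYYNNYNLSQPRHFCKNCRRYWTKGGALRNIPVGGGT"  # O82155, residues 33-82
OTHER = "LKCPRCDSPNTKFCYYNNYNLSQPRHFCKSCRRYWTKGGALRNVPVGGGS"  # AA line from Step 5

line = conservation_line([ADOF1, OTHER])
print(ADOF1)
print(OTHER)
print(line)
print(f"{line.count('*')} of {len(line)} columns identical")
```

With the full ClustalW result, the same function can be applied to all rows of one alignment block at a time.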
Step 4 Analysis of the Dof sequence using Prosite (Profiles database)
- Open the Prosite server (at ISREC).
- Select Prosite profiles (NScore) and Prosite patterns (no score).
- Enter a Dof sequence into the text field. [Look at this FIGURE]
- Click on Run ProfileScan (the results are HERE).
Results analysis: Is there any informative pattern hit?
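To see what a pattern hit amounts to, a PROSITE-style pattern can be translated into a regular expression and searched in the sequence. The sketch below handles only a subset of the pattern syntax (single residues, x, [..], {..} and repeat counts; no anchors), and prosite_to_regex is a hypothetical helper. The pattern C-x(2)-C-x(21)-C-x(2)-C is an illustrative assumption read off the cysteine spacing in O82155, not an entry quoted from the PROSITE database.

```python
import re

ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                      # x = any residue
            core = "."
        elif core.startswith("{"):           # {ABC} = any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        if low is not None:                  # (n) or (n,m) repeat count
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


# Sequence of O82155 from Step 1.
SEQ = (
    "MQDLTSAAAYYHQSMMMTTAKQNQPELPEQEQLKCPRCDSPNTKFCYYNNYNLSQPRHFC"
    "KNCRRYWTKGGALRNIPVGGGTRKSNKRSGSSPSSNLKNQTVAEKPDHHGSGSEEKEERV"
    "SGQEMNPTRMLYGLPVGDPNGASFSSLLASNMQMGGLVYESGSRWLPGMDLGLGSVRRSD"
    "DTWTDLAMNRMEKN"
)

regex = prosite_to_regex("C-x(2)-C-x(21)-C-x(2)-C")
hit = re.search(regex, SEQ)
print(regex)
if hit:
    # Report 1-based positions, as sequence databases do.
    print(hit.start() + 1, hit.end(), hit.group())
```

A pattern this short and permissive will also match unrelated proteins, which is one reason to compare pattern hits with the profile results from the same search.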
Step 5 Pfam analysis of the Dof sequence (hidden Markov models)
Combining these results with the predicted secondary structure of Dof (similar to that of GATA):
AA LKCPRCDSPNTKFCYYNNYNLSQPRHFCKSCRRYWTKGGALRNVPVGGGSRKN..ATKRSTSS
PHD_sec EEEE HHHHHHHHHEE EEEE
Rel_sec 999999999995476449999997134423443114799465637999997699999998999
SUB_sec LLLLLLLLLLLL.EE..LLLLLLL............LLL.EEE.LLLLLLLLLLLLLLLLLLL
then some interesting conclusions can be drawn (as in this ARTICLE).
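The prediction block above is easier to relate to the sequence when the predicted segments are listed with their residues. The sketch below does this for the SUB_sec line, assuming the usual PHD reading in which E is strand, H is helix, L is loop and a dot marks a residue whose prediction was not reliable enough to be kept in the subset line. The helper names are hypothetical, and positions are 0-based offsets within the AA line, not positions in the full protein.

```python
from itertools import groupby

# AA and SUB_sec lines copied from the prediction block above.
AA = "LKCPRCDSPNTKFCYYNNYNLSQPRHFCKSCRRYWTKGGALRNVPVGGGSRKN..ATKRSTSS"
SUB = "LLLLLLLLLLLL.EE..LLLLLLL............LLL.EEE.LLLLLLLLLLLLLLLLLLL"


def state_runs(states):
    """Split a state string into (state, start, end) runs, end exclusive."""
    runs = []
    position = 0
    for state, group in groupby(states):
        length = len(list(group))
        runs.append((state, position, position + length))
        position += length
    return runs


def segments(aa, states, wanted):
    """Return (start, end, residues) for every run of the wanted state."""
    if len(aa) != len(states):
        raise ValueError("sequence and state lines must have the same length")
    return [
        (start, end, aa[start:end])
        for state, start, end in state_runs(states)
        if state == wanted
    ]


strands = segments(AA, SUB, "E")
for start, end, residues in strands:
    print(f"strand at offsets {start}-{end - 1}: {residues}")
```

The same call with "L" lists the confidently predicted loop regions, which can then be compared with the positions of the four cysteines.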
THIS IS THE STRUCTURE OF GATA:
Look at the Cys residues coordinating the Zn atom.
Other figures are:
All Dof genes table.
Dof proteins phylogenetic tree.
The tree alignment.
PGP-17-10-00