Practical lesson 4
Sequence Retrieval System, part 1
By Manuel J. Gómez, PDG, CNB, CSIC.
The aim of these exercises is to become acquainted with the use of SRS
to retrieve information and to launch applications.
A. Searching sequences and launching applications.
-
Open Netscape or a similar browser and connect to the SRS server
at EBI (http://srs.ebi.ac.uk/).
-
You will reach the main page of SRS, from which you can perform a
simple search, using the box labeled QUICK search, in a number of databases
selected by default. From this main page you can also choose to start
a "Permanent Project".
-
Select the tab labeled Library Page. From this page you can perform
more refined searches, using the "Standard" or "Extended" forms (although,
again, you can also start a QUICK search). However, BEFORE selecting any
of the forms, you have to select the databases in which you want to perform
the search. The databases are grouped into several categories, which
are displayed as a tree whose branches can be expanded or collapsed. The most
popular databases appear in the first branches, which are expanded by
default.
-
Select the EMBL database. To obtain a description of a given database,
click on its name.
-
Select Standard Query Form.
-
You will now be in the Query form tab. In the upper left corner
you can check the databases in which you are going to perform
the search (in this case, EMBL). It is possible to enter several terms,
and to specify the fields in which the search should be performed. For
now, we will search in all fields, so leave the option All text selected.
-
Enter the term rhizobium.
-
Click on Search.
-
You will find that there are about 17000 entries that contain the term
rhizobium in some field.
-
Select one of them in the column labeled EMBL.
-
You will get the entry in EMBL format.
-
Return to the previous page.
-
Select the same entry in the column labeled Accession.
-
You will obtain information about the successive versions of the entry.
-
Return to the previous page.
-
Select a few sequences, by clicking on the checkboxes on the left (for
example, the first four).
-
Select FastaSeqs from the menu Display Options on the left,
and click on Apply Display Options.
-
You will get a document with the sequences of the four entries in FASTA
format. They can be saved by clicking on the Save button on the
left.
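Once the FASTA document has been saved, it can be read back with a few lines of code. The following is a minimal Python sketch, not part of SRS: the function name parse_fasta and the two example entries are made up for illustration. It relies only on the general FASTA convention that each entry starts with a ">" header line followed by one or more sequence lines.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-entry example; identifiers and sequences are made up.
example = """>entry1 hypothetical description
ATGCGT
ACGT
>entry2 another hypothetical description
GGCCTA
"""

records = parse_fasta(example)
for header, sequence in records:
    # Print the identifier (first word of the header) and the sequence length.
    print(header.split()[0], len(sequence))
```

To use it on a saved file, read the file contents into a string and pass that string to parse_fasta.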
-
You can also select several applications to analyze the selected sequences,
from the menu Launch analysis tool.
-
Select NClustalW and then click on Launch.
-
You will see the NClustalW form, with all the selected sequences,
each in a different textbox. At this stage you could exclude some of the
sequences or select only a segment for some of the sequences. You could
also modify several parameters that control the alignment algorithm.
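Selecting only a segment of a sequence amounts to keeping the residues between two positions. The short Python sketch below illustrates the idea outside SRS. It assumes positions are counted from 1 and that both ends are included, which is the usual convention in sequence databases; how the NClustalW form itself expects the segment to be entered is not covered here. The function name segment and the example sequence are hypothetical.

```python
def segment(sequence, start, end):
    """Return the residues from start to end (assumed 1-based, inclusive)."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("segment lies outside the sequence")
    # Python slices are 0-based and exclude the end, hence start - 1.
    return sequence[start - 1:end]


# Made-up sequence used only for illustration.
seq = "ATGCGTACGT"
print(segment(seq, 3, 6))
```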
-
Enter a name for the process that you are about to start (in the upper
box, labeled Job name), do not modify the algorithm's parameters,
and click on Launch.
-
You will be informed that the process will be executed in batch mode. You
will be able to launch other processes before the first one ends.
By clicking on the clock-shaped link (upper left corner) you will open
the Batch Job Status Page, which lists the
processes that you are currently running. They will appear with a check
mark once they have finished.
-
Select the job that you want to view, using the checkboxes on the
left.
-
Select Complete entries, on the left side menu, and click on View,
to take a look at the multiple sequence alignment.
-
If you want to perform another search in the same database(s), click on
the Query Form tab. In the upper left corner you will be able to
check the databases in which you are going to perform the search.
-
However, you will now perform a new search in a different database. Therefore,
return to the Library Page.
-
Select UniProt/SwissProt.
-
Click on Query forms: Standard.
-
Choose Creation Date and enter 01-Sep-2003:29-Feb-2004, to
run a search that is, first, restricted to a specific field and, second,
covers a range of values (in this case, dates).
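The range entered above has the form start:end, with each date written as day-month-year. The following Python sketch, which is independent of SRS, shows how such a string can be split and used to test whether a given date falls inside the range. The function names are hypothetical, and treating both end dates as included is an assumption, not something stated by the query form.

```python
from datetime import datetime

# Day, abbreviated English month name, four-digit year, e.g. 01-Sep-2003.
DATE_FORMAT = "%d-%b-%Y"


def parse_range(range_text):
    """Split 'start:end' into two date objects."""
    start_text, end_text = range_text.split(":")
    start = datetime.strptime(start_text, DATE_FORMAT).date()
    end = datetime.strptime(end_text, DATE_FORMAT).date()
    if start > end:
        raise ValueError("range start is later than range end")
    return start, end


def in_range(date_text, range_text):
    """Return True if the date lies within the range (ends assumed inclusive)."""
    start, end = parse_range(range_text)
    day = datetime.strptime(date_text, DATE_FORMAT).date()
    return start <= day <= end


query = "01-Sep-2003:29-Feb-2004"
print(in_range("15-Oct-2003", query))
print(in_range("01-Mar-2004", query))
```

Note that 29-Feb-2004 is a valid date because 2004 is a leap year; the same range ending in 29-Feb-2003 would be rejected by the parser.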
-
Click on Search.
-
You will get the entries that have been added to UniProt/SwissProt
during the last few months.
-
Select one (the first, for example) and launch BlastP, to find similar
sequences.
-
Once the batch job has finished, you can retrieve the sequences that have
been identified.
-
By clicking on the tab labeled Results, you will obtain a list
of the searches and application processes that you have carried out in your
current SRS session. From this page you will be able to combine searches
or to recall previously obtained results.
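Combining two searches can be thought of as combining two sets of entry identifiers. The Python sketch below illustrates the three usual combinations (entries in both searches, in either search, or in the first but not the second) with ordinary sets. The identifiers are made up, and the sketch does not reproduce the syntax used on the Results page.

```python
# Hypothetical results of two searches, as sets of made-up entry identifiers.
first_search = {"ID_A", "ID_B", "ID_C"}
second_search = {"ID_B", "ID_C", "ID_D"}

# Entries found by both searches (logical AND).
in_both = first_search & second_search

# Entries found by at least one search (logical OR).
in_either = first_search | second_search

# Entries found by the first search but not by the second.
only_first = first_search - second_search

print(sorted(in_both))
print(sorted(in_either))
print(sorted(only_first))
```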
-
By clicking on the tab labeled Projects, you will again see
a list of the searches and application processes that you have carried out
in your current SRS session. This time, however, the list serves only
as an index or reminder of what you have been doing. From this
page you have the option of saving the session locally, on your computer,
as a compressed file that you can open in the future. If you had opened
the current SRS session as a Permanent Project (in the main page),
your queries and results would be saved on the SRS server, for future access.
B. Some questions to practice
-
Are there any patented slug sequences? What about spider sequences?
-
How many sequences were deposited by F.M. Ausubel, in the EMBL database,
that belong to Rhizobium?
-
Try to find all Bacillus subtilis protein sequences that could be related
to penicillin binding and that are shorter than 300 residues, and align
them.
-
When was the HIV-1 nucleotide sequence submitted by Gallo to the nucleotide
database?
-
How many sequences were deposited in SwissProt in the year 1993?
-
Which EMBL database entry refers to Fidel Castro and to one of his cigars?
February 2004