SRSWWW Exercises
A. Simple Queries
-
Search SWISS-PROT for proteins involved in photosynthesis. First
search in the Description field then try other fields such as Keywords
(try using wildcard '*').
-
Search for the SWISS-PROT entry with accession number P00221 (and follow
some of the associated links).
-
Search for all authors with surname "Smith" in SWISS-PROT.
-
Search for all SWISS-PROT proteins involved in photosynthesis which were
published by someone named "Smith". Do this search first as a combination
of searches in the "Query Manager" and again from the query form.
-
How many non-EST sequences exist in EMBL? (Use the 'Division' index)
-
How many tomato sequences exist in EMBL and SWISS-PROT? What is the problem
with just using 'tomato' in the query form with the standard settings?
-
Find all sequences in EMBL which were released this year (Select the 'Date'
field and use the dd-mmm-yy(yy) format).
-
Find all sequences in both SWISS-PROT and SWISSNEW that were created between
1 January 1996 and 1 July 1996.
-
Find the EMBL sequences that were published in Nature Vol. 408, pages
157 to 158 in 2000. (Use the field information page to get further help)
-
Search for all dihydrofolate reductases in SWISS-PROT.
-
Search for all dihydrofolate reductases in SWISS-PROT with sequences of length
between 500 and 700.
-
Search for 'kinase' in the 'Description' index of SWISS-PROT. Why are some
of the entries found not kinases? Find a word that, when found together
with 'kinase', strongly indicates that the protein is not a kinase at all.
(Select the 'Description' field to be displayed in the entry list)
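The date exercises above ask for values in dd-mmm-yy(yy) format. As a minimal sketch of how such strings can be produced, the Python snippet below formats the two boundary dates from the "created between" exercise. The helper name to_dd_mmm_yyyy and the capitalised three-letter English month abbreviations are assumptions made for illustration; check the field information page for the exact form the query form accepts.

```python
from datetime import date

# English month abbreviations are spelled out explicitly so the result does
# not depend on the system locale (strftime("%b") would).
# The capitalisation used here is an assumption, not taken from the SRS docs.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def to_dd_mmm_yyyy(d: date) -> str:
    """Format a date as dd-mmm-yyyy (hypothetical helper for illustration)."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year:04d}"


# Boundary dates of the "created between" exercise.
start = to_dd_mmm_yyyy(date(1996, 1, 1))
end = to_dd_mmm_yyyy(date(1996, 7, 1))
print(start, end)
```

The two strings can then be typed into the lower and upper bounds of the date field.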
B. Browsing Indices
-
How many spellings exist in the 'Keywords' index of EMBL for the name(s)
of ribulose bisphosphate carboxylase? (Use the wildcard '*' liberally,
anywhere in the search word)
-
Find out if the 'Description' and 'Keywords' indices of SWISS-PROT contain
any words with spaces. (Use a search expression with a wildcard ('*')
at the end and the beginning)
-
What is the shortest author name in SWISS-PROT? (Use '?' wildcards)
-
Search for 'homeobox*' in 'AllText' of all sequence databanks. How many indices
are searched altogether? Is the query 'homeobox*' suitable to find all
proteins containing a homeobox?
-
Use a regular expression search to find all words consisting of 'nif' and
another character (Don't forget to put the regular expression within '/'s)
(Help available at "SRS users manual 8.1.3")
-
Which species have at least 1000 entries in SWISS-PROT? (Use the fact
that organism names contain a space, which higher-level taxa mostly don't
have)
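The exercises in this section rely on the difference between the wildcards '*' and '?' and a regular expression. The Python sketch below illustrates the general semantics only: '*' matches any run of characters, '?' matches exactly one character, and '.' in a regular expression matches exactly one character. It uses Python's own fnmatch and re modules on an invented word list, so it is not SRS's matcher and the words are not real index contents; in particular, whether SRS anchors a regular expression to the whole word is an assumption here.

```python
import fnmatch
import re

# Hypothetical index words, invented for illustration only.
words = ["nif", "nifa", "nifh", "nifhd", "homeobox", "homeobox-like", "kinase"]


def wildcard(pattern, candidates):
    """Shell-style matching: '*' is any run of characters, '?' exactly one."""
    return [w for w in candidates if fnmatch.fnmatchcase(w, pattern)]


def regex(pattern, candidates):
    """Regular-expression matching that must cover the whole word."""
    rx = re.compile(pattern)
    return [w for w in candidates if rx.fullmatch(w)]


print(wildcard("homeobox*", words))  # words starting with 'homeobox'
print(wildcard("nif?", words))       # 'nif' plus exactly one character
print(regex(r"nif.", words))         # same idea as a regular expression
```

Note that 'nif' itself and 'nifhd' are not matched by 'nif?' or 'nif.', because both patterns require exactly one extra character.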
C. Subentry Queries
-
How many SWISS-PROT sequences exist with transmembrane regions? (Use the
'transmem' feature key)
-
How many transmembrane regions exist in SWISS-PROT?
-
How many transmembrane regions in SWISS-PROT are shorter than 10 amino
acids or longer than 50?
-
How many pseudogenes annotated as CDS (CoDing Sequence) exist in EMBL?
-
Retrieve the set of all spinach transmembrane segments and save them to
your directory using the view "FastaSeqs".
D. Using Views
-
Create a view for EMBL in the query form that displays the ID and Description
lines and the sequence in PIR format, then perform a new search for all spinach
sequences.
-
Create another view for EMBL in the query form that displays the ID, AccNumber,
Description and DBOrigin in a table and again search for all spinach sequences.
-
Search in the Description field of SWISS-PROT for all "photosystem II"
sequences from spinach and compare the hydrophilicity plots. (Use the 'proteinChart'
view)
E. Performing Links
-
For how many 'spinach' SWISS-PROT entries is the tertiary structure known?
-
How many 'Arabidopsis thaliana' transmembrane proteins are included in
the ENZYME databank?
-
For how many unique enzyme-catalysed reactions is a tertiary structure
available in PDB? (Enzyme reactions are described in the databank ENZYME)
-
How many PDB entries have calcium binding sites?
-
Search for the dihydrofolate reductase family in PROSITE and link it to SWISS-PROT.
If you compare with the set from exercise A.10, are the two sets the same?
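Comparing two result sets, as the last exercise asks, comes down to set intersection and difference on the entry identifiers. The Python sketch below shows one way to do this offline once both entry lists have been saved. The function name compare_sets and the identifiers are hypothetical placeholders, not real SWISS-PROT entries.

```python
def compare_sets(first, second):
    """Split two collections of entry IDs into shared and unique members."""
    a, b = set(first), set(second)
    return {
        "both": sorted(a & b),         # entries found by both searches
        "only_first": sorted(a - b),   # found only by the first search
        "only_second": sorted(b - a),  # found only by the second search
    }


# Hypothetical identifiers, invented for illustration only.
by_description = ["ID_A", "ID_B", "ID_C"]
by_prosite_link = ["ID_B", "ID_C", "ID_D"]

result = compare_sets(by_description, by_prosite_link)
for label, ids in result.items():
    print(label, ids)
```

The two sets are the same exactly when both "only" lists are empty.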
