Abreu-Goodger C., Jáuregui R., Ontiveros N., Ciria R., Oliver P. and Merino E.*
Instituto de Biotecnología, UNAM. Ave. Universidad 2001, Col. Chamilpa, Morelos, 62210 Cuernavaca, Morelos, México.
merino@ibt.unam.mx
Because orthologous genes often share common regulatory mechanisms, we devised a new genome-wide method for identifying conserved regulatory motifs in fully sequenced genomes. For each Cluster of Orthologous Groups of genes (COG) in the COG database (Tatusov et al., 1997), we obtained all the corresponding upstream intergenic regions. These DNA sequences were then analyzed based on:
a) Primary sequence.
b) Secondary mRNA structure.
c) DNA structure.
In the first case, we used MEME (Bailey and Elkan, 1994) to find over-represented sequences that might correspond to regulatory motifs. The frequency matrices for the significant sequences in a COG were used to search for new potential members in the complete set of upstream DNA sequences. From these results, new and more representative matrices were built. This protocol was repeated iteratively for each COG until the search converged.
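The iterative matrix refinement described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it replaces MEME with a simple log-odds position weight matrix over a uniform background, keeps only the best-scoring window per upstream region, and treats an unchanged member set as convergence. The function names, the `threshold` parameter and the pseudocount are all assumptions made for the example.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Log-odds matrix from equal-length aligned sites (uniform background assumed)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: math.log2(((counts[b] + pseudocount) / total) / 0.25)
                    for b in BASES})
    return pwm


def best_hit(pwm, seq):
    """Return (score, window) for the best-scoring window in seq, or None."""
    width = len(pwm)
    best = None
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(c not in BASES for c in window):
            continue  # skip windows containing ambiguous bases
        score = sum(col[c] for col, c in zip(pwm, window))
        if best is None or score > best[0]:
            best = (score, window)
    return best


def refine_motif(seed_sites, upstream, threshold, max_rounds=20):
    """Rebuild the matrix from its own hits until the member set stops changing."""
    sites = list(seed_sites)
    members = None
    for _ in range(max_rounds):
        pwm = build_pwm(sites)
        hits = {}
        for gene, seq in upstream.items():
            hit = best_hit(pwm, seq)
            if hit is not None and hit[0] >= threshold:
                hits[gene] = hit[1]
        if not hits or hits == members:
            break  # nothing found, or the search has converged
        members = hits
        sites = list(hits.values())
    return members or {}
```

Here `upstream` is assumed to map gene identifiers to upstream sequences, and `seed_sites` stands in for the significant sites initially reported for one COG.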
To find conserved regulatory regions based on mRNA structure, we evaluated the termination-antitermination regulatory process, often called transcription attenuation, in the upstream regions of our intergenic sequence dataset. In this case, we used the FoldRNA program (Zuker and Stiegler, 1981) to determine the possible secondary structures of the 5' leader regions. We used a set of rules based on the size, free energy, composition and distance of these structures to determine whether they could act as both transcription terminator and antiterminator elements (Merino and Yanofsky, 2002). The predicted attenuators were then grouped based on orthology criteria.
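A rule-based screen of this kind might look like the sketch below. It assumes the hairpins have already been predicted by a folding program and only illustrates the filtering step. The `Hairpin` class, the energy cut-offs and the U-rich tail criterion are hypothetical placeholders, not the rules of Merino and Yanofsky (2002). A terminator candidate is a stable hairpin followed by a U-rich tract, and an antiterminator candidate is an upstream hairpin that overlaps it, so that the two structures are mutually exclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hairpin:
    start: int     # 0-based index of the first paired base in the leader
    end: int       # index one past the last paired base
    energy: float  # predicted free energy in kcal/mol (more negative = more stable)


def is_terminator_like(hp, leader, max_energy=-8.0, tail_len=8, min_u=5):
    """Stable hairpin followed by a U-rich tract (all thresholds are illustrative)."""
    tail = leader[hp.end:hp.end + tail_len].upper().replace("T", "U")
    return hp.energy <= max_energy and tail.count("U") >= min_u


def overlaps(a, b):
    """True if the two hairpins share at least one nucleotide position."""
    return a.start < b.end and b.start < a.end


def find_attenuators(hairpins, leader, max_anti_energy=-5.0):
    """Pair each terminator candidate with overlapping upstream hairpins."""
    pairs = []
    for term in hairpins:
        if not is_terminator_like(term, leader):
            continue
        for anti in hairpins:
            if (anti.start < term.start and overlaps(anti, term)
                    and anti.energy <= max_anti_energy):
                pairs.append((anti, term))
    return pairs
```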
Finally, we also looked for transcription regulatory signals based on static DNA curvature, which is known to play an important role in the transcriptional regulation of some genes. Using an implementation of the BEND program (Goodsell and Dickerson, 1994), we identified upstream sequences with statistically significant DNA curvature. Criteria similar to those described above allowed us to identify representative groups of orthologous genes modulated by this regulatory element.
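The abstract does not state how statistical significance was assessed. One common option, shown here purely as an assumption, is an empirical p-value obtained by comparing the observed score with scores of shuffled versions of the same sequence. The curvature calculation itself is not reproduced: `score_fn` is a hypothetical callable standing in for a BEND-style curvature score.

```python
import random


def empirical_p_value(seq, score_fn, n_shuffles=1000, seed=0):
    """Fraction of shuffled sequences scoring at least as high as the original.

    Shuffling preserves mononucleotide composition only. The +1 terms keep
    the estimate strictly above zero.
    """
    rng = random.Random(seed)
    observed = score_fn(seq)
    bases = list(seq)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        if score_fn("".join(bases)) >= observed:
            at_least += 1
    return (at_least + 1) / (n_shuffles + 1)
```

A sequence would then be called significantly curved when this value falls below a chosen cut-off, which is itself a parameter the analyst must set.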
The results of our search will be presented and discussed from a biological point of view.
References:
Bailey, T.L. and Elkan, C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36. AAAI Press, Menlo Park, California.
Goodsell, D.S. and Dickerson, R.E. 1994. Bending and curvature calculations in B-DNA. Nucleic Acids Res. 22:5497-5503.
Jáuregui, R., Abreu-Goodger, C., Moreno-Hagelsieb, G., Collado-Vides, J. and Merino, E. 2003. Conservation of DNA curvature signals in regulatory regions of prokaryotic genes. Nucleic Acids Res. 31:6770-6777.
Tatusov, R.L., Galperin, M.Y., Natale, D.A. and Koonin, E.V. 2000. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28:33-36.