Ponomarenko M.P., Titov I.I., Ponomarenko J.V., Kolchanov N.A., Mazin A.V.1, Kowalczykowski S.C.1
Institute of Cytology & Genetics, 630090, Novosibirsk, Russia; FAX: +7(3832)356-558; E-mail: pon@bionet.nsc.ru;
1University of California, Davis, California 95616-8665, USA
The RecA protein plays a key role in both DNA repair and homologous recombination. When RecA binds ssDNA, a nucleoprotein RecA filament is formed, which is essential for all biologically important RecA-mediated reactions. It has been commonly accepted that, because the RecA-promoted functions are vitally important for the entire E. coli genome, the RecA filament cannot prefer any particular DNA sequence. Therefore, the recent discovery that the RecA filament binds preferentially to certain sequences (Mazin, Kowalczykowski, 1996) was quite unexpected and requires explanation. For this reason, we processed these data (Mazin, Kowalczykowski, 1996) with the computer system ACTIVITY (Kolchanov, 1998). The ssDNA/RecA-filament affinity was found to be maximal for sequences devoid of the trinucleotide DRV={AAA, AAC, AGA, AGC, GAA, GAC, GGA, GGC, TAA, TGA, AAG, AGG, TAG, TGG, GAG, GGG, TAC, TGC} and to decrease as the DRV concentration in the vicinity of the ssDNA 5' end increases. This concentration is calculated as:
;
where S is a given sequence; R=A/G, V=A/G/C, and D=A/T/G; δ(x,y)=1 if x=y and δ(x,y)=0 otherwise; weight(i) decreases exponentially with position i. On ten training sequences with known ssDNA/RecA-filament affinities, a simple regression was optimized:
.
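The position-weighted DRV concentration described above can be illustrated with a minimal Python sketch. It is not the ACTIVITY implementation. The weight function exp(-decay * i), the value of `decay`, the normalisation by the sum of weights, and the function names are all assumptions introduced here for illustration only.

```python
import math

# Degenerate codes as defined in the text: D = A/T/G, R = A/G, V = A/G/C.
IUPAC = {"D": set("ATG"), "R": set("AG"), "V": set("AGC")}


def matches_drv(triplet):
    """Return True if a 3-letter DNA string matches the degenerate motif DRV."""
    return len(triplet) == 3 and all(
        base in IUPAC[code] for base, code in zip(triplet, "DRV")
    )


def weighted_drv_concentration(seq, decay=0.1):
    """Position-weighted DRV concentration, counted from the 5' end.

    ASSUMPTION: weight(i) = exp(-decay * i) with i = 0 at the 5' end, and the
    result is normalised by the sum of all weights. Both the functional form
    and the value of `decay` are hypothetical, not taken from the abstract.
    """
    seq = seq.upper()
    n_windows = len(seq) - 2
    if n_windows <= 0:
        return 0.0
    weights = [math.exp(-decay * i) for i in range(n_windows)]
    hits = sum(w for i, w in enumerate(weights) if matches_drv(seq[i:i + 3]))
    return hits / sum(weights)
```

Under these assumptions a sequence with no DRV trinucleotide scores 0, and a DRV occurrence near the 5' end contributes more than the same occurrence near the 3' end, which is the qualitative behaviour the text describes.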
The significance of this regression was tested on six control sequences. The resulting linear correlation coefficient was r=0.812; significance, . Thus, this regression reliably predicts the ssDNA/RecA-filament affinity from the ssDNA sequence. The trinucleotide DRV was then superimposed on the genetic code. DRV was found to correspond to codons of lysine, cysteine, serine, tyrosine, glycine, asparagine, tryptophan, arginine, and both glutamic and aspartic acids. In proteins, these residues are frequent on the surface and rare in domain cores (Karlin, 1989). Protein surfaces are commonly associated with functional sites. Hence, the RecA filament is likely to avoid the gene regions encoding protein functional sites and to prefer those encoding domain cores. Consequently, structurally similar proteins may differ in the order of their domains, while their functional sites should be the most conserved. Both of these phenomena are, indeed, well known.
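The codon assignment stated above can be checked against the standard genetic code. The sketch below builds the standard codon table in DNA letters and lists the amino acids encoded by the 18 DRV trinucleotides; the variable and function names are illustrative and are not part of the original work.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# DRV as defined in the text: D = A/T/G, R = A/G, V = A/G/C.
DRV_CODONS = sorted(
    codon for codon in CODON_TABLE
    if codon[0] in "ATG" and codon[1] in "AG" and codon[2] in "AGC"
)


def drv_amino_acids():
    """One-letter codes of the amino acids encoded by DRV codons (stops excluded)."""
    return {CODON_TABLE[c] for c in DRV_CODONS} - {"*"}


def drv_stop_codons():
    """DRV trinucleotides that are stop codons in the standard code."""
    return [c for c in DRV_CODONS if CODON_TABLE[c] == "*"]


if __name__ == "__main__":
    print(len(DRV_CODONS), "DRV codons")
    print("amino acids:", "".join(sorted(drv_amino_acids())))
    print("stop codons:", drv_stop_codons())
```

The resulting set (K, C, S, Y, G, N, W, R, E, D) agrees with the ten residues listed in the text. In the standard code the three stop codons TAA, TAG, and TGA also match DRV.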
This work was supported by a grant from the Russian Basic Research Foundation.