Forces involved in DNA-protein interactions
GENE EXPRESSION, or the horizontal flow of information
Jorge Arevalo
2014
Textbook of Biochemistry with Clinical Correlations, 7e edited by Thomas M. Devlin 2011 John Wiley & Sons, Inc.
Figure 8.7 Genes of tryptophan operon of E. coli.
Described in Yanofsky, C. Trends in Genet. 20: 367, 2004.
Protein-DNA interactions
usually involve some degree of sequence specificity
Figure 8.21 The TATA-binding protein (TBP) has been cocrystallized with DNA.
Figure reproduced with permission from Voet, D., Voet, J., and Pratt, C. W. Fundamentals of Biochemistry. New York: Wiley, 1999. (1999) John Wiley & Sons, Inc.
Structural information is now available on over 400 distinct
DNA-protein complexes, from a wide range of eukaryotic and
prokaryotic sources. Studies of the proteins themselves rarely
provide sufficient insight into the processes of recognition.
The determination of the human genome sequence in 2001 has
enabled reliable estimates to be made for the numbers of genes
with particular functions. Of the 30 000 in total, 13.5 per cent (2308)
are proposed to be involved in nucleic acid binding, of which 1850
(6 per cent of the total) are estimated to be transcription factors.
Non-specific binding
Electrostatic forces are long-range and not very specific. They govern
the attraction between the positively charged protein surface
(all DNA-binding domains have exposed basic side chains) and the
negatively charged DNA phosphate backbone. Once protein and DNA
have been brought close together by electrostatic interactions, the other,
shorter-range forces become effective. These, predominantly
hydrogen bonds between amino acid side chains and nucleic acid bases,
determine the protein-DNA binding specificity.
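The difference in range can be made concrete with a rough numerical sketch. It assumes idealized power laws that are not stated in the slides: a charge-charge (Coulomb) energy that falls off as 1/r and a dispersion (van der Waals) attraction that falls off as 1/r^6, both normalized to 1 at a hypothetical contact distance of 3 Å. The function names and the reference distance are illustrative, and screening by solvent and salt is ignored.

```python
# Relative strength of two idealized interactions as a function of distance.
# Assumptions (not from the slides): Coulomb energy ~ 1/r, dispersion ~ 1/r**6,
# both normalized to 1.0 at a reference distance of 3 angstroms.

R_REF = 3.0  # angstroms; hypothetical contact distance used for normalization


def coulomb_relative(r: float) -> float:
    """Charge-charge energy relative to its value at R_REF (1/r law)."""
    return R_REF / r


def dispersion_relative(r: float) -> float:
    """Dispersion attraction relative to its value at R_REF (1/r^6 law)."""
    return (R_REF / r) ** 6


if __name__ == "__main__":
    for r in (3.0, 6.0, 12.0):
        print(f"r = {r:4.1f} A  coulomb = {coulomb_relative(r):.3f}  "
              f"dispersion = {dispersion_relative(r):.5f}")
```

Under these assumptions, doubling the separation halves the electrostatic term but cuts the dispersion term to 1/64, which is why the short-range forces only matter once the protein is already at the DNA surface.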
Specific binding
(universal amino acid-base interactions)
The only regions where the bases are available for interaction are at
the floor of the grooves. These are paved with nitrogen and oxygen atoms
that can make hydrogen bonds with the side chains of a protein.
1. Hydrogen bonds
Possible binding sites of DNA base pairs
Hydrogen-bond donors and acceptors
are marked in the figure.
Some positions do not
form hydrogen bonds
but can take part in
electrostatic interactions.
The B-DNA major groove is the richer of the two grooves of duplex DNA,
both in information content per se and in its ability to facilitate discrimination
between DNA sequences, which is essential if the appropriate genes are
to be transcribed. Thus, the major groove is generally the site of direct
information readout. Nonetheless, the minor groove is an important target
for some regulatory and structural proteins, especially those that are able
to deform DNA so that the minor groove becomes greatly expanded.
For the GC pair, the major groove exposes a hydrogen-bond acceptor, G N7,
another acceptor, G O6, a hydrogen-bond donor, the C N4 amino group, and finally,
a hydrogen atom at C C5. The minor groove displays a hydrogen-bond
acceptor, G N3, a donor, the G N2 amino group, and an acceptor, C O2.
For the AT pair, the major groove gives the following sequence: an
acceptor, A N7, a donor, the A N6 amino group, an acceptor, T O4, and a methyl group at T C5.
The minor groove displays an acceptor, A N3, a hydrogen atom at A C2, and an acceptor, T O2.
These patterns for potential hydrogen bonds are clearly quite different for
the different base pairs in the major groove, and they could easily be
recognized and distinguished by a protein molecule.
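A minimal sketch of this argument encodes each base-pair edge as a string of functional groups and counts how many base pairs give distinct patterns in each groove. The one-letter codes (A, D, H, M) and the names `MAJOR`, `MINOR` and `distinguishable` are illustrative choices, not notation from the slides. The minor-groove entry for the AT pair uses the standard assignment (A N3 acceptor, A C2 hydrogen, T O2 acceptor).

```python
# Functional groups exposed on the base-pair edges, read across each groove.
# A = H-bond acceptor, D = H-bond donor, H = non-polar hydrogen, M = methyl.
MAJOR = {
    "GC": "AADH",  # G N7, G O6, C N4 amino, C C5-H
    "CG": "HDAA",  # same groups read in the opposite direction
    "AT": "ADAM",  # A N7, A N6 amino, T O4, T methyl
    "TA": "MADA",
}
MINOR = {
    "GC": "ADA",  # G N3, G N2 amino, C O2
    "CG": "ADA",  # C O2, G N2 amino, G N3
    "AT": "AHA",  # A N3, A C2-H, T O2
    "TA": "AHA",  # T O2, A C2-H, A N3
}


def distinguishable(patterns: dict) -> int:
    """Number of base pairs that can be told apart from the pattern alone."""
    return len(set(patterns.values()))


if __name__ == "__main__":
    print("major groove:", distinguishable(MAJOR), "distinct patterns")
    print("minor groove:", distinguishable(MINOR), "distinct patterns")
```

In this encoding the major groove separates all four base pairs, while the minor groove only separates GC/CG from AT/TA.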
Clearly, the major groove is a much better candidate for
sequence-specific recognition than the minor groove for two reasons.
First, the major groove is wider than the minor, and the bases are thus
more accessible to a protein molecule. Second, the pattern of possible
hydrogen bonds from the edges of the base pairs to a protein is
more specific and discriminatory in the major groove than in the minor.
Only a rather limited number of base pairs is needed to provide
unique and discriminatory recognition sites in the major groove.
The figure gives the color codes for the hexanucleotide recognition
sites of three different restriction enzymes: EcoRI, BalI and SmaI. It is
clear that these patterns are quite different, and each can be uniquely
recognized by specific protein-DNA interactions.
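The color-coded figure can be approximated in text. The sketch below assumes the commonly cited recognition sequences of the three enzymes (GAATTC, TGGCCA and CCCGGG), which the slides do not spell out, and reuses an illustrative one-letter code for the major-groove groups of each base pair. The names `MAJOR_BY_BASE`, `SITES` and `major_groove_pattern` are hypothetical.

```python
# Major-groove code per base pair, keyed by the base on the strand being read.
# A = H-bond acceptor, D = H-bond donor, H = non-polar hydrogen, M = methyl.
MAJOR_BY_BASE = {"G": "AADH", "C": "HDAA", "A": "ADAM", "T": "MADA"}

# Recognition sequences, one strand written 5'->3' (assumed, not from the slides).
SITES = {"EcoRI": "GAATTC", "BalI": "TGGCCA", "SmaI": "CCCGGG"}


def major_groove_pattern(seq: str) -> str:
    """Concatenate the major-groove codes of each base pair in the site."""
    return " ".join(MAJOR_BY_BASE[base] for base in seq.upper())


if __name__ == "__main__":
    for enzyme, site in SITES.items():
        print(f"{enzyme:6s} {site}  {major_groove_pattern(site)}")
```

Each six-base-pair site yields a different 24-position pattern, which is the property a sequence-specific protein can exploit.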
Hydrogen bonds of Arg, Gln and Asn
There is no general 1:1 amino acid : DNA base correspondence, and
recognition can occur in a wide variety of ways.
Distribution of amino acid-base interactions in 129 protein-DNA
structures (Luscombe et al., 2001):
                  Gua    Cyt    Ade    Thy    Sum
---------------------------------------------------
Arginine (R)       98      8     19     24    149
Lysine (K)         30      6      4      9     49
Serine (S)         12      2      1      3     18
Asparagine (N)      7     10     18      7     42
Glutamine (Q)       6      2     16      2     26
Glutamate (E)       1     10      1      0     12
---------------------------------------------------
Sum               154     38     59     45    296
The majority of interactions involve O6 and/or N7 atoms of guanine bases
forming hydrogen bonds with the charged ends of long flexible side chains
from the basic residues arginine or lysine, the amide residues glutamine and
asparagine, or the hydroxyl group of a serine.
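The shares quoted for individual pairings follow directly from the table. As a small check, the sketch below recomputes the row and column totals and the fraction of all interactions contributed by chosen amino acid-base pairs. The counts are those of the table above; the names `COUNTS`, `total` and `share` are illustrative.

```python
# Amino acid-base interaction counts from the table (Luscombe et al., 2001).
COUNTS = {
    "Arg": {"Gua": 98, "Cyt": 8, "Ade": 19, "Thy": 24},
    "Lys": {"Gua": 30, "Cyt": 6, "Ade": 4, "Thy": 9},
    "Ser": {"Gua": 12, "Cyt": 2, "Ade": 1, "Thy": 3},
    "Asn": {"Gua": 7, "Cyt": 10, "Ade": 18, "Thy": 7},
    "Gln": {"Gua": 6, "Cyt": 2, "Ade": 16, "Thy": 2},
    "Glu": {"Gua": 1, "Cyt": 10, "Ade": 1, "Thy": 0},
}


def total() -> int:
    """Total number of interactions in the table."""
    return sum(sum(row.values()) for row in COUNTS.values())


def share(residues: list, base: str) -> float:
    """Fraction of all interactions made by the given residues with one base."""
    return sum(COUNTS[r][base] for r in residues) / total()


if __name__ == "__main__":
    print("total interactions:", total())
    print(f"Arg-Gua share:     {share(['Arg'], 'Gua'):.1%}")
    print(f"Asn/Gln-Ade share: {share(['Asn', 'Gln'], 'Ade'):.1%}")
    print(f"all guanine:       {share(list(COUNTS), 'Gua'):.1%}")
```

The Arg-Gua and Asn/Gln-Ade shares round to the 33% and 11% quoted on the following slides.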
Arg-Gua: a perfect H-bonding association
(33% of the total of amino acid-base pair interactions)
The guanidinium moiety offers 2 H-bond donors; the guanine offers 2 H-bond acceptors.
Example: DNA-binding domain of Tc3 transposase from C. elegans, residue Arg C236
PDB code: 1tc3; resolution = 2.45 Å; R-factor = 0.234
Distances labelled in the figure: 1.82 and 1.96 Å
Asn/Gln-Ade: another frequent H-bonding association
(11% of the total of amino acid-base pair interactions)
The formamide group of the side chain offers one H-bond acceptor and one H-bond donor;
the adenine offers one H-bond donor and one H-bond acceptor.
Example: Pit-1 POU domain, residue Asn A44
PDB code: 1au7; resolution = 2.30 Å; R-factor = 0.230
Distances labelled in the figure: 2.13 and 1.96 Å
Cheng et al. (2003) calculated all geometrically plausible
H-bonding arrangements between amino acids and nucleic acid bases
(DNA and RNA recognition). They found 32 possible interactions,
17 of which have been observed in complex structures.
(The number of observed cases for each arrangement is shown in red in the figure.)
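[Figure: observed case counts for the 17 arrangements, in the order extracted from the figure: 2, 18, 84, 5, 1, 2, 6, 183, 26, 3, 10, 1, 5, 2, 3, 25, 7]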
Example: DNA-binding domain of Tc3 transposase from C. elegans, residues Arg C236-A7-A8
PDB code: 1tc3; resolution = 2.45 Å; R-factor = 0.234
Cation-π/H-bond stair motifs involve two nucleobases and an amino acid side chain.
They encompass three different types of interaction:
π-π stacking, H-bond and cation-π interactions.
Zinc finger protein: PDB code 1mey; resolution = 2.20 Å; R-factor = 0.224
Methyltransferase: PDB code 6mht; resolution = 2.05 Å; R-factor = 0.186
Sap-1 ets domain: PDB code 1bc8; resolution = 1.93 Å; R-factor = 0.220
Homeodomain from Drosophila: PDB code 1fjl; resolution = 2.00 Å; R-factor = 0.198
Proteins can bind DNA through the base, the sugar and the phosphate group
Hydrogen bonds with phosphate are not specific, but are of great importance in stabilizing protein-DNA complexes
Guanine exposes the greatest number of potential hydrogen-bonding atoms on the base edge (4 positions)
The polar and charged residues of amino acids play a central role: Arg > Lys > Ser > Thr; Asn and Gln
Acidic residues (Asp and Glu) are used sparingly
Only Gly makes a significant number of interactions
Few interactions are produced by hydrophobic residues
Favored amino acid-base hydrogen bonds: Arg and Lys with G; Asn and Gln with A; Ser and His with G
80% of the interactions made by Ser and Thr are with the DNA backbone
Hydrogen bond geometries
Single: 36.9%
Bidentate: 33.8% (two or more hydrogen bonds are made with a base or base pair)
Complex: 34.1% (a protein residue binds more than one base step simultaneously)
Example: bidentate interaction with Arg
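The three geometries can be told apart with a simple rule, sketched below. The input representation is an assumption made for illustration: each hydrogen bond made by one residue is recorded as the index of the base (or base pair) it contacts. The function name `classify_geometry` is hypothetical, and the criteria used in the original survey may be more detailed.

```python
from typing import Iterable


def classify_geometry(contacted_steps: Iterable[int]) -> str:
    """Classify the hydrogen bonds that one residue makes to DNA bases.

    contacted_steps holds one entry per hydrogen bond: the index of the
    base (or base pair) that the bond reaches.
    """
    steps = list(contacted_steps)
    if not steps:
        raise ValueError("residue makes no hydrogen bonds to bases")
    if len(set(steps)) > 1:
        return "complex"    # bonds reach more than one step
    if len(steps) >= 2:
        return "bidentate"  # two or more bonds to the same base or base pair
    return "single"


if __name__ == "__main__":
    print(classify_geometry([5]))     # one bond to one base
    print(classify_geometry([5, 5]))  # two bonds to the same base, as in Arg-Gua
    print(classify_geometry([5, 6]))  # bonds to two neighbouring steps
```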
2. Van der Waals contacts
Comprise 64.9% of all protein-DNA interactions
Interactions with the DNA backbone (sugar and phosphate) are most prominent
Interactions with the phosphate group dominate because of its high exposure on the DNA surface
Base preference: T > A > G > C
Favored residues: Arg, Thr, Phe, Ile, His, Cys
Phe and His may have ring-stacking interactions with the base ring
Cys in coordinating proteins has a high propensity to contact the DNA backbone
Glu, Ala, Leu and Asp are less favored:
Glu and Asp: unfavorable electrostatic interactions with DNA
Ala and Leu: shortness of their side chains
3. Water-mediated bonds
Nearly as common as direct hydrogen bonds
14.9% of all protein-DNA interactions
70% are with the DNA backbone, mostly the phosphate group
Interactions with purines are more common than with pyrimidines
Polar and charged amino acids are frequently used: Arg, Lys, Asp, Glu, Ser and Thr
Summary
Amino acids                     Mode of interaction          Recognized base
------------------------------------------------------------------------------
Hydrogen bond
  Arg, Lys                      Multiple-donor               G/complex
  His                           Multiple-donor (bifurcate)   G
  Ser                           Multiple-donor (bifurcate)   G
  Asn                           Acceptor + donor             Complex
  Gln                           Acceptor + donor             A/complex
  Asp, Glu                      Multiple-acceptor            Complex
Van der Waals contacts
  Phe, Pro                      Ring-stacking                A, T
  Thr                           Methyl contact               T
  Gly, Ala, Val, Leu, Ile, Tyr  -                            Many (nonspecific)
No base contact
  Cys, Met, Trp
Figure 2
Trends in Biochemical Sciences 2014 39, 381-399. DOI: 10.1016/j.tibs.2014.07.002
Copyright 2014 Elsevier Ltd
Base and shape readout contribute to TF-DNA binding specificity.
(A) Base readout describes direct interactions between amino acids and
the functional groups of the bases. Whereas the pattern of hydrogen bond
acceptors (red) and donors (blue), heterocyclic hydrogen atoms (white) and
the hydrophobic methyl group (yellow) is base pair-specific in the major
groove, the pattern is degenerate in the minor groove.
(B) Shape readout includes any form of structural readout based on global
and local DNA shape features, including conformational flexibility and
shape-dependent electrostatic potential. The DNA target of the IFN-β
enhanceosome (PDB ID 1t2k; top) varies in minor groove shape. The human
papillomavirus E2 protein binds to a DNA binding site (PDB ID 1jj4; bottom)
with intrinsic curvature.
(C) Most DNA-binding proteins use interplay between the base- and
shape-readout modes to recognize their DNA binding sites. However, the
contribution of each mechanism to protein-DNA binding specificity might
vary across TF families. Shape readout dominates for the minor
groove-binding high mobility group (HMG) box protein (PDB ID 2gzk; left).
Base readout is a major contribution in DNA recognition by the bHLH protein
Pho4 (PDB ID 1a0a; right). Both readout modes are more or less equally
present in the DNA binding of a Hox-Exd heterodimer (PDB ID 2r5z; center).
Examples of DNA-protein contacts
Contacts can be made on
two faces or one face of the DNA
Figure 8.24 Helix-turn-helix proteins use one helix to bind in the major groove while the other supports that
binding through hydrophobic interaction.
Redrawn from Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and Watson, J. Molecular Biology of the Cell. New York: Garland, 1994.
lac repressor
Homeodomain
factors
Zn fingers are stabilized by the metal
Figure 8.25 Two different Zn finger motifs are found in transcription factors.
(a) Reproduced with permission from Voet, D., and Voet, J. G. Biochemistry, 2d ed. New York: Wiley, 1993. (1993) John Wiley & Sons, Inc. Parts
(b) and (c) generously supplied by C. Pabo, M.I.T.
TFIIIA, SP1, Gal4, steroid hormone receptor
superfamily
Figure 8.26 Leucine zipper proteins bind to DNA as dimers.
Modified from Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and Watson, J. Molecular Biology of the Cell. New York: Garland, 1994.
Fos
Jun
CREB
Figure 8.27 Transcription factor dimer formation is mediated through helix-loop-helix interactions.
Modified from Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and Watson, J. Molecular Biology of the Cell. New York: Garland, 1994.
Example of non-specific binding:
structure of the nucleosome
Figure 1
Trends in Biochemical Sciences 2014 39, 381-399. DOI: 10.1016/j.tibs.2014.07.002
Copyright 2014 Elsevier Ltd
Structure-based illustration of multiple levels of TF-DNA binding specificity. (A) The basic helix-loop-helix (bHLH) Mad-Max
heterodimer (PDB ID 1nlw) binds to only a subset of putative DNA binding sites (blue). Some TFBSs are inaccessible owing to
nucleosome formation (PDB ID 1kx5), whereas other accessible TFBSs are not selected by the TF. (B) Higher-order determinants
of TF binding include cooperativity with cofactors (e.g., Hox-Exd heterodimer; PDB ID 2r5z), multimeric binding (e.g., p53
tetramer; modeled based on PDB IDs 2ady and 1aie [228]), cooperativity through TF-TF interactions (e.g., IFN-β enhanceosome;
modeled based on PDB IDs 1t2k, 2pi0, 2o6g and 2o61 [59]), and chromatin accessibility due to nucleosome formation (PDB ID
1kx5) [229].
References
Luscombe, N.M. et al., Nucleic Acids Res. 29, 2860-2874 (2001).
Luger et al., Nature 389, 251-260 (1997).