CHAPTER 1: The Transition from Transcription Initiation to Transcription Elongation: Start-site Selection, Initial Transcription, and Promoter Escape
Published: 23 Nov 2021
J. T. Winkelman, B. E. Nickels, and R. H. Ebright, in RNA Polymerases as Molecular Motors: On the Road, ed. R. Landick, T. Strick, and J. Wang, The Royal Society of Chemistry, 2nd edn, 2021, pp. 1-24.
In transcription initiation, RNA polymerase (RNAP) binds to promoter DNA, unwinds a turn of promoter DNA to yield an RNAP–promoter open complex containing an unwound “transcription bubble,” and selects a transcription start site (TSS). In the next step of initiation, termed “initial transcription,” RNAP remains bound to the promoter and synthesizes an RNA product of a threshold length of approximately 11–15 nucleotides (nt). In the final step of initiation, termed “promoter escape,” RNAP breaks free of the promoter to yield a transcription elongation complex that synthesizes the rest of the RNA product. As a result of research over the last two decades, we now have a detailed mechanistic understanding of TSS selection, and we now understand the broad outlines of initial transcription and promoter escape. Here we review the current understanding of TSS selection, initial transcription, and promoter escape, focusing on these processes as they occur in the best characterized example, transcription initiation by Escherichia coli RNAP-σ70 holoenzyme, but also summarizing these processes as they occur in eukaryotic RNAP I, II, and III.
1.2 Transcription Start-Site (TSS) Selection
In contrast to DNA polymerases, RNAP can initiate nucleic acid synthesis using both primer-independent and primer-dependent mechanisms.1–7 In primer-independent transcription initiation, RNAP uses an initiating nucleoside triphosphate (NTP) and an extending NTP or, alternatively, uses a non-canonical initiating nucleotide8,9 [NCIN; an NTP-related compound, such as nicotinamide adenine dinucleotide (NAD+) or its reduced form (NADH)] and an extending NTP. In primer-dependent transcription initiation, RNAP uses a short, 2 to ∼5 nucleotide, RNA primer (“nanoRNA”) and an extending NTP.10–13
In both primer-independent and primer-dependent transcription initiation, TSS selection entails two steps: (1) placing the start-site nucleotide (position +1) and the next nucleotide (position +2) of the template DNA strand into the RNAP active-center product site (“P site”) and addition site (“A site”), respectively; and (2) placing the initiating entity—the initiating NTP or NCIN in primer-independent initiation, or the 3′ nucleotide of the RNA primer in primer-dependent initiation—in the P site and the extending NTP in the A site, respectively.
The position of the TSS relative to promoter core elements (promoter −35 element and −10 element for E. coli RNAP-σ70 holoenzyme) is variable.14–23 The results of a comprehensive analysis of TSS selection by RNAP-σ70 holoenzyme indicate that TSS selection most frequently entails placing the template-strand position located 7 nucleotides downstream of the promoter −10 element into the RNAP active-center P site to serve as position +1 (ref. 14, 22 and 24; see Figure 1.1). However, the TSS can vary over a 5 bp window, allowing placement of template-strand positions located 6, 7, 8, 9, or 10 nucleotides downstream of the promoter −10 element into the RNAP active-center P site to serve as position +1 (ref. 14, 22 and 24).
The variability in TSS selection raises the structural question of how placement of the template strand relative to the RNAP active center can vary by up to 5 nucleotides, which corresponds to variation by up to ∼17 Å (5 nucleotides × ∼3.4 Å per nucleotide). In principle, this variability could be accommodated by differences in RNAP conformation, differences in DNA conformation, or both.
Results from a series of experiments have now revealed, definitively, that variability in placement of the template strand relative to the RNAP active center is mediated by differences in DNA conformation, specifically differences in the extent of transcription-bubble unwinding22,23,25,26 (see Figure 1.1). At most promoters, the energetically most favorable configuration of the RNAP–promoter open complex is one that contains an unwound transcription bubble 13 nucleotides in length and that places the template-strand position 7 nucleotides downstream from the promoter −10 element in the RNAP active-center P site. In order for TSS selection to occur at positions 8, 9, or 10 nucleotides downstream of the promoter −10 element, the downstream DNA duplex is further unwound by an additional 1, 2, or 3 bp, respectively; the unwound DNA is pulled into and past the RNAP active center, and the unwound DNA is accommodated as single-stranded DNA bulges within the transcription bubble, yielding a “scrunched” complex (TSS = 8, 9, or 10 in Figure 1.1). In order for TSS selection to occur 6 nucleotides downstream of the promoter −10 element, the opposite occurs: downstream DNA is rewound by 1 bp, downstream DNA is extruded from the RNAP active center by 1 bp, and the extrusion of DNA from the RNAP active center is accommodated by stretching DNA within the unwound transcription bubble, yielding an “anti-scrunched” complex (TSS = 6 in Figure 1.1).
Scrunching and anti-scrunching during TSS selection have two defining, experimentally detectable, hallmarks: (1) as the position of the TSS changes, the position of the RNAP leading edge relative to DNA changes, but the position of the RNAP trailing edge relative to DNA does not change; and (2) as the position of the TSS changes, the size of the transcription bubble changes. To demonstrate that TSS selection exhibits the first hallmark of scrunching and anti-scrunching, a protein–DNA photocrosslinking approach was used to determine the RNAP leading-edge and trailing-edge positions relative to DNA, in vitro and in vivo, on promoters that use different TSS positions.22,23,25 This approach revealed that, for promoters that use different TSS positions, the RNAP leading-edge position changes, but the RNAP trailing-edge position does not change. To demonstrate that TSS selection exhibits the second hallmark of scrunching and anti-scrunching, a single-molecule magnetic-tweezers approach was employed that allowed direct assessment of transcription-bubble sizes for promoters that use different TSS positions.25 This approach revealed a one-for-one correlation between TSS position and transcription-bubble size.
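The one-for-one relationship between TSS position and transcription-bubble size described above can be written as simple bookkeeping. The following is a minimal sketch, assuming only the approximate values quoted in the text (a 13-nucleotide bubble when the TSS is 7 nucleotides downstream of the −10 element, and ∼3.4 Å per nucleotide); the function name describe_tss and the returned field names are hypothetical and are introduced here for illustration only.

```python
# Illustrative bookkeeping only; numbers are the approximate values quoted in the text.
REFERENCE_TSS_OFFSET = 7        # nt downstream of the -10 element (most frequent TSS)
REFERENCE_BUBBLE_NT = 13        # bubble size in the energetically most favorable open complex
RISE_ANGSTROM_PER_NT = 3.4      # approximate rise per nucleotide
ALLOWED_OFFSETS = range(6, 11)  # the 5 bp TSS window: 6, 7, 8, 9, 10


def describe_tss(tss_offset):
    """Return bubble size and scrunching state for a TSS offset (hypothetical helper)."""
    if tss_offset not in ALLOWED_OFFSETS:
        raise ValueError("TSS offset outside the 6-10 nt window described in the text")
    delta_bp = tss_offset - REFERENCE_TSS_OFFSET  # bp unwound (+) or rewound (-)
    if delta_bp > 0:
        state = "scrunched"
    elif delta_bp < 0:
        state = "anti-scrunched"
    else:
        state = "reference"
    return {
        "delta_bp": delta_bp,
        "bubble_nt": REFERENCE_BUBBLE_NT + delta_bp,
        "state": state,
        "template_shift_angstrom": round(abs(delta_bp) * RISE_ANGSTROM_PER_NT, 1),
    }


if __name__ == "__main__":
    for offset in ALLOWED_OFFSETS:
        print(offset, describe_tss(offset))
```

The sketch encodes only the geometry of the model; it says nothing about the relative energetics of the different TSS positions, which depend on promoter sequence.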
Following the placement of template-strand positions +1 and +2 in the RNAP active-center P and A sites, the initiating entity and the extending NTP bind, through complementary base-pairing interactions, to template-strand positions +1 and +2 (ref. 27 and 28). In primer-independent transcription initiation, binding of the initiating entity to the RNAP active-center P site is facilitated by a module of the σR3–σR4 linker termed the “σ finger”,29–31 which reaches into the RNAP active-center cleft and interacts with the DNA template strand.29,30,32–34 As a result of interactions with the σ finger, the DNA template strand in the immediate vicinity of the RNAP active center adopts an A-form helical conformation that facilitates binding of NTPs or NCINs.29
In primer-dependent transcription, a 2 to ∼5 nucleotide RNA binds to the template strand, with its 3′ nucleotide base-paired to template-strand position +1 in the RNAP active-center P site. Primer-dependent transcription initiation can bypass the requirement for the σ finger, presumably because base pairing of an RNA primer to the DNA template strand pre-organizes the template strand as one strand of an RNA–DNA double helix and constrains the template strand to interact with the RNAP active center, facilitating binding of the extending NTP.31–33
1.2.2 Sequence Determinants
It has been known for over four decades that promoter DNA sequence in the TSS region is a key determinant of TSS selection in primer-independent transcription: specifically, there is a strong preference for a purine:pyrimidine base pair (R:Y, nontemplate strand:template strand) at position +1 in primer-independent transcription,35–41 and there is a moderate preference for a pyrimidine:purine base pair (Y:R) at position −1 in primer-independent transcription.35,42 The strong preference for R:Y at position +1 is attributable to a strong preference of the enzyme active-center P site for a purine NTP or purine-containing NCIN as initiating entity.35–38,43,44 The moderate preference for Y:R at position −1 is attributable to the energetically favorable inter-chain base-stacking interaction that occurs between R at template-strand position −1 and an initiating NTP base-paired to template-strand position +1.34 An analogous preference for Y:R, and an analogous energetically favorable inter-chain base-stacking interaction by template-strand R, are observed in primer-dependent initiation;13 in this case, inter-chain base-stacking occurs between R at the template-strand position immediately upstream of the primer binding site and a primer base-paired to the primer binding site.13
A comprehensive analysis of TSS selection on a promoter library containing all 4^7 (∼16 000) sequences at positions 4–10 bp downstream of the −10 element of a consensus σ70-dependent promoter confirmed and quantified the preferences for R:Y at position +1 and Y:R at position −1 in primer-independent transcription.14
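The two sequence preferences described above (R on the nontemplate strand at position +1, and Y on the nontemplate strand at position −1) can be checked for each candidate TSS in the 6–10 nucleotide window. The following is a minimal sketch, not the analysis used in the cited work: it only reports whether each preference is satisfied and assigns no weights or probabilities. The function name tss_sequence_preferences, the input convention (first character is the nucleotide immediately downstream of the −10 element), and the example sequence are assumptions made for illustration.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")
TSS_WINDOW = range(6, 11)   # candidate TSS offsets downstream of the -10 element
LIBRARY_SIZE = 4 ** 7       # all sequences at 7 randomized positions: 16384


def tss_sequence_preferences(downstream_nontemplate):
    """For each candidate TSS offset, report whether +1 is R and -1 is Y.

    downstream_nontemplate: nontemplate-strand sequence (5' to 3') whose first
    character is the nucleotide immediately downstream of the -10 element.
    Returns a list of (offset, plus1_is_purine, minus1_is_pyrimidine) tuples.
    """
    seq = downstream_nontemplate.upper()
    if len(seq) < max(TSS_WINDOW) or set(seq) - (PURINES | PYRIMIDINES):
        raise ValueError("need at least 10 nt of A/C/G/T sequence")
    report = []
    for offset in TSS_WINDOW:
        plus1 = seq[offset - 1]   # nucleotide that would become position +1
        minus1 = seq[offset - 2]  # nucleotide that would become position -1
        report.append((offset, plus1 in PURINES, minus1 in PYRIMIDINES))
    return report


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to exercise the function.
    for row in tss_sequence_preferences("TGCGTCATGC"):
        print(row)
```

In this hypothetical example, offsets 7 and 9 satisfy both preferences; which of them would actually be used also depends on the positional preference for offset 7 and on the other sequence determinants discussed in this section.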
An analogous comprehensive study of TSS selection, in this case on a promoter library containing all 4^10 (∼1 000 000) sequences at positions 1–10 bp downstream of the −10 element, revealed that a nontemplate-strand segment immediately downstream of the promoter −10 element that makes sequence-specific protein–DNA interactions with σ,29,45–47 termed the “discriminator element,” also contributes to TSS selection.22 Purine-rich discriminator nontemplate-strand sequences favor TSS selection at upstream positions; pyrimidine-rich discriminator nontemplate-strand sequences favor TSS selection at downstream positions.22 The effect of the discriminator element on TSS selection explains the atypical TSS selection at the E. coli ribosomal RNA promoter rrnB P1.23 The results of protein–DNA photocrosslinking studies mapping RNAP leading-edge and RNAP trailing-edge positions relative to DNA, magnetic-tweezers studies measuring transcription-bubble size, and crystal structures of complexes containing purine-rich and pyrimidine-rich discriminator elements indicate that discriminator elements manifest their effects on TSS selection by modulating the energetics of scrunching and anti-scrunching.22,23,25
At least one other aspect of promoter sequence contributes to TSS selection: namely, a nontemplate-strand segment at the downstream part of the transcription bubble, termed the “core-recognition element (CRE)”.29,48 The CRE makes sequence-specific protein–DNA interactions with RNAP core that affect formation and stability of the RNAP-promoter open complex.29,49,50 The role of the CRE in TSS selection48 can be understood in terms of effects of RNAP–CRE interactions in defining the downstream edge of the transcription bubble in the RNAP–promoter open complex, which, in turn, defines the extent of scrunching or anti-scrunching in the RNAP–promoter open complex.
1.2.3 Bacterial RNAP vs. Eukaryotic RNAP
Eukaryotes possess three nuclear RNAPs—RNAP I, RNAP II, and RNAP III—each of which is a member of the same multi-subunit RNAP family as bacterial RNAP.51 RNAP I and RNAP III show the same pattern of TSS variation, over a 5 bp window, as does bacterial RNAP, and most likely employ the same scrunching–anti-scrunching mechanism as bacterial RNAP.52–57 Eukaryotic RNAP II, in the absence of accessory factors, also shows the same pattern of TSS variation, over a 5 bp window, as does bacterial RNAP. However, in the presence of the ATP-dependent DNA translocase transcription factor IIH (TFIIH), RNAP II can exhibit TSS variation over a >100 bp window.58,59 The results of optical-tweezers and cryo-electron microscopy (cryo-EM) studies indicate that long-range TSS variation in the presence of TFIIH likely involves a scrunching–anti-scrunching mechanism analogous to that of bacterial RNAP, but having greater range due to the ability to access energy from TFIIH-dependent ATP hydrolysis in addition to energy available from the thermal bath.60,61 Magnetic-tweezers studies, in contrast to the optical-tweezers and cryo-EM studies, have not detected long-range scrunching by RNAP II in the presence of TFIIH,62 but differences in reagents and reaction conditions preclude direct comparison of the different studies.
Eukaryotic RNAP I, RNAP II, and RNAP III, like bacterial RNAP, are able to perform both primer-independent and primer-dependent transcription initiation.51–53 In primer-independent transcription initiation by RNAP I, a structural module of a transcription initiation factor, the Rrn7 zinc ribbon and B-reader, carries out a function analogous to that of the σ finger in bacterial primer-independent transcription initiation, reaching into the RNAP active-center cleft and interacting with the DNA template strand in the immediate vicinity of the RNAP active center, causing it to adopt an A-form helical conformation that facilitates binding of NTPs or NCINs.63–65 In primer-independent transcription initiation by RNAP II, the same role is performed by the transcription factor IIB (TFIIB) zinc ribbon and B-reader,66–70 and, in primer-independent transcription initiation by RNAP III, this role is performed by the Brf1 zinc ribbon.71,72 Surprisingly, although the transcription-factor modules that make these interactions in primer-independent transcription by RNAP I, RNAP II, and RNAP III are homologous to each other, the modules show no sequence or structural homology to the bacterial σ finger.73
1.3 Initial Transcription
1.3.1 Mechanism: On-pathway Reactions
Initial transcription follows TSS selection.74,75 Initial transcription starts differently for primer-independent and primer-dependent transcription initiation. Primer-independent initial transcription starts with formation of a first phosphodiester bond between the initiating NTP or NCIN and the extending NTP, bound in the RNAP active-center P and A sites, respectively, to yield a nascent RNA 2 nucleotides in length (or x + 1 nucleotides for an NCIN comprising x nucleotides), and then proceeds into sequential extension of the nascent RNA to a threshold length of ∼11–15 nucleotides (see Figure 1.2). Primer-dependent initial transcription starts with formation of a first phosphodiester bond between the primer 3′ nucleotide and the extending NTP, bound in the RNAP active-center P and A sites, respectively, to yield a nascent RNA of length y + 1 nucleotides (where y is the length of the primer), and then proceeds into sequential extension of the nascent RNA to a threshold length of ∼11–15 nucleotides.
In both primer-dependent initiation and primer-independent initiation, following formation of the first phosphodiester bond, each subsequent nucleotide-addition step requires translocation of the template strand by one nucleotide step (∼3.4 Å). In order to extend the nascent RNA to a threshold length of 11 nucleotides, the DNA template strand must translocate relative to the RNAP active center by 9 nucleotide steps (∼31 Å) for primer-independent transcription initiation and by 10 − y (where y is the length of the primer) nucleotide steps (e.g., ∼27 Å for a 2 nucleotide primer) in primer-dependent transcription initiation.76
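The step counts and distances quoted above follow from simple arithmetic. The following is a minimal sketch, assuming an NTP (not an NCIN) as the initiating entity in the primer-independent case and the approximate rise of ∼3.4 Å per nucleotide; the function names are hypothetical.

```python
RISE_ANGSTROM_PER_NT = 3.4  # approximate translocation distance per nucleotide step


def translocation_steps(threshold_nt=11, primer_len=None):
    """Number of template translocation steps needed to reach the threshold RNA length.

    primer_len=None models primer-independent initiation with an initiating NTP,
    where the first phosphodiester bond yields a 2-nt RNA without translocation.
    With a primer of length y, the first bond yields an RNA of y + 1 nt.
    """
    first_product_nt = 2 if primer_len is None else primer_len + 1
    if threshold_nt < first_product_nt:
        raise ValueError("threshold length is shorter than the first RNA product")
    return threshold_nt - first_product_nt


def translocation_distance_angstrom(threshold_nt=11, primer_len=None):
    """Approximate distance the template strand moves relative to the active center."""
    return translocation_steps(threshold_nt, primer_len) * RISE_ANGSTROM_PER_NT


if __name__ == "__main__":
    print(translocation_steps(), translocation_distance_angstrom())
    print(translocation_steps(primer_len=2), translocation_distance_angstrom(primer_len=2))
```

For a threshold of 11 nucleotides, this gives 9 steps (about 31 Å) for primer-independent initiation and 10 − y steps for a primer of length y (8 steps, about 27 Å, for a 2-nucleotide primer), in agreement with the values in the text.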
Strikingly, RNAP remains stationary on promoter DNA, anchored through sequence-specific interactions with promoter elements, during the entire process of initial transcription76–80 (see Figure 1.2). Accordingly, in order to reconcile the translocation of DNA template strand relative to the RNAP active center by at least ∼31 Å without movement of RNAP relative to promoter DNA, there must be a large change of RNAP conformation, a large change of DNA conformation, or both, during initial transcription.74,76–81
We now know, definitively, that, following the first nucleotide-addition step, each subsequent nucleotide-addition step involves a large change in DNA conformation: namely, DNA scrunching.76,80,82 The scrunching that occurs in initial transcription is similar to the scrunching that occurs in TSS selection, but, in this case, takes place concurrently with extension of the nascent RNA (see Figures 1.1 and 1.2). Following the first nucleotide addition, each subsequent nucleotide addition entails scrunching of DNA by 1 bp—unwinding DNA downstream of the RNAP active center, pulling the unwound DNA into and past the RNAP active center, and accommodating the unwound DNA as bulges in the unwound transcription bubble (“bubble expansion”)—followed by binding of an extending NTP to the RNAP active-center A site, formation of a phosphodiester bond, and pyrophosphate release.76,80
The scrunching that occurs in initial transcription serves as the mechanism to capture and store the free energy required to break the sequence-specific protein–DNA interactions anchoring RNAP on the promoter, thereby enabling promoter escape.76,80 The free energy anchoring RNAP on promoter DNA is estimated to be ∼13–16 kcal mol⁻¹ (ref. 83–86). The free energy anchoring RNAP on promoter DNA greatly exceeds the free energy available from the thermal bath (∼2 kcal mol⁻¹) and the chemical energy from a single nucleotide-addition event (∼3 kcal mol⁻¹). The scrunching that occurs in initial transcription enables capture of free energy from multiple nucleotide additions and stepwise storage of the captured free energy in the form of stepwise increases in the amount of DNA unwinding (see Figure 1.2, RPitc,2 to RPitc,11). In each scrunching-dependent nucleotide-addition step, 1 bp of DNA is unwound. This unwinding of 1 bp of DNA has an energetic cost of ∼2 kcal mol⁻¹ (∼1 kcal mol⁻¹ for A–T base pairs and ∼1 to ∼3 kcal mol⁻¹ for G–C base pairs; ref. 25 and 87). Therefore, in each scrunching-dependent nucleotide-addition step, chemical energy available from nucleotide addition (∼3 kcal mol⁻¹) is captured and stored by unwinding 1 bp of DNA (∼2 kcal mol⁻¹). For example, after 9 scrunching-dependent nucleotide-addition steps, ∼18 kcal mol⁻¹ has been captured and stored (see Figure 1.2, RPitc,11). This captured and stored free energy is similar to the binding free energy anchoring RNAP on promoter DNA. Upon rewinding of the upstream part of the unwound DNA (−10 element through TSS), the free energy captured and stored in initial transcription is accessed to drive promoter escape (see Figure 1.2; RDe,11).
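The energy bookkeeping in this paragraph can be tabulated step by step. The following is a rough sketch that uses only the approximate average values given above (∼2 kcal mol⁻¹ stored per unwound base pair, ∼3 kcal mol⁻¹ available per nucleotide addition, ∼13–16 kcal mol⁻¹ anchoring energy). It assumes primer-independent initiation with an NTP, ignores the sequence dependence of unwinding energy, and ignores any contribution from σ-finger displacement; the function name is hypothetical.

```python
ANCHORING_KCAL_RANGE = (13.0, 16.0)   # estimated RNAP-promoter anchoring free energy
CHEMICAL_KCAL_PER_ADDITION = 3.0      # approximate energy available per nucleotide addition
UNWINDING_KCAL_PER_BP = 2.0           # approximate average energy stored per unwound bp


def stored_energy_kcal(rna_len_nt, first_product_nt=2):
    """Energy stored by scrunching once the RNA has reached rna_len_nt nucleotides.

    The first phosphodiester bond (yielding first_product_nt) involves no scrunching;
    each later nucleotide addition is assumed to unwind and store exactly 1 bp.
    """
    scrunch_steps = max(rna_len_nt - first_product_nt, 0)
    return scrunch_steps * UNWINDING_KCAL_PER_BP


if __name__ == "__main__":
    low, high = ANCHORING_KCAL_RANGE
    for rna_len in range(2, 12):
        stored = stored_energy_kcal(rna_len)
        print(f"RNA {rna_len:2d} nt: stored ~{stored:4.1f} kcal/mol "
              f"(anchoring estimate {low:.0f}-{high:.0f})")
```

With these averages, an 11-nucleotide RNA corresponds to 9 scrunching steps and about 18 kcal mol⁻¹ of stored energy, the value quoted in the text. The linear model is illustrative and should not be read as a prediction of the escape threshold at any particular promoter.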
As the nascent RNA is extended in initial transcription, the RNA 5′ end first approaches, and then collides with, the σ finger inside the RNAP active-center cleft.32,34,88,89 The RNA 5′ end then progressively displaces the σ finger from the RNAP active-center cleft during subsequent extension of the RNA to lengths of 6–11 nucleotides.88 The collision with, and displacement of, the σ finger provides a second mechanism for stepwise capture of chemical energy from multiple nucleotide additions and stepwise storage of captured energy, in this case in the form of an energetically unfavorable protein conformation (a compressed “protein spring”), and likely assists in driving promoter escape.
Scrunching in initial transcription exhibits the same two defining hallmarks as scrunching in TSS selection: (1) the position of the RNAP leading edge relative to DNA changes, but the position of the RNAP trailing edge relative to DNA does not change; and (2) the size of the transcription bubble changes.76,80,90 The results of single-molecule fluorescence resonance energy transfer (FRET) experiments and protein–DNA photocrosslinking studies establish that the RNAP leading edge moves, and the RNAP trailing edge does not move, during initial transcription.80,82,90 The results of magnetic-tweezers studies establish that the size of the transcription bubble changes during initial transcription.76,82 The results of protein–DNA photocrosslinking studies of complexes engaged in initial transcription indicate that scrunched nontemplate-strand DNA is accommodated between the −10 element and downstream double-stranded DNA, and that scrunched template-strand DNA is accommodated upstream of the RNA–DNA hybrid.90 Published crystal structures of complexes engaged in initial transcription show local disorder of DNA template and nontemplate strands at positions corresponding to the positions of scrunched nucleotides indicated by protein–DNA photocrosslinking.34,89 Unpublished crystal structures define, in detail, the conformation of scrunched nontemplate strand DNA having 1 or 2 nucleotides of scrunched DNA (Y Zhang and RH Ebright unpublished).
1.3.2 Mechanism: Off-pathway Reactions
One important off-pathway reaction during initial transcription is termed “abortive initiation”74,75,91–94 (see Figure 1.3). Abortive initiation entails the synthesis and release of RNA products shorter than the threshold-length RNA product74,75,91–94 (i.e., RNA <11–15 nucleotides in length). Following each nucleotide-addition step in initial transcription, the RNAP–promoter initial transcribing complex either can proceed to the next nucleotide-addition step or can release the RNA product and reform an RNAP–promoter open complex. Following the first nucleotide-addition step in initial transcription, release of the RNA involves breakage of base-pairing with the template strand and diffusion of the RNA product out of RNAP through the NTP-entrance channel. Following subsequent nucleotide-addition steps in initial transcription, release of the RNA product entails the reversal of scrunching—relaxing the scrunched state to yield an unstressed, “unscrunched” state—followed by breakage of base-pairing to the template strand and diffusion of the RNA product out of RNAP, through the NTP-entrance channel (see Figure 1.3). Relaxing the scrunched state to yield an unscrunched state involves translocation analogous to the “backtracking” that occurs in transcription elongation,95–97 and results in extrusion of the RNA 3′ end into the NTP-entrance channel and shortening of the RNA–DNA hybrid (to just 2 bp in the case of primer-independent transcription initiation with an NTP as the initiating entity).
Abortive initiation is a significant modulator of promoter activity both in vitro and in vivo.94 Abortive initiation is a necessary, unavoidable consequence of the “stressed-intermediate” strategy used by RNAP, involving the stepwise capture and storage of energy during initial transcription and, upon reaching a threshold level, the single-step release and recovery of the stored energy to drive promoter escape.76,78,80
Another important off-pathway reaction during initial transcription is “reiterative transcription initiation,” also referred to as “transcriptional stuttering,” or “pseudo-templated transcription”.98–101 Reiterative transcription initiation occurs both in vitro and in vivo at promoters that have homopolymeric sequences at, or immediately downstream of, the TSS and results in synthesis of RNA products having 5′ end sequences that contain additional nucleotides not complementary to the DNA template.14,102–106 RNA products synthesized by reiterative transcription initiation can be released from the RNAP–promoter initial transcribing complex102–105 (“non-productive reiterative transcription”) or can be extended to yield full-length RNA products14,98 (“productive reiterative transcription”). Reiterative transcription initiation provides a means to modulate promoter activity and is a target for transcriptional regulation.14,18,103,107–114 Classic examples of genes whose expression is regulated by reiterative transcription initiation are Escherichia coli pyrBI and Bacillus subtilis pyrG.18,110
It has been hypothesized that reiterative transcription initiation involves “slippage” of the RNA product relative to the DNA template.14,98–100,103,105,106,112,114–116 According to this hypothesis, the RNA translocates—slips—relative to both the DNA template strand and RNAP, in a manner that repositions the RNA 3′ end from the RNAP active-center A site into the RNAP active-center P site, enabling binding of an extending NTP, formation of a phosphodiester bond, and pyrophosphate release. An implication of this hypothesis is that RNA extension in reiterative transcription initiation, unlike RNA extension in productive transcription initiation, might not involve scrunching. (Scrunching is required for translocation of DNA and RNA together relative to RNAP, but would not be required for translocation of RNA, but not DNA, relative to RNAP.) Recent work—involving a combination of protein–DNA photocrosslinking, single-molecule DNA nanomanipulation, X-ray crystallography, and cryo-EM—has confirmed the slippage hypothesis, directly demonstrating that RNA, not DNA, translocates relative to RNAP in reiterative transcription initiation, and directly demonstrating that RNA extension in reiterative transcription initiation does not involve DNA scrunching.106
1.3.3 Sequence Determinants
The sequence of the initial-transcribed region, between promoter positions +1 and +15, affects the kinetics of initial transcription.74,75,117–121
A first example of sequence-dependent effects on initial transcription is provided by dinucleotide sequences associated with inefficient RNAP translocation122 (e.g., GU, AU, CU, and UU dinucleotide sequences), which exhibit inefficient initial transcription and greater susceptibility to abortive initiation.120
A second example of sequence-dependent effects on initial transcription is provided by sequence-dependent reiterative transcription initiation.98–101 As described above, reiterative transcription initiation is observed primarily for templates that contain homopolymeric sequences at, or immediately downstream of, the TSS. Reiterative transcription initiation also is observed at promoters that contain short repeat sequences at, or immediately downstream of, the TSS.14,112 According to the slippage hypothesis described above, homopolymeric and repeat sequences enable RNA to translocate—slip—relative to both DNA and RNAP and, after translocation, to re-establish base-pairing interactions with DNA.
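A simple sequence scan can flag initial-transcribed regions that contain a homopolymeric tract at, or immediately downstream of, the TSS. The following is a minimal sketch under stated assumptions: the text does not specify a minimum tract length or how far downstream "immediately downstream" extends, so the defaults min_run=3 and max_start_position=2 are arbitrary illustrative choices, the function names are hypothetical, and the example sequences are invented. Short repeat sequences, which the text also mentions, are not handled.

```python
def run_length(seq, start):
    """Length of the run of identical nucleotides beginning at 0-based index start."""
    if start < 0 or start >= len(seq):
        return 0
    n = 1
    while start + n < len(seq) and seq[start + n] == seq[start]:
        n += 1
    return n


def has_homopolymer_near_tss(itr_nontemplate, min_run=3, max_start_position=2):
    """True if a homopolymeric tract of at least min_run nt starts at +1..+max_start_position.

    itr_nontemplate: nontemplate-strand sequence whose first character is position +1.
    min_run and max_start_position are illustrative assumptions, not values from the text.
    """
    seq = itr_nontemplate.upper()
    return any(
        run_length(seq, position - 1) >= min_run
        for position in range(1, max_start_position + 1)
    )


if __name__ == "__main__":
    for example in ("ATTTGCC", "GGGACT", "AGCTAGC"):  # invented sequences
        print(example, has_homopolymer_near_tss(example))
```

A positive result only indicates that the sequence context is of the kind associated with slippage; whether reiterative transcription actually occurs at a given promoter must be determined experimentally.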
A third example of sequence-dependent effects on initial transcription is provided by sequence-dependent initiation pausing.121,123–126 Nontemplate-strand YG sequences can result in sequence-dependent pausing in transcription elongation.50,127,128 Nontemplate-strand YG sequences can also result in pausing during initial transcription when present in initial transcribed regions.121,124 At such sequences, RNAP pauses on seconds to tens-of-seconds time scales (resulting in dwell times ∼20 to ∼100 times longer than dwell times at other sequences). As with pausing during transcription elongation,95–97 pausing during initial transcription also can allow entry into “backtracked” states, in which RNAP moves backward relative to DNA and RNA, misaligning the RNAP active center and the RNA 3′ end.124,125 The results of a comprehensive analysis of promoter sequences indicate that initial transcription pauses occur in many promoters and can occur at many positions within the promoter initial transcribed region.121
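Candidate pause-associated sequences can be located by scanning the nontemplate strand of the initial transcribed region for YG dinucleotides. The following is a minimal sketch; the function name, the choice to report the position of the Y (rather than the G or the paused RNA length), and the example sequence are assumptions made for illustration. The scan does not predict pause strength or the exact RNA length at which pausing occurs.

```python
PYRIMIDINES = set("CT")


def yg_positions(itr_nontemplate, region_end=15):
    """Return 1-based positions of the Y in each nontemplate-strand YG dinucleotide.

    itr_nontemplate: nontemplate-strand sequence whose first character is position +1.
    Only dinucleotides lying entirely within positions +1..+region_end are reported.
    """
    seq = itr_nontemplate.upper()[:region_end]
    return [
        i + 1
        for i in range(len(seq) - 1)
        if seq[i] in PYRIMIDINES and seq[i + 1] == "G"
    ]


if __name__ == "__main__":
    # Invented sequence, chosen only to exercise the function.
    print(yg_positions("ATGCGTACGTTGACA"))  # [2, 4, 8, 11]
```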
1.3.4 Bacterial RNAP vs. Eukaryotic RNAP
Available evidence indicates that eukaryotic RNAP I, II, and III perform initial transcription through a DNA scrunching mechanism analogous to that used by bacterial RNAP.60,69,81 DNA footprinting studies have demonstrated transcription-bubble expansion in RNAP II-dependent initial transcription.81 Optical-tweezers experiments on RNAP II-dependent initial transcription have provided evidence for movement of the RNAP leading-edge position, but not the RNAP trailing-edge position, characteristic of DNA scrunching.60 Cryo-EM structures of RNAP II initial transcribing complexes show regions of disorder in the transcription-bubble template and non-template strands61,69 similar to those seen in crystal structures of bacterial initial transcribing complexes engaged in scrunching.34,89,106 In the light of the evidence for scrunching in initial transcription by eukaryotic RNAP, it appears likely that a scrunching mechanism mediates the energy capture and storage required for breakage of RNAP–promoter interactions in promoter escape by eukaryotic RNAP.
1.4 Promoter Escape
Promoter escape entails breaking the sequence-specific protein–DNA interactions that anchor RNAP at the promoter and forming a transcription elongation complex.74,75,77–79 As described above, the primary mechanism to capture, store, and release the energy required to break the interactions that anchor RNAP at the promoter is DNA scrunching.76,80 DNA scrunching enables stepwise capture of chemical energy from multiple nucleotide additions, stepwise storage of captured energy by unwinding DNA to expand the transcription bubble, followed by single-step release of captured and stored energy by rewinding DNA to contract the transcription bubble76,80 (see Figure 1.2). An additional mechanism to capture, store, and release energy to break anchoring interactions may involve stepwise RNA extension and concomitant displacement of the σ finger, followed by single-step release of captured and stored energy by displacing the σ finger out of the RNAP active-center cleft and the RNAP RNA-exit channel.32,34,88
Promoter escape is thought to involve a series of mechanically coupled, probably concerted, reactions involving: (1) extension of the nascent RNA to a threshold length, which at most promoters is 11–15 nucleotides; (2) entry of the RNA 5′ end into the RNAP RNA-exit channel; (3) displacement of the σ finger from the RNAP RNA-exit channel, driven by steric clash with the RNA 5′ end; (4) disruption of protein–DNA interaction between σR4 and the promoter −35 element; and (5) rewinding of the upstream half of the transcription bubble, from the −10 element through the TSS, a process termed “unscrunching”32,76,129 (see Figure 1.2). The product of this series of reactions is a transcription elongation complex containing a threshold length of 11–15 nucleotides of RNA and having an altered RNAP-σ interface, in which interactions previously made between RNAP and the σ finger and between RNAP and σR4 are lost.130,131 (Cryo-EM structures of such early transcription elongation complexes have been determined recently.132,133) Because most RNAP-σ interactions are lost upon promoter escape, the affinity of RNAP for σ decreases upon promoter escape, and σ can, and typically does, dissociate from the transcription elongation complex in a time-dependent fashion.130,134–140
1.4.2 Sequence Determinants
The sequences of the promoter initial transcribed region, the core promoter, and upstream promoter affect multiple quantitative parameters of promoter escape.74,75,119,126,141–144 Three key parameters are: (1) the abortive-to-productive ratio (APR; a measure of the efficiency of promoter escape); (2) the maximum size of abortive transcripts (MSAT; a measure of the number of steps of RNA synthesis, and thus the number of steps of scrunching, required before promoter escape; 10–14 nucleotides at most promoters, but as high as 19 at some promoters118,145,146); and (3) kclear (also called kE; the composite rate constant for the set of reactions from formation of a catalytically competent RNAP–promoter open complex through promoter escape119,143,144).
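The first two parameters can be computed directly from a table of transcript yields. The following is a minimal sketch, assuming that APR is calculated as the total molar yield of abortive transcripts divided by the molar yield of productive (full-length) transcripts, and that MSAT is the longest abortive transcript detected; the function name and the input numbers are hypothetical, and kclear, which requires kinetic data, is not addressed.

```python
import math


def escape_parameters(abortive_yields, productive_yield):
    """Compute (APR, MSAT) from transcript yields.

    abortive_yields: dict mapping abortive RNA length (nt) to molar yield.
    productive_yield: molar yield of full-length (productive) RNA.
    APR is returned as math.inf when no productive RNA is detected;
    MSAT is None when no abortive RNA is detected.
    """
    detected = {length: y for length, y in abortive_yields.items() if y > 0}
    total_abortive = sum(detected.values())
    if productive_yield == 0:
        apr = math.inf
    else:
        apr = total_abortive / productive_yield
    msat = max(detected) if detected else None
    return apr, msat


if __name__ == "__main__":
    # Invented yields, for illustration only.
    hypothetical_abortive = {2: 50, 3: 30, 5: 15, 8: 5}
    print(escape_parameters(hypothetical_abortive, productive_yield=20))  # (5.0, 8)
```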
The sequence of the initial transcribed region can affect APR, MSAT, and kclear through effects on sequence-dependent translocational bias,120 as discussed above. The sequence of the initial transcribed region also can affect APR, MSAT, and kclear through effects on the energetics of scrunching.147 Sequences that require more energy for DNA unwinding in the scrunching process are expected to be associated with higher APR and lower kclear. Sequences that, when unwound during scrunching, store less energy are expected to be associated with higher MSAT.
A large body of evidence indicates that altering the strength of protein–DNA interactions between RNAP and core promoter elements and upstream promoter elements affects promoter escape. One specific instance of a core promoter element that affects promoter escape is the discriminator, the nontemplate-strand segment immediately downstream of the promoter −10 element.23,148 As described above, different discriminator sequences result in different preferences for TSS selection, due to different free energies of scrunching in TSS selection.23,148 Promoters with discriminator sequences that result in an unusually downstream shifted TSS—i.e., a TSS 9 nucleotides downstream of the promoter −10 element—exhibit unusually high amounts of scrunching for TSS selection and tend to exhibit unusually low, nearly zero, APR.23,148 It has been suggested that the unusually high amount of scrunching during TSS selection at these promoters (“pre-scrunching”) results in a lower requirement for scrunching in subsequent initial transcription and thus results in a higher efficiency of promoter escape.23,148
More generally, the strength of core promoter elements and upstream promoter elements correlates positively with APR and MSAT.142,143,149–153 This implies that increasing the strength of the interactions anchoring RNAP on promoter DNA tends to decrease the efficiency of escape and to increase the requirement for scrunching and scrunching-dependent energy storage.142,143,149–153 A striking consequence of this effect is that a fully consensus promoter, paradoxically, can exhibit little, or even no, transcriptional activity, despite having a very high affinity for RNAP and a very high rate of formation of RNAP–promoter open complex, due to having a low, or even zero, rate of promoter escape.143,154,155 Similarly, protein–protein interactions between sequence-specific DNA binding proteins bound to promoter DNA and RNAP bound to promoter DNA can increase APR and MSAT (“repression through protein–protein contact”), again indicating that increasing the strength of the interactions anchoring RNAP on a promoter tends to decrease the efficiency of escape and to increase the requirement for scrunching and scrunching-dependent energy storage.156,157
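The consensus-promoter paradox can be illustrated with a deliberately simplified toy calculation. The sketch below is not a model from the cited literature; it assumes that each round of initiation consists of two sequential, irreversible steps, open-complex formation (hypothetical rate constant `k_open`) followed by the reactions summarized by kclear (`k_clear`), so that the mean time per transcript is the sum of the two mean step times. All names and values are illustrative assumptions.

```python
def initiation_rate(k_open: float, k_clear: float) -> float:
    """Toy steady-state output (transcripts per unit time) for two
    sequential irreversible steps: open-complex formation, then escape.

    Assumption: mean time per round = 1/k_open + 1/k_clear.
    """
    if k_open <= 0:
        raise ValueError("k_open must be positive")
    if k_clear <= 0:
        # No promoter escape means no productive transcription.
        return 0.0
    return 1.0 / (1.0 / k_open + 1.0 / k_clear)


# Hypothetical rate constants (arbitrary units), not measured values.
moderate_promoter = initiation_rate(k_open=1.0, k_clear=1.0)
# "Consensus-like" case: fast open-complex formation, very slow escape.
consensus_like = initiation_rate(k_open=10.0, k_clear=0.01)
print(moderate_promoter, consensus_like)
```

Under these assumptions, output is bounded by the slower step, so a tenfold faster open-complex formation cannot compensate for a much slower escape, and output falls to zero when the escape rate is zero.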
Efforts have been made to generate mathematical models of promoter escape based on promoter sequence, primarily initial transcribed sequence, but, so far, these have had only limited success.147
1.4.3 Bacterial RNAP vs. Eukaryotic RNAP
Less information exists about the mechanisms of promoter escape for eukaryotic RNAP I, II, and III. Eukaryotic RNAP I, II, and III all have an active-center cleft that can accommodate an RNA–DNA hybrid of 9–10 bp.158–160 In all three cases, as in bacterial RNAP, a transcription factor enters the RNAP active-center cleft and interacts with the template strand to facilitate primer-independent transcription initiation, threads through the RNAP RNA-exit channel, and must be displaced out of the RNAP RNA-exit channel upon synthesis of an RNA product of ∼11 nucleotides or greater in length.158–160 Accordingly, it seems possible, even likely, that promoter escape by eukaryotic RNAP I, II, and III, like promoter escape by bacterial RNAP, involves a series of mechanically coupled, probably concerted, reactions involving extension of the RNA to a threshold length of ∼11 nucleotides, entry of the RNA 5′ end into the RNAP RNA-exit channel, displacement of an initiation-factor module from the RNAP RNA-exit channel driven by steric clash, disruption of a subset of transcription factor–DNA interactions, and rewinding of the upstream half of the transcription bubble.
A key objective for future work is the determination of high-resolution structures defining the conformations of scrunched nontemplate-strand and template-strand DNA segments at each step of initial transcription. Published structural data for RNAP–promoter initial transcribing complexes show regions of local disorder of nontemplate and template strands that indicate locations of scrunched DNA segments, but do not define conformations of scrunched DNA segments.34,69,89
Another key objective for future work is a comprehensive elucidation of the sequence determinants for each quantitative parameter for initial transcription and promoter escape. Based on the information obtained to date, it is expected that the promoter sequence will influence initial transcription and promoter escape through effects on DNA unwinding energies. It also is expected that the promoter sequence will influence initial transcription and promoter escape through effects on protein–DNA interactions with core promoter elements and upstream promoter elements. In principle, promoter sequence also may influence initial transcription and promoter escape through effects on formation of DNA hairpins, or other multi-stranded DNA secondary structures, within the transcription bubble of an RNAP–promoter open complex or within the expanded transcription bubble of an RNAP–promoter initial-transcribing complex.
Another key objective for future work is the analysis of the regulation of initial transcription and promoter escape. Considerable data indicate that these processes are regulated, but almost nothing is known about how they are regulated.
Finally, elucidation of TSS selection, initial transcription, and promoter escape by eukaryotic RNAPs is a priority. In particular, it is important to clarify whether RNAP II, together with TFIIH, drives long-range scrunching in long-range TSS selection, as indicated by optical-tweezers and cryo-EM analyses,60,61 but not by magnetic-tweezers analyses.62
- Transcription start site (TSS)
the promoter position that occupies the RNAP active-center product site (P site) during transcription initiation and, as such, base pairs with the initiating entity in primer-independent transcription initiation and with the primer 3′ end in primer-dependent transcription initiation.
- TSS selection
the process of selecting a TSS by placing the TSS template-strand nucleotide in the RNAP active-center P site during transcription initiation.
- Initial transcription
the process of synthesizing the first ∼11–15 nucleotides of an RNA product by a promoter-anchored RNAP initial transcribing complex.
- Promoter escape
the process of breaking interactions anchoring RNAP on promoter DNA upon synthesis of a nascent RNA having a threshold length of ∼11–15 nucleotides, enabling transformation of an initial transcribing complex into a transcription elongation complex.
- Transcription bubble
the region of unwound DNA within a transcription initiation complex, an initial transcribing complex, or a transcription elongation complex.
- Scrunching
expansion of the transcription bubble by unwinding downstream DNA, pulling unwound DNA into the RNAP active-center cleft past the RNAP active center, and accommodating the unwound nontemplate- and template-strand single-stranded DNA as bulges within the transcription bubble. Scrunching occurs in TSS selection when selecting TSS positions downstream of the modal TSS position. Scrunching also occurs in initial transcription during each nucleotide-addition step subsequent to the first. Scrunching during initial transcription enables stepwise capture and storage of free energy for subsequent use in driving promoter escape.
- Anti-scrunching
contraction of the transcription bubble by rewinding downstream single-stranded DNA and extrusion of the rewound DNA from the RNAP active-center cleft. Anti-scrunching occurs in TSS selection when selecting TSS positions upstream of the modal position.
- Unscrunching
contraction of the transcription bubble by rewinding of upstream single-stranded DNA and extrusion of the rewound DNA from the RNAP active-center cleft. Unscrunching occurs in promoter escape and enables single-step release of captured and stored energy.
- Primer-independent transcription initiation
transcription initiation without a primer, using either a nucleoside triphosphate (NTP) or a non-canonical initiating nucleotide (NCIN) as the initiating entity.
- Primer-dependent transcription initiation
transcription initiation using an RNA primer.
The authors acknowledge support from a Helen Hay Whitney fellowship (JTW) and NIH grants GM041376 (RHE) and GM118059 (BEN).