- 1.1 Why Triplexes?
- 1.1.1 Triplets and Triplex Motifs
- 1.1.2 Base, Sugar and/or Phosphate Modifications
- 1.2 Stabilising Triplexes
- 1.2.1 Enhancing Stacking and Hydrophobic Interactions
- 1.2.2 Locking the Sugar Pucker
- 1.2.3 Adding Positive Charge(s)
- 1.2.4 Removing Negative Charge(s)
- 1.2.5 Triplex-binding and Cross-linking Agents
- 1.3 Decreasing pH Dependence
- 1.3.1 Pyrimidine Analogues
- 1.3.2 Purine Analogues
- 1.4 Recognising Pyrimidine–Purine Base Pairs
- 1.4.1 Null Bases and Abasic Linkers
- 1.4.2 Natural Bases
- 1.4.3 Analogues for CG Recognition
- 1.4.4 Analogues for TA Recognition
- 1.4.5 Other Approaches
- 1.5 Towards Mixed Sequence Recognition at Neutral pH
- 1.6 Outlook
Chapter 1: DNA Recognition by Parallel Triplex Formation
Published: 08 Mar 2018
Special Collection: 2018 ebook collection
Series: Chemical Biology
K. R. Fox, T. Brown, and D. A. Rusling, in DNA-targeting Molecules as Therapeutic Agents, ed. M. J. Waring, The Royal Society of Chemistry, 2018, ch. 1, pp. 1-32.
Triplex-forming oligonucleotides (TFOs) are sequence-specific DNA-recognition agents that bind within the duplex major groove by forming Hoogsteen hydrogen bonds with exposed groups on the base pairs, generating a triple-helical, or triplex, structure. The unique recognition properties of these molecules have been exploited in such diverse areas as gene-targeting, diagnostics and bionanotechnology. However, the applications of TFOs containing natural nucleotides are somewhat restricted by their low affinity and slow association kinetics at physiological pH, as well as a requirement for oligopurine–oligopyrimidine duplex target sequences. In recent years these limitations have been overcome by using TFOs incorporating novel base, sugar and backbone modifications, and triplex formation at mixed sequence duplex targets with high affinity at physiological pH is now achievable. This review will discuss the structural properties and solution requirements of DNA triplexes, with reference to the state-of-the-art modifications used to improve the DNA-recognition properties of TFOs.
1.1 Why Triplexes?
Triplex-forming oligonucleotides (TFOs) bind in the duplex major groove by forming hydrogen bonds with exposed groups on the Watson–Crick (W–C) base pairs, generating a triple-helical structure (e.g., Figure 1.1). The unique base–base recognition properties of these molecules can be exploited as a means to target duplex sequences present or embedded within natural or synthetic DNA.1,2 Unlike most DNA-recognition agents, such as polyamides, TFOs are capable of targeting extended sequences, with a relatively low propensity to bind to non-target sites. In this way TFOs have been exploited as gene-targeting agents for modulating gene expression,3,4 as a means to detect and/or isolate plasmid and genomic DNA for molecular biology or diagnostics,5,6 and as a tool to introduce functionality into DNA nanostructures engineered for bionanotechnology or synthetic biology.7 Despite this, the applications of TFOs that contain natural nucleotides are often restricted by their low binding affinity and slow association kinetics at neutral pH, as well as a requirement for oligopurine–oligopyrimidine duplex target sequences. To overcome these limitations a variety of base, sugar and phosphate modifications have been developed to allow triplex formation at mixed-sequence targets with high affinity at neutral pH. This chapter will review the developments and current state-of-the-art of nucleotide modifications used to improve the triplex-forming properties of oligonucleotides.
Triplex-directed DNA recognition. (a) Structure of a parallel DNA triplex (PDB code: 1D3X); (b) Chemical structures of C+–GC and T–AT base triplets (R is deoxyribose); (c) Triplex sequence used to characterise the triplex-forming properties of an oligonucleotide containing a single nucleotide analogue (at position X) against each of the four base pairs (at position ZY) by fluorescence melting using molecular beacons (F is a fluorophore: Q is a quencher). In each case the third strand is shown in red and the duplex in black.
1.1.1 Triplets and Triplex Motifs
Triplexes were first observed experimentally over 60 years ago by Rich and co-workers after mixing the polyribonucleotides poly-U and poly-A in a 2 : 1 ratio.8 Additional studies demonstrated that poly-C and poly-G could generate a similar structure under low-pH conditions,9 and since then a variety of DNA and RNA triplexes have been identified.10–13 The binding of an oligonucleotide within the major groove is asymmetric and can occur in either a parallel or antiparallel orientation relative to the oligopurine-containing strand of the target duplex. Pyrimidine-rich oligonucleotides bind in a parallel orientation under slightly acidic conditions (pH<6.0), with T and protonated C forming Hoogsteen hydrogen bonds with AT and GC base pairs, generating the base triplets T–AT and C+–GC, respectively (Figure 1.1b).1,2 (In this chapter the notation X–ZY refers to a triplet, in which the third strand base X interacts with the duplex base pair ZY, forming hydrogen bonds to base Z.) In contrast, purine-rich oligonucleotides bind in an antiparallel orientation, with A and G forming reverse-Hoogsteen hydrogen bonds with AT and GC base pairs respectively, generating A–AT and G–GC triplets.14,15
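The parallel recognition code described above (T opposite AT, protonated C opposite GC, with the third strand running in the same direction as the purine strand) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `design_parallel_tfo` and the example target sequence are hypothetical and are not taken from the chapter.

```python
# Minimal sketch of the parallel (pyrimidine-motif) recognition code.
# The helper name and example sequence are hypothetical illustrations.

PARALLEL_CODE = {"A": "T", "G": "C"}  # T-AT and C+-GC triplets


def design_parallel_tfo(purine_strand: str) -> str:
    """Return a third-strand sequence (5'->3') for a purine strand written 5'->3'.

    The third strand binds parallel to the purine strand, so the sequence is
    mapped base by base without reversal. Pyrimidines in the target strand
    (pyrimidine-purine inversions) have no natural partner in this simple
    code and are reported as an error.
    """
    strand = purine_strand.upper()
    interruptions = [i for i, base in enumerate(strand) if base not in PARALLEL_CODE]
    if interruptions:
        raise ValueError(f"non-purine base(s) at positions {interruptions}")
    return "".join(PARALLEL_CODE[base] for base in strand)


print(design_parallel_tfo("AAGAGGAAAG"))  # TTCTCCTTTC
```

The sketch deliberately rejects targets containing pyrimidine interruptions, since recognising those requires the analogues discussed in Section 1.4.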
In theory, both triplex motifs could be usefully exploited for the recognition of unique duplex sequences but the greater stability of the parallel motif has meant it has been more widely adopted. Parallel triplexes are intrinsically more stable than their antiparallel counterparts because T–AT and C+–GC triplets are structurally isomorphic; that is, if the C-1′ atoms of their W–C base pairs are superimposed, the positions of the C-1′ atoms of the third strand are almost identical.16 This minimises backbone distortions of both the third strand and duplex between adjacent triplets. In contrast, antiparallel triplets are not isomorphic and lead to structural distortions at the junctions between consecutive triplets. The use of the antiparallel motif is also hampered by the tendency of purine-containing oligonucleotides to self-associate into structures such as G-quadruplexes and GA-duplexes, which compete with triplex formation and reduce the effective TFO concentration. It should also be noted that both G–GC and T–AT triplets can be generated in both binding motifs, and GT-containing oligonucleotides can therefore be designed to bind in either orientation. However, the non-isomorphic nature of these two triplets means that the most stable orientation is dependent on the number of GpT and TpG steps.17 This chapter will therefore focus on triplexes generated through the parallel binding motif using pyrimidine-rich oligonucleotides.
1.1.2 Base, Sugar and/or Phosphate Modifications
We and others have characterised a variety of novel base, sugar and phosphate modifications designed to improve the triplex-forming properties of oligonucleotides and a wealth of data has been generated on the affinity, kinetics and selectivity of triplexes containing these modifications. However, it is often hard to compare the effectiveness of a given modification between studies, since its properties will depend on its positioning within the third strand, the sequence context, length of third strand and/or duplex, as well as the pH and other solution conditions, such as the presence of divalent cations. Most experiments have involved the characterisation of a single substitution within a third strand by examining its interaction with duplexes containing each of the four base pairs at the same position. In this way an X–ZY triplet is generated, where X is the analogue under study and ZY is either an AT, TA, GC or CG base pair in turn (e.g., Figure 1.1c). Often the modification is compared with the most effective natural nucleotide at the same position, i.e., with T, C+, G or T opposite an AT, GC, TA or CG base pair respectively. The formation of G–TA and T–CG triplets for recognising pyrimidine–purine base pairs will be discussed in Section 1.4.2.
The simplest and most common means for characterising triplex stability is by ultraviolet melting, in which the triplex thermal stability is determined from the temperature-dependent change in absorbance at 260 nm, generating a melting curve, from which the melting temperature (Tm) is estimated. However, the analysis of such melting curves is not always straightforward, since the triplex–duplex and duplex–single-strand transitions often overlap. We find that a better approach is to use synthetic oligonucleotides that contain molecular beacons and to measure the fluorescence melting curves.18 This works best when the fluorescence quencher (e.g., dabcyl) is attached to the TFO and the fluorescent group (e.g., fluorescein) is attached to one of the duplex strands (as shown in Figure 1.1c). In this way, the concentration of the TFO can be varied without altering the background fluorescence. We have used this strategy to characterise 15 of the nucleotide analogues described in this chapter using the same model triplex (Figure 1.1c), with experiments undertaken using the same buffer conditions. (Experiments were performed in 50 mM sodium acetate buffer (pH 6.0) containing 200 mM sodium chloride using a temperature gradient of 0.2 °C min−1; no hysteresis between melting and annealing curves was observed.) To allow the reader to make a useful, and unbiased, comparison between these modifications we have included a table of Tm values later in this chapter (Table 1.1).
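One common way to extract a Tm from a melting curve is to take the temperature at which the first derivative of the signal is largest. The sketch below illustrates this on a synthetic curve; it is not the analysis procedure used for Table 1.1, and the sigmoidal model, the transition width and the function names are assumptions made purely for illustration.

```python
import math


def two_state_curve(temps, tm, width=2.0):
    """Synthetic, normalised melting signal for a single transition.

    A logistic curve is assumed here for illustration only; real data
    would come from the fluorescence or absorbance measurement.
    """
    return [1.0 / (1.0 + math.exp(-(t - tm) / width)) for t in temps]


def estimate_tm(temps, signal):
    """Estimate Tm as the temperature where dS/dT (central difference) peaks."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


temps = [20.0 + 0.5 * i for i in range(101)]  # 20 to 70 °C in 0.5 °C steps
signal = two_state_curve(temps, tm=39.4)      # 39.4 °C is the T-AT value in Table 1.1
print(estimate_tm(temps, signal))             # within one grid step of 39.4
```

The estimate is limited by the temperature spacing of the data, and overlapping transitions (as noted above for ultraviolet melting) would give more than one derivative peak.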
Comparing nucleotide modifications. Melting temperatures (Tm) determined for various parallel triplexes containing a central X–ZY triplet, where X is a natural or modified nucleotide, and ZY is each base pair in turn (sequences shown in Figure 1.1c). Experiments were undertaken as described in the main text and (–) indicates that the melting temperature was lower than 25 °C.
| Modification class | X (TFO) | ZY = AT | ZY = GC | ZY = CG | ZY = TA |
|---|---|---|---|---|---|
| Natural nucleotides | A | 28.2 | 30.0 | – | – |
| Natural nucleotides | G | – | 28 | – | 30.6 |
| Natural nucleotides | C | – | 40.5 | 28.4 | – |
| Natural nucleotides | T | 39.4 | 27.7 | 28.3 | – |
| AT recognition | APdU | 41.8 | 30.9 | 29.5 | – |
| AT recognition | GPdU | 43.7 | 32.3 | 29.9 | – |
| AT recognition | DMAPdU | 43.8 | 30.7 | 28.4 | – |
| AT recognition | BAU | 47.5 | 36.5 | – | – |
| AT recognition | BGU | 47.4 | 35.0 | – | – |
| GC recognition | APdC | 28.8 | 41.1 | 33.7 | – |
| GC recognition | AP | – | 44.6 | – | – |
| GC recognition | AE-N7G | – | 40.3 | – | – |
| CG recognition | MPP | – | – | 31.5 | – |
| CG recognition | APP | – | – | 32.1 | – |
| CG recognition | AE-AP | 39.9 | 31.1 | 32.4 | – |
| TA recognition | APdG | – | 30.2 | – | 31.0 |
| TA recognition | AE-G | – | 36.5 | – | – |
| TA recognition | S | 30.6 | 30.4 | 33 | 34.6 |
| TA recognition | AE-S | 31.3 | 31.2 | 33.6 | 36.5 |
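As a worked example of how the table can be read, the stabilisation afforded by each AT-recognising analogue can be expressed as the difference between its Tm and that of T opposite the same AT base pair. The short sketch below does this arithmetic using only the AT-column values from Table 1.1; the dictionary and function names are illustrative.

```python
# Tm values (°C) copied from the AT column of Table 1.1.
TM_AT = {
    "T": 39.4,
    "APdU": 41.8,
    "GPdU": 43.7,
    "DMAPdU": 43.8,
    "BAU": 47.5,
    "BGU": 47.4,
}


def delta_tm(analogue: str, reference: str = "T") -> float:
    """Tm difference (°C) between an analogue and the reference base, both opposite AT."""
    return round(TM_AT[analogue] - TM_AT[reference], 1)


for name in TM_AT:
    if name != "T":
        print(f"{name}: {delta_tm(name):+.1f} °C relative to T")
```

These differences refer to a single substitution in the model triplex of Figure 1.1c under the stated buffer conditions, so they should not be assumed to transfer to other sequence contexts.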
1.2 Stabilising Triplexes
Triplex stability stems from the formation of two Hoogsteen hydrogen bonds between each base in the third strand and its duplex partner as well as favourable stacking interactions between consecutive bases. Under low-pH conditions the stability of a parallel triplex can be greater than that of its underlying duplex, i.e., the affinity of a third strand for its duplex target is greater than the affinity of a duplex strand for its W–C partner.19 However, the majority of applications proposed for TFOs require that they bind with high affinity at neutral pH. Although a variety of base analogues have been used to alleviate, at least in part, the pH dependence of the C+–GC triplet (see Section 1.3), the affinity of the third strand can also be improved by increasing the stability of the canonical T–AT and C+–GC triplets.
1.2.1 Enhancing Stacking and Hydrophobic Interactions
Base stacking and hydrophobic interactions are important factors that influence the structure and stability of both duplex and triplex DNA. Consequently, several thymine analogues have been prepared with additional aromatic rings across the 4–5 or 5–6 positions, which should increase the aromatic surface area of the base without affecting the hydrogen bonding groups.20–22 However, and somewhat surprisingly, triplexes containing these analogues did not demonstrate any enhanced stability. The best of these, a non-natural pyrido[2,3-d]pyrimidine nucleoside (F; Figure 1.2a), was shown to recognise AT base pairs with an affinity that was similar to, but not greater than, that of unmodified T.20 However, these studies were undertaken with isolated substitutions and it is likely that only multiple adjacent substitutions will improve stability through stacking interactions. A more successful strategy has been to introduce hydrophobic substituents at the 5-position of the base to increase hydrophobic interactions within the major groove. The simplest addition is a methyl group, which probably explains why T–AT triplets are more stable than U–AT and MeC+–GC triplets are more stable than C+–GC.23,24 The addition of a propynyl group (PdU; Figure 1.2b) further extends the hydrophobic surface and it has been shown that each PdU substitution increases the Tm of the triplex by ca. 2.5 °C relative to an unmodified third strand.25–29 Subsequent NMR studies revealed that, as expected, the extended aromatic electron cloud of the PdU nucleotide stacks well over the 5′-neighbouring nucleotides, and is most probably the cause of the increased stabilisation.26 A further study examined the properties of four different C5-amino modified deoxyuridines and showed that the order of stability produced by 5-substitutions is alkyne>E-alkene>alkane>Z-alkene.29 This order must result from steric factors as well as stacking interactions.
The same strategy cannot be applied for increasing the affinity of C for a GC base pair, since the addition of a propynyl group to the 5-position lowers the pKa of the base and increases the pH dependence of the triplet.28 Indeed the attachment of a propargylamino chain to the 5-position (APdC), a similar modification that is discussed further below, generated triplexes of equivalent stability to those formed by cytosine at pH 6.0 (Table 1.1).
Chemical structures of base modifications that increase stacking/hydrophobic interactions. (a) F; (b) PdU; (c) s2T; (d) m5s2C; (e) s8A. For each structure R is deoxyribose unless otherwise stated in the main text.
More recently the introduction of thiocarbonyl groups to the 2-position of thymine (s2T; Figure 1.2c) and 5-methylcytosine (m5s2C; Figure 1.2d), as well as the 8-position of adenine (s8A; Figure 1.2e), has proved to be a useful strategy for increasing triplex stability.30,31 Molecular modelling of a parallel triplex containing s2T in the third strand indicated that the 2-thiocarbonyl group of the 5′-upstream base could interact with the nitrogen atom at the 1 position of the 3′-downstream pyrimidine ring and result in strong stacking effects.28 Indeed triplexes containing multiple substitutions of s2T led to a Tm increase of around 5 °C per modification at pH 7.0. A further enhancement in affinity was seen when s2T was combined with m5s2C, and this was particularly evident for TFOs containing multiple, adjacent substitutions. The m5s2C and s8A analogues were developed for the pH-independent recognition of GC base pairs and are discussed further below (see Sections 1.3.1 and 1.3.2, respectively).
1.2.2 Locking the Sugar Pucker
The affinity of a TFO for its duplex target is affected by its ability to adopt N- or S-type sugar pucker conformations and it has been proposed that the former require less distortion of the duplex purine strand upon triplex formation.32 This is thought to explain why TFOs composed of ribonucleotides exhibit a higher affinity for duplex DNA than those composed of deoxyribonucleotides.33–37 In general, oligonucleotide modifications that favour N-type sugars produce more stable triplexes than their S-type counterparts. The addition of an electronegative group at the 2′-position of the sugar, as in RNA, strongly favours the N-type sugar pucker predominantly due to the gauche effect.38 Consequently various groups have been attached to this position to promote or restrict the sugar to an N-type configuration.
The first chemical moiety to be added to the 2′-position that resulted in improved TFO binding was a methoxy group (2′-OMe; Figure 1.3a).34,36 Subsequent NMR studies confirmed that this resulted from a reduced distortion of the duplex purine strand, enhancing the rigidity of the triplex.32 A better modification that locks the sugar pucker in an N-type configuration and reduces the rotational freedom of the sugar phosphate backbone is bridged/locked nucleic acid (BNA or LNA; Figure 1.3b).39,40 This modification exploits a 2′-O,4′-C methylene bridge to constrain the sugar to N-type and was developed independently by the Wengel and Imanishi groups for use in antisense or antigene applications, respectively. TFOs that contain BNA/LNA residues are markedly more stable than their unmodified counterparts but only when substituted every 2–3 nucleotides.40,41 Further BNA/LNA derivatives have been developed to overcome this sequence restriction. Substitution of the bridge with an ethylene moiety (ENA; Figure 1.3c), which contains an additional carbon, allows triplex formation with fully modified TFOs,42 whilst the introduction of an O–N bond (BNANC; Figure 1.3d) further improved the nuclease resistance of the oligonucleotide.43 TFOs containing the N-methyl derivative of BNANC were stable in serum for over 90 minutes, whereas an unmodified oligonucleotide was completely degraded in 5 minutes. Such modifications are likely to be useful for any applications that require the use of TFOs within a physiological setting. Further thermodynamic and kinetic studies revealed that the enhancement in affinity stemmed from a decrease in the dissociation constant of the TFO.44 An additional advantage of this sugar analogue is that the presence of the nitrogen atom allows functionalisation with other chemical groups, such as fluorophores.
Lastly, 3′-amino-2′,4′-BNA has been developed in which the BNA/LNA modification is combined with the N3′-P5′ modification considered below (Figure 1.3e, see Section 1.2.4).45,46 Although triplexes with this analogue were more stable than their duplex equivalents, they were no more stable than those formed with BNA/LNA alone.46 Further attempts to constrain the sugar pucker to either S-type or N-type include the bicyclo and tricyclo furanose modifications developed by Leumann and co-workers (Figure 1.3f and g).47–49 Bicyclo-DNA contains a 3′-O,5′-C ethylene bridge that locks the sugar in an S-type conformation, while the tricyclo derivative contains an additional cyclopropane unit locking the sugar in an N-type pucker. Studies with TFOs composed of tricyclo-modified thymidine showed an increase in Tm of 2 °C per modification at pH 7.0.49
Chemical structures of sugar modifications that restrict the sugar pucker. (a) 2′-OMe; (b) BNA/LNA; (c) ENA; (d) BNANC; (e) 3′-amino-2′,4′-BNA; (f) bicyclo-DNA; and (g) tricyclo-DNA. For each structure B is the DNA base or base analogue.
1.2.3 Adding Positive Charge(s)
The formation of a triplex brings three polyanionic strands into close proximity, increasing the negative charge density by 50%, and leads to a high degree of charge repulsion. This can be partially screened using high concentrations of monovalent ions (e.g., up to 200 mM of sodium) and lower concentrations of divalent or polycationic ions (e.g., up to 10 mM magnesium or spermine).1,50 Consequently, the incorporation of positively charged moieties into the TFO by their addition to the phosphate, sugar or base has helped increase triplex stability by alleviating, in part, some of this charge repulsion.
1.2.3.1 Phosphate
One means to incorporate charges into a TFO is by their appendage to the phosphodiester backbone. For example, Bruice and co-workers have shown that the addition of positively charged guanidinium linkages (DNG; Figure 1.4a) causes a dramatic increase in TFO affinity.51,52 The synthesis of the ribose derivative has also been reported but to our knowledge this has not yet been studied for its triplex-forming properties.53 Two further modifications that replace the phosphate residues with either cationic dimethylaminopropyl phosphoramidate linkages (PNHDMAP) or N,N-diethyl-ethylenediamine linkages (DEED; Figure 1.4b) have also been characterised.54–56 TFOs with these modifications generated triplexes which were more stable than the underlying duplex at pH 7.0.56 More recently, oligonucleotides containing non-nucleosidic monomers composed of partially protonated amines have been prepared. When incorporated at the TFO termini such modifications lead to a significant enhancement of triplex stability, particularly when positioned at the 5′-end of the TFO.57
Chemical structures of phosphate, sugar and base modifications that introduce positive charge. (a) DNG; (b) DEED; (c) 2′-AE; (d) 4′-AE; (e) pyrrolidine-DNA; (f) US; (g) MeCS; (h) APdU; (i) DMAPdU. For each structure B is a base or base analogue; R is deoxyribose unless otherwise stated in the main text.
1.2.3.2 Sugar
Perhaps the most exploited approach for the introduction of positive charges into a triplex has relied on their addition to the 2′ (Figure 1.4c)58 and 4′-positions (Figure 1.4d)59 of the sugar unit. In both cases, the most stable triplexes were formed by addition of an aminoethoxy side chain with a Tm increase of 3.5 °C and 1 °C per modification at pH 7.0 for the 2′ and 4′ derivatives, respectively. The greater stabilisation afforded by the 2′-derivative has been attributed to the formation of a salt bridge between the positive charge and a pro-R oxygen of the negatively charged phosphate of the purine strand and a favourable N-type sugar pucker as described above.60 In experiments with psoralen-linked oligonucleotides it was suggested that the 2′-aminoethoxy modification is more effective when the positively charged derivatives are clustered together.61 In addition, introduction of these cationic modifications in the 5′-region of the TFOs significantly increased the kon values compared with those of the natural TFO, while no enhancement in the rate of triplex DNA formation was observed when the modifications were in the middle and at the 3′-region.62 It is likely that this effect is due to the nucleation zipper mechanism proposed for triplex formation.63 Replacement of the amine with a guanidinium group, which positions three amines in a plane, has also proved useful.64 Guanidinylation can be achieved post oligonucleotide synthesis and offers the advantage that the group is protonated over a greater pH-range than the amine; the pKa of a primary amine is around 9, whilst the pKa of the guanidinium group is 12.5. It also offers the potential of forming up to five hydrogen bonds. This modification typically gives the same increase in stability as with the primary amine at neutral pH but in principle should give greater triplex stabilisation at higher pH values.
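The statement that the guanidinium group remains protonated over a wider pH range than the primary amine follows directly from the Henderson–Hasselbalch relationship. The sketch below evaluates the protonated fraction using the two pKa values quoted above; it treats each group as an isolated ionisable site, which is an assumption, since the local pKa within a triplex may differ.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a basic group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values as quoted in the text; treated here as isolated groups (assumption).
PKA = {"primary amine": 9.0, "guanidinium": 12.5}

for ph in (7.0, 9.0, 11.0):
    row = ", ".join(
        f"{name}: {fraction_protonated(ph, pka):.3f}" for name, pka in PKA.items()
    )
    print(f"pH {ph:4.1f} -> {row}")
```

Both groups are almost fully protonated at pH 7, which is consistent with their similar stabilisation at neutral pH, whereas only the guanidinium group remains largely charged at pH 11.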
Lastly, substitution of the furanose oxygen with nitrogen, generating pyrrolidine oligonucleotides (Figure 1.4e) can be exploited to position a positive charge next to the pro-R non-bridging phosphate oxygen in the purine strand. However, the degree of stabilisation depends on the attached base; the presence of the cytosine mimic pseudoisocytosine resulted in a Tm increase of 2 °C per modification relative to C (see Section 1.3.1),65 whilst uracil was destabilising relative to T.66,67
1.2.3.3 Base
Since it has been observed that the C+–GC triplet is more stable than T–AT, due to the presence of the positive charge, various polyamines have been appended to different positions of the pyrimidines. Attachment of spermine to the 5-position of uracil68 (US; Figure 1.4f) and the N4 position of methylcytosine69 (MeCS; Figure 1.4g) both led to an increase in triplex stability under physiological pH conditions, though the complexes exhibited decreased sequence selectivity. We have also improved the affinity of T for AT base pairs by preparing the base analogue 5-propargylamino dU (APdU; Figure 1.4h).70 This analogue bears a positive charge attached to the 5-position of U rather than in the stacked ring system (as seen with protonated C). The presence of the alkyne moiety is also expected to contribute to triplex stability by enhancing stacking interactions in the major groove (as discussed previously). TFOs containing multiple substitutions of APdU are markedly more stable than unmodified TFOs, though the complexes are still pH-dependent on account of the requirement for protonation. However, runs of adjacent substitutions are not destabilising, in contrast to protonated C. This demonstrates that removing the charge from the π-stack and placing it in the major groove is a useful approach for stabilising triplexes. Each APdU substitution leads to a typical increase in Tm of ca. 2 °C relative to T whilst retaining sequence selectivity (Table 1.1). The guanidinylated version of APdU (GPdU) also led to a slight enhancement in affinity with a typical increase of ca. 4 °C per modification (Table 1.1).
The synthesis of nucleoside analogues that combine a stabilising base modification with a suitable sugar modification has proved very useful. Bis-amino U (BAU) is a nucleoside analogue that contains both a 5-propargylamino modification on the base and a 2′-aminoethoxy side chain on the sugar.71,72 At physiological pH both modifications are protonated and substantially increase the Tm of the complexes by ca. 8 °C per modification (Table 1.1). The guanidinylated version of BAU (BGU) also leads to a similar enhancement in affinity (Table 1.1). The two positive charges act in different ways to enhance triplex stability: the 2′-aminoethoxy group interacts with a phosphate on the duplex purine strand, while the 5-propargylamino group interacts with a third strand phosphate.71 Positioning this analogue opposite a duplex mismatch decreased the stability of the complex and demonstrated the requirement for positioning the charges at precise locations within the triplex structure.73 Interestingly, BAU also exhibits a greater sequence selectivity than thymidine, with enhanced discrimination against pyrimidine inversions, and removes the requirement for magnesium ions.72 Further analysis revealed that triplexes containing BAU exhibit very slow binding kinetics, stemming from a decreased rate of dissociation as the modification had little effect on the association reaction. The sequence selectivity is also due to the slower dissociation of BAU from AT than other base pairs.74 The 2′-methoxyethyl derivative of APdU has also been prepared and whilst it generates more stable triplexes relative to those that lack the modification it was not as stabilising as BAU.75 The addition of alkynyl modifications to the 5-position of T has been investigated in the context of BNA/LNA and both ethynyl and propargylamine modifications increased triplex stability by >13.5 °C per modification at pH 7.0.76
One of the drawbacks of using such charged nucleoside analogues is the occurrence of side-reactions at the amines during oligonucleotide synthesis and deprotection, which limits their compatibility with other chemical groups. To overcome this problem we have prepared both 5-dimethylaminopropargyl-dU (DMAPdU; Figure 1.4i) and 2′-dimethylaminoethoxy-U, which contain dimethylamines in place of the amine groups.77,78 Triplexes generated with these analogues are more stable than those containing T but are less stable than those containing the equivalent amine modification. We suggest two possible explanations for their slightly lower affinity: firstly, the addition of the methyl groups could sterically hinder the interaction of the TFO; secondly, the amine group in the parent compound may contribute hydrogen-bond donor interactions. Analysis of the kinetics of the DMAPdU modification again revealed that the increase in stability stems from a slower dissociation rate of the modified TFO.78 Another problem that stems from using TFOs containing multiple charges is that they can suffer from off-site binding. To examine this further we used a restriction enzyme protection, selection and amplification assay (REPSA) to isolate sequences that are bound by a heavily modified 9-mer TFO containing six adjacent BAU modifications.79 The TFO was capable of interacting with a variety of different sequences that contained An tracts (n=6) even though the surrounding sequence did not match the remainder of the TFO sequence.
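Because the off-site binding described above was associated with A-tracts, a simple sequence scan can flag candidate secondary sites for a heavily BAU-modified TFO. The following is a minimal sketch only: the function name, the example sequence and the choice to report runs of T as A-tracts on the complementary strand are illustrative assumptions, and the presence of a tract does not by itself demonstrate binding.

```python
import re


def find_a_tracts(sequence: str, min_len: int = 6):
    """Return sorted (start, length, strand) tuples for A-tracts of at least min_len.

    Runs of T in the given strand are reported with strand '-', because they
    correspond to A-tracts on the complementary strand of the duplex.
    Positions are 0-based indices in the given strand.
    """
    seq = sequence.upper()
    hits = []
    for base, strand in (("A", "+"), ("T", "-")):
        pattern = f"{base}{{{min_len},}}"  # e.g. "A{6,}"
        for match in re.finditer(pattern, seq):
            hits.append((match.start(), len(match.group()), strand))
    return sorted(hits)


# Hypothetical test sequence, not taken from the REPSA study.
print(find_a_tracts("GCAAAAAAGCTTTTTTTCG"))  # [(2, 6, '+'), (10, 7, '-')]
```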
1.2.4 Removing Negative Charge(s)
An alternative strategy for decreasing the charge repulsion between the three polyanionic strands is to use TFOs that contain non-charged backbones. Replacement of the phosphate linkage with a 3′-5′-methylphosphonate group (Figure 1.5a) was successfully used for triplex formation using short oligonucleotides containing alternating methylphosphonate and phosphodiester linkers.80 However, subsequent studies with longer fully substituted TFOs showed that this modification was destabilising.81,82 The N3′–P5′ amidate modification, where O3′ of the internucleoside phosphate is replaced by NH (Figure 1.5b),83 increases the binding constant at neutral pH by nearly two orders of magnitude. Triplex binding is probably improved as this modification favours the N-type sugar conformation as discussed above. This modification has also been combined with the addition of a cationic copolymer, which cooperatively stabilises triplex formation and increases association rates by four orders of magnitude.84 Morpholino oligonucleotides are another interesting class of analogues in which the ribose sugar is replaced with a six-membered morpholino ring and the phosphodiester linkage is replaced by a phosphorodiamidate (Figure 1.5c). TFOs containing this modification are less stable than those containing the N3′–P5′ modification at high concentrations of cations but are more stabilising at low ionic strength.85–87
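To put a change in binding constant of "nearly two orders of magnitude" into energetic terms, the standard relation ΔΔG = −RT ln(K_modified/K_unmodified) can be evaluated. The sketch below does this arithmetic; the temperature of 25 °C and the use of exactly 100-fold as an upper bound are assumptions made for illustration, not values reported in the cited work.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def ddg_kj_per_mol(fold_increase: float, temp_c: float = 25.0) -> float:
    """Change in binding free energy (kJ/mol) for a given fold increase in binding constant.

    Uses ddG = -RT ln(fold_increase); the default temperature is an assumption.
    """
    temp_k = temp_c + 273.15
    return -R * temp_k * math.log(fold_increase) / 1000.0


# 100-fold is used as an upper bound for "nearly two orders of magnitude".
print(f"{ddg_kj_per_mol(100.0):.1f} kJ/mol")  # -11.4 kJ/mol
```

At this assumed temperature a 100-fold increase corresponds to roughly 11 kJ/mol (about 2.7 kcal/mol) of additional binding free energy.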
Chemical structures of backbone modifications. (a) 3′–5′-methylphosphonate; (b) N3′–P5′ amidate; (c) morpholino; (d) PNA. For each structure B is a base or base analogue.
Perhaps the most extensively employed uncharged backbone modification is peptide nucleic acid (PNA; Figure 1.5d). PNA is composed of repeating (2-aminoethyl)glycine units to which nucleobases are linked by methylene bridges.88,89 PNA usually interacts with duplex DNA via a mechanism of strand displacement and P-loop formation, requiring two molecules of PNA,90 generating a 2 : 1 PNA : DNA triplex. Two pyrimidine-containing PNA molecules form a local triplex with the purine-containing DNA strand, leaving the pyrimidine DNA strand looped out as a single strand. The resulting triplex is more stable than the equivalent DNA triplex since there is much lower charge repulsion between the three strands. In a few instances PNA can form a 1 : 2 PNA : DNA triplex by simple binding of a PNA third strand to a DNA duplex, though this is usually restricted to cytosine-rich PNAs.91 Although PNAs demonstrate excellent hybridisation properties, the lack of charge often makes such oligomers insoluble unless other cationic groups are attached.
1.2.5 Triplex-binding and Cross-linking Agents
Various small molecules, including edge binders, intercalators and minor-groove binders, have been designed to bind non-covalently to duplex or triplex DNA and increase the stability of these complexes.92 Such ligands can be used free in solution or after their attachment to the 3′- or 5′-end of the TFO via a flexible linker. The latter is most frequently exploited in a physiological setting where it is not possible to add the ligand to the buffer. Triplex-specific ligands are usually composed of extended aromatic ring systems for stacking via intercalation between the base triplets, and some of these incorporate a positive charge to help alleviate the charge repulsion problem. The first to be described was benzopyridoindole (BePI)93 and a wide range of such ligands has since been developed and reviewed elsewhere.94 These compounds have also been used to enhance the affinity of weaker triplexes that are formed at oligopurine sequences that contain pyrimidine interruptions.95 Another approach that has received much attention is the attachment of DNA cross-linking agents to the TFO. The most frequently employed photo-cross-linking agent is 4,5,8-trimethylpsoralen (psoralen), which preferentially intercalates at TpA steps and, upon photoactivation with long-wavelength UV light, undergoes a [2+2] cycloaddition with the adjacent thymidines, cross-linking the TFO to one or both strands of the duplex.96 Initially, cross-linking reactions were restricted to TpA steps located at the 5′-end of an oligopurine tract but we have developed phosphoramidite modifications that allow incorporation anywhere along the TFO, as well as at both ends of the oligonucleotide, generating ‘triplex staples’.97,98
1.3 Decreasing pH Dependence
Triplex formation in the parallel motif suffers from a requirement for low-pH conditions necessary for the protonation of cytosine at N3. Without protonation the C–GC triplet contains a single hydrogen bond between the exocyclic N4 of C and 6-keto group of G. The pKa of cytosine is around 4.5 for the free base but this is elevated within an oligonucleotide and further increased upon triplex formation, particularly in the centre of the triplex.23,32 Runs of contiguous cytosine residues are destabilising as they reduce the pKa of each residue due to competition effects.99–105 Nevertheless, it has been suggested in several reports that under conditions of low pH C+–GC is more stable than T–AT; an effect that has been attributed to electrostatic interactions between the positive charge and the negatively charged phosphodiester backbone and/or favourable stacking interactions between the charged base and the π-stack.30,105,107–109 Interestingly, it has also been shown that a silver ion can displace the N3 proton of C to form the base triplet C(Ag+)–GC which allowed triplex formation at pH 7.0; in a silver-containing buffer a triplex containing five cytosines in the third strand was stabilised by as much as 30 °C.106 In our hands the difference in stability of T–AT and C+–GC at pH 5.0 is around 4 °C, which decreases to about 1 °C at pH 6.0 (Table 1.1). To address the pH-dependence of the C+–GC triplet a variety of pyrimidine or purine analogues have been developed.
1.3.1 Pyrimidine Analogues
Several cytosine analogues have been prepared that exhibit higher pKa values than cytosine. Triplets generated with these analogues have the advantage that they are structurally isomorphic with T–AT and the presence of the charge contributes additional stability. The simplest modification to the pyrimidine nucleus is the addition of a methyl group at the 5-position of C, generating 5-methylcytosine (MeC; Figure 1.6a). The pKa of the ring nitrogen of MeC is increased by 0.1–0.2 pH units.99,107 Indeed, triplexes generated with MeC exhibit a lower pH dependence and higher affinity relative to cytosine. Although this was first attributed to the increase in pKa, it has since been suggested that stabilisation might be entropic in origin, resulting from disruption of the surrounding water structure, greater base stacking, and/or hydrophobic interactions within the major groove.23,24 Alternatively, the improved stacking may increase the residence time of the non-protonated base in the uncharged C–GC triplet, thereby increasing its stability.108 The 2-thiolated version of 5-methylcytosine (m5s2C; Figure 1.2c) gives a further enhancement in affinity, with the base exhibiting a much higher pKa of around 6.3–6.7 depending on sequence context.30
Chemical structures of base analogues that reduce pH dependence. (a) MeC; (b) AP (X is H or CH3); (c) ΨC; (d) PyDDA; (e) oxoC (X is H or CH3); (f) isoG; (g) oxoA; (h) N7G; (i) P1; (j) N7I. For each structure R is deoxyribose unless otherwise stated in the main text.
Another useful analogue to be developed is the C-nucleoside 2-aminopyridine (AP; Figure 1.6b) which was first synthesised independently by the Neidle, Reese and Leumann groups.109–111 AP differs from cytosine by substitution of a carbon at N1 and removal of the 2-carbonyl. Both β and α-anomers were evaluated since the α-anomer is slightly more basic than the β-anomer.109,111 Triplexes containing the β-anomer exhibited a lower pH dependency that was attributed to the increased pKa of around 6.5 for the base, and β-AP generated triplexes that were stable at pH 6.5 even at target sites that contained multiple adjacent GC base pairs.109 A single AP substitution increased the Tm of a triplex by ca. 4 °C at pH 6.0 relative to C (Table 1.1). The 3-methyl and 6-methyl derivatives of AP have also been prepared but did not produce a dramatic improvement in stability.112,113 We have also shown that AP acts cooperatively with the doubly charged thymine analogue BAU to produce triplexes that have nanomolar binding affinities at pH 7.5 in the absence of any divalent metal cations.114 The arrangement of the substitutions is important and oligonucleotides in which these analogues are evenly distributed throughout the third strand bind more tightly than those in which they are clustered together. An even greater increase in affinity can be achieved by using the 2′-aminoethoxy derivative of 3-methyl-AP in combination with 2′-aminoethoxy-T, which generated triplexes with nanomolar binding affinities at pH values as high as 9.0.115 In this instance, the third strand was fully modified with both analogues.
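The link between pKa and pH dependence can be made concrete with the Henderson–Hasselbalch relationship. The sketch below is illustrative only: it uses the pKa values quoted above (about 4.5 for free cytosine and about 6.5 for AP) and ignores the shifts in pKa that occur within an oligonucleotide or on triplex formation, so the numbers indicate the trend rather than the behaviour of any particular TFO. The function name is hypothetical.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch: fraction of a base that carries the proton at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# pKa values as quoted in the text for the isolated bases (assumed not to shift here).
bases = [("cytosine (free base)", 4.5), ("2-aminopyridine (AP)", 6.5)]

for name, pKa in bases:
    for pH in (5.0, 6.0, 7.0):
        print(f"{name:22s} pH {pH:.1f}: {fraction_protonated(pH, pKa):.3f} protonated")
```

On these assumptions about 0.3% of free cytosine is protonated at pH 7.0, compared with about 24% of a base with a pKa of 6.5, which is consistent with the lower pH dependence reported for AP-containing triplexes.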
To remove the pH dependency of targeting GC altogether various uncharged analogues of cytosine have been synthesised. The first was pseudoisocytosine (ΨC; Figure 1.6c) and its 2′-O-methyl derivative, which formed stable triplexes at pH 7.0 under conditions where deoxycytidine and 2′-O-methylcytidine did not.116,117 As with 2-aminopyridine this analogue can be successfully employed for targeting contiguous GC base pairs. Several derivatives of this base have also been characterised and its complicated synthesis has been streamlined.118 The deoxyribose derivative exhibits a lower affinity for GC than the 2′-O-methyl analogue, presumably because the former adopts the less favourable S-type sugar conformation.119 The pyrrolidino derivative produced a 2.5–3 °C increase in Tm per modification and can be used to target contiguous guanines.65,119 Pseudoisocytosine has more frequently been employed for the pH independent recognition of DNA by PNA.120 A similar analogue is pyrazine (PyDDA; Figure 1.6d), which possesses a nitrogen at the 6-position (instead of the usual 1-position), and it too can be used to produce stable triplexes at pH 7.0.121
Lastly, 6-oxo cytosine (oxoC; Figure 1.6e) and its 5-methyl derivative have been studied as potential cytosine mimics.122,123 At low pH these analogues produce triplexes with lower stability than protonated cytosine, though binding is much less pH-dependent. Indeed, at physiological pH they are superior to cytosine. Surprisingly, contiguous substitutions of 6-oxo cytosine are also destabilising.124 The lower stability of the oxoC–GC triplet relative to C+–GC is attributed to unfavourable stacking interactions and/or steric hindrance due to the 6-carbonyl group, which lies close to the furanose oxygen in the anti-conformation that is required in triplexes. This has been partially overcome by attaching the base to the backbone via an acyclic linker, which gives greater flexibility.123 2′-O-Methyl and ribo derivatives of this base have been synthesised, though these produce less stable complexes.122,125 We have since synthesised the 2′-aminoethoxy derivative of this nucleoside but, surprisingly, only a moderate enhancement in affinity was observed.
1.3.2 Purine Analogues
In order to bind to a GC base pair an analogue must present two hydrogen bond donor groups, which can be achieved using a purine nucleus. For example, isoguanine (isoG; Figure 1.6f) switches the positioning of the amino group to the 2-position of the base and generates stable triplets, presumably through the formation of two hydrogen bonds.126 Moreover, its 8-aza derivative can also be used to bind to G and exhibits fluorescent properties that can be exploited to monitor third strand binding.127 Another useful strategy has been to exploit purine analogues that are designed to present the Hoogsteen face of the base by adopting a syn-conformation. The first purine analogues to demonstrate successful binding were 8-oxo-adenine128 (oxoA; Figure 1.6g) and its N6-methyl derivative.129 The presence of the 8-oxo-group forces a syn-conformation, which presents the 6-amino and N7 protons in a suitable orientation for recognition of G. The 7,8-dihydro derivative exhibits similar properties.130 The same strategy has been exploited using 8-thioadenosine (s8A; Figure 1.2e),31 which offers the additional advantage that the thiocarbonyl can provide stacking interactions in the major groove. In all cases, these analogues recognise GC in a pH-independent fashion and generate triplexes that have the same, or similar, stability as those containing MeC at low pH. However, the triplets formed by these bases are not structurally isomorphic with T–AT triplets and consequently are better for targeting contiguous rather than isolated guanines, since this leads to a lower distortion of the TFO backbone.
N7-purine derivatives have also been developed for GC recognition and the first characterised was N7-guanine (N7G; Figure 1.6h).131 Essentially these analogues alter the antiparallel G–GC triplet so that it can be incorporated within the parallel binding motif. Experiments revealed that the base offers pH insensitivity but suffers from sequence constraints; triplexes were three orders of magnitude less stable with alternating, compared with contiguous, substitutions.132 Similar characteristics were also exhibited by other N7 analogues, P1 (Figure 1.6i)133,134 and N7-inosine (N7I; Figure 1.6j).135–137 N7I lacks the amino function of N7G but surprisingly shows stable recognition of guanine. This was attributed to the formation of an unconventional CH–O bond between the carbonyl group of inosine and the CH of guanine. It was postulated that this interaction gives a small, positive, direct electrostatic contribution to stability.136
Several strategies have been employed to overcome the sequence constraints imposed by the lack of isomorphism of the triplets formed by these bases. An acyclic glycerol derivative was employed to attach N7G to the oligonucleotide backbone, though the increase in flexibility did not alleviate this constraint.138 An alternative method to compensate for this loss in binding energy is to add positive charges, as for some of the analogues described above. We have shown that a single substitution with the 2′-aminoethoxy derivative of N7G is as stable as cytosine at pH 6.0 (Table 1.1), but when this nucleotide is employed at alternate positions it produces less stable complexes than MeC at pH 7.0. Optimum triplex formation may therefore require the use of a combination of such bases: N7-purines for binding to contiguous GC base pairs and pyrimidine analogues for binding to isolated guanines. An alternative strategy would be to develop an isomorphous N7 derivative (such as N7-adenine) for targeting A, though the propensity for purines to bind in an antiparallel orientation may create problems.
1.4 Recognising Pyrimidine–Purine Base Pairs
Triplex formation requires oligopurine–oligopyrimidine target sequences and the recognition of sites containing ‘inverted’ pyrimidine–purine base pairs is much harder to achieve. Although this restriction may seem to limit the application of TFOs for gene-targeting purposes, oligopurine–oligopyrimidine target sequences are surprisingly abundant within the human genome.139–141 The targeting of pyrimidine bases is hampered since the Hoogsteen face of both C and T offers just a single conventional hydrogen bonding contact within the major groove. Nevertheless, a variety of strategies have been exploited for generating stable triplexes at oligopurine tracts containing pyrimidine inversions.142
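As an illustration of how candidate target sites might be located, the sketch below scans a sequence for uninterrupted oligopurine tracts on either strand. It is a minimal example rather than the method used in the cited genome surveys: the function name and the default minimum length of 15 are assumptions chosen for illustration, and pyrimidine interruptions are not tolerated.

```python
import re


def oligopurine_tracts(seq: str, min_len: int = 15):
    """Return (start, end, strand) for each uninterrupted purine run of at least min_len.

    A run of A/G on the given strand is reported as '+'; a run of C/T is reported
    as '-' because the purine tract then lies on the complementary strand.
    Coordinates are 0-based, end-exclusive, on the given strand.
    """
    seq = seq.upper()
    hits = []
    for strand, bases in (("+", "AG"), ("-", "CT")):
        pattern = f"[{bases}]{{{min_len},}}"  # e.g. "[AG]{15,}"
        for match in re.finditer(pattern, seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


# Short toy sequence with a relaxed threshold, purely for demonstration.
print(oligopurine_tracts("TTAGGAGAAGCC", min_len=6))
```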
1.4.1 Null Bases and Abasic Linkers
Since the binding of a third strand within the duplex major groove is highly asymmetric it is not possible to switch across the groove to recognise the partner base on the adjacent strand, as this would result in a loss of base stacking, and impose conformational strain on the backbone of the TFO. The simplest means for targeting oligopurine duplex sequences containing pyrimidine interruptions would be to bypass the ‘offending’ base by placing a null or universal base analogue opposite the inversion site. Such analogues are usually aromatic rings that lack the capacity for hydrogen bonding and stabilise the helical structure through stacking interactions alone.143 This can also be achieved using an abasic linker, such as 1,2-dideoxy-d-ribose (φ)144 but results in a loss of binding affinity due to the lack of stacking interactions. Neither of these approaches has yielded stable triple helical structures and both cause a loss of specificity at the skipped base, as any base pair can be tolerated at this position.
1.4.2 Natural Bases
Various studies have investigated the stability of all possible triplet combinations composed of natural bases (Table 1.1).145–147 These have demonstrated that the least destabilising combinations for recognising TA and CG base pairs are G–TA and T–CG (or C–CG). For each of these triplets the third strand forms a single hydrogen bond to the target, resulting in complexes that are less stable and selective than the canonical T–AT and C+–GC triplets. It has been determined that a single base mismatch results in a typical free energy penalty of ∼3 kcal mol−1.148,149 The destabilisation is dependent on the nature and position of a mismatch. Central mismatches are more destabilising than terminal ones since they disrupt the cooperative interactions between neighbouring triplets.149,150
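To put the mismatch penalty in context, a free energy difference can be converted into a ratio of binding constants using ΔΔG = RT ln(K_match/K_mismatch). The sketch below does this for the ∼3 kcal mol−1 value quoted above; the temperature of 25 °C is an assumption, since the conditions of the original measurements are not restated here, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def fold_change_in_binding(ddG_kcal: float, temp_K: float = 298.15) -> float:
    """Ratio K_match / K_mismatch implied by a free energy penalty ddG (kcal/mol)."""
    return math.exp(ddG_kcal / (R_KCAL * temp_K))


# Penalty of ~3 kcal/mol from the text; 25 degC is an assumed temperature.
print(f"{fold_change_in_binding(3.0):.0f}-fold weaker binding")
```

Under these assumptions a single mismatch corresponds to a reduction in the binding constant of roughly 150-fold.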
The G–TA triplet contains a single hydrogen bond between the exocyclic amino group of G and the 4-carbonyl of T.151 This hydrogen bonding arrangement has been confirmed by comparison with various guanine analogues. Removal of the 2-amino group or the 6-oxo group, generating inosine or 2-aminopurine respectively, produces triplets which are less stable than guanine.152 The latter is more surprising as the 6-oxo group is not thought to be involved in binding, although 2-aminopurine also differs from guanine in lacking a hydrogen atom on N1. The stability of the G–TA triplet is affected by the sequence context and flanking T–AT triplets produce more stable complexes than flanking C+–GC triplets (especially on the 3′-side). This is thought to be due to the formation of a second (weaker) hydrogen bond with the T of an adjacent T–AT triplet.151 Stable complexes can be formed when this triplet is present at every fourth position, so long as the triplex contains some C+–GC triplets and T–AT is located on the 3′-side of each G–TA.153 The interaction is further stabilised by the appropriate use of charged base analogues such as 5-propargylamino-dU. Duplex regions of (AT)n can be targeted with GT-containing oligonucleotides, forming alternating G–TA and T–AT triplets, though this interaction is only observed if this is anchored by a more stable triplex.154,155
The parallel T–CG triplet was first proposed by Yoon et al.146 and has been shown to involve a single hydrogen bond between O2 of the third strand thymine and the free C4-amino proton on the duplex cytosine.156 This hydrogen bonding pattern can also be generated with a third strand cytosine forming the C–CG triplet. Up to three consecutive T–CG or G–TA triplets can be tolerated in the centre of a triplex, if the interaction is stabilised by a triplex-binding ligand.157
1.4.3 Analogues for CG Recognition
1.4.3.1 Pyrimidine Analogues
A number of heterocycles based on pyrimidines have been used as a means to recognise inverted CG base pairs within oligopurine tracts. The efficacy of this approach was first demonstrated within the antiparallel motif using 2-pyridone (P; Figure 1.7a).158 2-pyridone utilises a carbonyl oxygen at the 2-position for hydrogen bonding with the 4-amino hydrogen of C in a similar manner to C or T. However, it lacks the 3-nitrogen atom and the 4-carbonyl or amino group of the pyrimidines which should decrease its ability to bind to either AT or GC base pairs. Imanishi and co-workers were the first to examine the use of P in the parallel motif and prepared its deoxyribose and its 2′,4′-BNA monomer (PB). TFOs positioning a single substitution of PB opposite CG exhibited a Tm that was 9 °C higher than when it was attached to deoxyribose.159 However, the triplet was still less stable than either of the canonical base triplets. Nevertheless, this was the first study to demonstrate that combining a base analogue with a stabilising sugar modification compensates, at least in part, for the loss in binding energy at pyrimidine-purine inversions. PB can also interact with an AT base pair though the binding affinity is much weaker. To improve selectivity the bicyclic analogue 1-isoquinolone (QB; Figure 1.7b) was examined.160 It was reasoned that binding to AT base pairs would be sterically hindered by the close proximity of a 4-hydrogen of Q and the 5-methyl group of T on the opposite side of the major groove. Binding to AT was indeed reduced but so too was the desired interaction with CG. We have synthesised the 2′-aminoethoxy derivative of Q but find that in this context it is not effective for recognising CG base pairs.
Chemical structures of base analogues for CG recognition. (a) P; (b) QB; (c) Py; (d) 2APm; (e) 4HT; (f) APP (X is aminopropyl or other group); (g) QPB; (h) gC; (i) 4PC. For each structure R is deoxyribose unless otherwise stated in the main text.
Two further cytosine analogues that have been utilised for the recognition of CG base pairs are 2-pyridine (Py; Figure 1.7c)161 and 2-aminopyrimidine (2APm; Figure 1.7d).162 Since the 2- and 3-nitrogens of these analogues are only weakly basic (the pKa of 2APm is about 3.3) they are unprotonated at all practical pH values. Not only does this prevent their interaction with GC base pairs, it also allows these nitrogen atoms to act as strong hydrogen bond acceptors for binding to the exocyclic amino group of C. 2APm also generates a triplet that is more isomorphous with the canonical triplets, in contrast to those formed by either T or C. Consistent with this, Py and 2APm produced triplexes that were equally stable and 4 °C more stable than T at pH 7.0, respectively. In addition, conversion of Py to a BNA derivative (PyB) led to a substantial enhancement in stability without altering its selectivity.
One of the first thymine analogues to be prepared for recognition of CG was 5-methyl-pyrimidine-2-one (4HT; Figure 1.7e), which lacks both the 4-carbonyl and 3-NH groups of T.163 As above, 4HT exploits the 3-nitrogen as a hydrogen bond acceptor for bonding to the N4 amino group of C. Interestingly, it also positions the 2-carbonyl to form an unconventional CH–O bond with the 5-hydrogen of C, an interaction similar to that previously observed for the N7I–GC triplet.140 Studies showed that 4HT had a decreased affinity for AT base pairs, while generating triplexes with a similar stability to those with T–CG at the same position. A further increase in affinity was observed with the 2′-aminoethoxy derivative of 4HT, producing a Tm increase of ca. 1.5 °C per substitution relative to T.164 Within the context of fully modified 2′-AE-RNA it has been used to recognise up to five separated CG interruptions at pH 6.5 with a 33% pyrimidine content in the target strand.165 It has also been used to bind to targets containing multiple contiguous CG inversions.166 However, the affinity of the analogue is dependent on sequence context, with the most stable triplexes generated when it is placed between thymine residues.
We have also developed a series of nucleobases for recognising CG inversions based on methylated 3H-pyrrolo[2,3-d]pyrimidin-2(7H)-one nucleosides (e.g., Figure 1.7f). The core of these structures maintains the hydrogen bonding motif of 4HT, whilst contributing extra base stacking via an additional aromatic ring between the 4 and 5 positions. The simplest analogue, containing a methyl group at the 6-position of the pyrrolo ring (MPP), generated stable triplexes when positioned opposite CG, with an increase of ca. 3 °C relative to a control triplex (Table 1.1).167 We have attempted to improve the affinity of this analogue further by attaching different groups to the 6-position that might make additional contacts across the major groove with the guanine of the CG base pair. The addition of either aminoethyl or aminopropyl groups (APP; Figure 1.7f) at this position resulted in moderate increases in Tm, which are probably caused by protonation of the pendant amine group and charge-stabilisation interactions, rather than hydrogen bonding (Table 1.1). We reasoned that this may be due to the flexibility of the chain and prepared various phenyl-modified derivatives that included attached amino, acetamido, ureido and guanidino groups, but again no dramatic enhancement in affinity was seen.168,169 Lastly, we have synthesised the 2′-aminoethoxy variant of the APP analogue, which resulted in a Tm increase of around 4.5 °C per substitution, though it has not yet been assessed in the context of a fully-modified 2′-AE-RNA, which may improve affinity further (Table 1.1).
Several other groups have designed analogues that can bind to cytosine while simultaneously interacting with the O6 and/or N7 acceptors of guanine. In two ambitious studies by Hari and co-workers more than twenty N,N-disubstituted cytosine derivatives were prepared for CG recognition.170,171 Unlike C, these derivatives were not expected to bind to GC base pairs, due to steric repulsion of substituents on the amino group, while retaining the use of the 3-nitrogen for recognition of cytosine. The 4-[(3S)-3-guanidinopyrrolidino]-5-methylpyrimidin-2-one variant (GPB; Figure 1.7g) exhibited the highest affinity for CG, while discriminating against GC, and its attachment to BNA increased its affinity to that of T for an AT base pair. In a further study the 3-deazacytosine equivalent of GPB showed improved selectivity, through the loss of the 3-nitrogen.172 In a similar fashion the Seidman group have investigated various N4-alkyl-5-methylcytosine derivatives, of which the N4-(2-guanidoethyl)-5-methylcytosine analogue was the best (gC; Figure 1.7h).173 Within the context of a 2′-O-methyl modification, this analogue generated a much more stable triplet than T–CG but was not as stable as T–AT. Various other N4 cytosine derivatives have been developed that utilise the exocyclic nitrogen for recognition of CG.174,175 The best of these is N4-(3-acetamidopropyl)cytosine (4PC; Figure 1.7i), which positions a side chain across the major groove, allowing the 3-amino group to form a hydrogen bond to the O6 carbonyl group of guanine. UV melting showed the triplet formed by this base to be more stable than C–CG but again less stable than the canonical triplets.
1.4.3.2 Other Heterocycles
Some of the first attempts to recognise CG base pairs exploited extended heteroaromatic nucleobases designed to make contact with both partners of the base pair. The first of these was 4-(3-benzamidophenyl)imidazole (D3; Figure 1.8a) which was designed to match the edges of a CG base pair.176 It was anticipated that the ring nitrogen of the imidazole moiety would form a single hydrogen bond to cytosine, while additional stacking interactions would be possible due to the presence of two aromatic rings positioned across the major groove. Rotational freedom between these two rings could maximise non-bonding interactions. Affinity cleavage experiments showed that D3 bound to CG and TA base pairs with greater affinity than to GC or AT. However, it was later shown that this nucleobase formed triplets that are less stable when they are flanked by C+–GC on the 3′-side. A subsequent NMR study showed that D3 lacked selectivity and intercalated into the adjacent YpR step, thereby skipping the inversion site.177 Two similar carbocyclic ribofuranose analogues, L1 and L2 were also developed, which exhibited a preference for binding at pyrimidine inversions, and are also thought to bind by an intercalative mechanism.178 More recently, a variety of imidazole and triazole heterocycles have been attached to BNA sugars and assessed for their ability to target pyrimidine interruptions. Oxazole (OB; Figure 1.8b) recognised CG slightly better than TA but generated a triplex that was less stable than that with T–AT in the same position,179 most likely through an interaction between the ring oxygen and the exocyclic nitrogen of cytosine. Among the triazole nucleobases examined, a 1-(4-ureidophenyl)triazole (TzB; Figure 1.8c) provided the best enhancement in affinity but again the triplexes were less stable than their unmodified counterparts.180
Chemical structures of various heterocycles for CG recognition. (a) D3; (b) OB; (c) TzB. For each structure R is deoxyribose unless otherwise stated in the main text.
1.4.4 Analogues for TA Recognition
The development of base analogues capable of recognising TA inversions is hampered by the presence of a methyl group at the 5-position of T. One strategy for overcoming this problem is to use a short linker that projects the analogue past the methyl group to allow recognition of the 4-carbonyl group of T. To date this has only been attempted within the context of PNA by the attachment of 3-oxo-2,3-dihydropyridazine (E; Figure 1.9a) via a β-alanine linker to the backbone. Triplexes containing this analogue exhibited a Tm increase of 5 °C relative to G when positioned opposite TA.181 An alternative strategy is to increase the stability of the G–TA triplet using guanine analogues but this has not been successful. For example, the addition of 2′-aminoethoxy groups to the nucleosides of guanine (AE-G) or 2-aminopurine, which is also capable of binding to guanine, decreased both the affinity and selectivity of the resultant triplets.166 We find that AE-G showed an enhanced affinity for GC base pairs, while having decreased affinity for TA base pairs (Table 1.1). We have also examined the influence of adding a propargylamino group to the 7-position of G (GPdG) but again this resulted in an increase in affinity for GC and not for TA base pairs (Table 1.1).
Chemical structures of base analogues for TA recognition. (a) E; (b) S; (c) DANac; (d) bPB. For each structure R is deoxyribose unless otherwise stated in the main text.
Greater success has been achieved with the use of extended heteroaromatic nucleobases capable of binding to both partners of the TA base pair. The first to be developed was the unnatural thiazolyl aniline monomer S (Figure 1.9b), which has the capacity for recognising both the 4-carbonyl of T and the 4- and 6-position nitrogen atoms of A.182 Experiments revealed that the S–TA triplet had a similar stability to that of the T–AT triplet but that S also generated an S–CG triplet that was only moderately less stable. We have also shown that the analogue S recognises CG as well as TA at low pH, with little or no discrimination between them, though it binds better to TA at higher pH. The interaction and selectivity are improved slightly by the addition of a 2′-aminoethoxy group (AE-S) to the sugar; one AE-S–TA triplet increases the Tm by 4–6 °C relative to G–TA, and it is 1–2 °C more stable than S–TA (Table 1.1).183 Although the selectivity problem could be attributed to an intercalative mode of binding, it has been suggested that this is not the case since the S–TA triplet was less stable when flanked by C+–GC triplets on either side.182 Altered specificity is more likely to originate from the rotational freedom of the linker attaching the base to the sugar and/or linking the two unfused ring systems. Rotations would allow a different hydrogen bond acceptor and donor to be presented. It was therefore suggested that analogues that are more conformationally rigid might improve discrimination between base pairs.
However, the first of these analogues (Bt) exhibited a lower affinity and selectivity, indicating that it favours binding by intercalation.184,185 In contrast, the analogue N-acetyl-2,7-diamino-1,8-naphthyridine (DANac; Figure 1.9c), which contains a fused ring system, was shown to recognise TA and CG with different affinities, albeit with a 2–3 °C difference in Tm.186 Lastly, the analogue 4-(3-benzamidophenyl)-2-pyridone (bPB; Figure 1.9d) was prepared and shown to exhibit a good affinity for TA, with a difference of 4 °C between its interactions with TA and CG at physiological pH.187 Since the nucleobase does not present the appropriate hydrogen bonding pattern, it is likely that it binds through intercalation.
1.4.5 Other Approaches
A different approach that allows recognition of both partners of a base pair has arisen from the design and synthesis of a series of 2-aminoquinolone and 2-aminoquinazoline C-glycoside bases (Figure 1.10). Unlike other triplex-forming oligonucleotides, these molecules, designated TRIPsides, are designed to bind symmetrically within the major groove, positioning the oligonucleotide backbone in the centre of the groove.188–190 In this strategy only the purine strand of the target is read, but because the backbone is located in the centre of the major groove, either strand can be recognised by choosing the appropriate TRIPside. The antiCG, antiTA and antiGC monomers have been used in combination and allowed the recognition of a 19-mer target site in which the purines switched from one strand to the other four times.189 The antiAT monomer has also been prepared but has yet to be characterised in combination with the others.191
Chemical structures of oligoTRIPsides. (a) AntiAT; (b) AntiGC; (c) AntiCG; (d) AntiTA. For each structure R is deoxyribose.
1.5 Towards Mixed Sequence Recognition at Neutral pH
It is clear that a large number of base, nucleoside and nucleotide analogues have been developed for improving the triplex-forming properties of TFOs but there are few examples in which these have been used in combination to target mixed DNA sequences under physiological pH and ionic conditions. The majority of studies have focused on triplex formation with oligopurine–oligopyrimidine sequences containing a single pyrimidine–purine inversion, or the use of a single cytosine or thymine analogue within the third strand. The first study to demonstrate the selective recognition of a duplex target containing three of the four base pairs was undertaken in the Leumann laboratory.165 Fully modified 2′-aminoethoxy RNA strands were prepared with T, MeC (Figure 1.6a) and 4HT (Figure 1.7e) for the recognition of AT, GC and CG base pairs, respectively. This combination was able to recognise up to five CG inversions in a 15-mer duplex target with high selectivity and good affinity under near physiological conditions. We have also prepared oligonucleotides that contain combinations of modified nucleosides for the recognition of a duplex target containing all four base pairs.192 For this we used BAU to target AT with high affinity, AP (Figure 1.6b) for recognition of GC base pairs at elevated pH, and APP (Figure 1.7f) and S (Figure 1.9b) for recognising CG and TA base pairs, respectively. With this combination we demonstrated triplex recognition at a 19-mer duplex target that contained four pyrimidine interruptions at neutral pH. Moreover, both footprinting and melting experiments demonstrated that this heavily modified oligonucleotide retained its sequence specificity and that changing a single base pair opposite any one of the modified nucleosides led to a large decrease in affinity.
The only exception was with the S analogue, which formed stable complexes opposite both TA and CG base pairs; however, this might be improved by using its 2′-aminoethoxy derivative, which shows better discrimination. More recently, the Sekine laboratory has targeted the same duplex sequence used in our study using an oligonucleotide prepared with different nucleoside modifications.186 2′-O-methyl modified s2T (Figure 1.2c) and s8A (Figure 1.2e) were used for recognising AT and GC base pairs, respectively, whilst gC (Figure 1.7h) and DANac (Figure 1.9c) were used for the recognition of CG and TA, respectively. Interestingly, their study demonstrated that the use of DANac improved the ability to discriminate between TA and CG base pairs, but the overall stability of the complex was lower than that of the triplex examined in our study, with a 10 °C drop in Tm for the complex under the same experimental conditions.
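The monomer assignments reported in these three studies amount to a lookup from base pair to third-strand residue. The following Python sketch is purely illustrative and is not a design tool from any of the cited laboratories: the function name `design_tfo`, the scheme labels and the example sequence are hypothetical. It assumes the convention used in this chapter that the third strand runs parallel to the purine-rich strand, so that reading A, G, C or T along that strand corresponds to an AT, GC, CG or TA base pair. It ignores everything that determines real affinity, such as pH, sugar modifications and neighbouring triplets.

```python
# Base pair -> third-strand monomer, as listed in the text for each study.
# The dictionary keys are hypothetical labels chosen for this sketch.
SCHEMES = {
    "leumann_ref165": {"AT": "T", "GC": "MeC", "CG": "4HT"},  # three of four base pairs
    "authors_ref192": {"AT": "BAU", "GC": "AP", "CG": "APP", "TA": "S"},
    "sekine_ref186": {"AT": "s2T", "GC": "s8A", "CG": "gC", "TA": "DANac"},
}

# Base read on the purine-rich strand -> base pair (assumed naming convention:
# the base on the purine-rich strand is written first).
BASE_TO_PAIR = {"A": "AT", "G": "GC", "C": "CG", "T": "TA"}


def design_tfo(purine_rich_strand, scheme):
    """Return the list of third-strand monomers, 5' to 3', for a target strand."""
    monomers = SCHEMES[scheme]
    tfo = []
    for position, base in enumerate(purine_rich_strand.upper(), start=1):
        if base not in BASE_TO_PAIR:
            raise ValueError(f"unexpected base {base!r} at position {position}")
        pair = BASE_TO_PAIR[base]
        if pair not in monomers:
            raise ValueError(
                f"scheme {scheme!r} has no monomer for the {pair} base pair "
                f"at position {position}"
            )
        tfo.append(monomers[pair])
    return tfo


if __name__ == "__main__":
    # Hypothetical target with two pyrimidine interruptions (C and T).
    target = "AGGACGATG"
    for name in ("authors_ref192", "sekine_ref186"):
        print(name, "-".join(design_tfo(target, name)))
```

With the three-monomer scheme, a T in the purine-rich strand (a TA inversion) raises an error, reflecting that no TA-recognising residue was reported for that combination.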
1.6 Outlook
Triplex-directed DNA recognition has been driven primarily by a desire to use TFOs for therapeutic applications, i.e., as gene-targeting agents for modulating gene expression. However, in recent years the number of studies investigating the therapeutic application of TFOs, as well as other DNA-recognition agents, has started to dwindle. Perhaps the field is being overshadowed by the continued success of CRISPR–Cas technologies, which offer the greatest potential for directing heritable change to the germline of a species. However, the ability to transiently influence gene expression by triplex formation should not be overlooked, since it offers an alternative, reversible approach for addressing faulty genes.3,4 Many of the base, sugar and/or phosphate modifications described in this chapter will be useful in this regard and several of these modifications are now commercially available, removing the requirement for expertise in phosphoramidite synthesis. Recent improvements in our understanding of the pharmacodynamic and pharmacokinetic properties of oligonucleotides, due in part to a re-emergence of antisense technologies, will also aid in their application.193 Triplex-directed DNA recognition is also starting to show promise in other fields, such as bionanotechnology, where it provides a means to introduce functionality into DNA nanostructures by exploiting the sequence addressability of duplex regions assembled by strand exchange.6,194,195
Original work in the authors’ laboratories was supported by BBSRC grants BB/J001694, BB/H019219, and BB/C004531.