
The cytochromes P450 constitute a superfamily of haem-thiolate enzymes which are ubiquitous in nature.1,2 Figure 1.1 shows the many fields in which P450s play important roles, thus highlighting their relevance to several branches of biological science. During the course of evolution, the P450 structure developed to bind the following entities: oxygen, carbon-based substrates, a haem group and redox partners, such as an iron-sulphur redoxin, an NADPH-dependent FAD- and FMN-containing flavoprotein reductase and cytochrome b5 (refs 1–6). In eukaryotic P450 systems, the enzyme is also able to bind to a membrane phospholipid bilayer, such as that present in the smooth endoplasmic reticulum, as summarised in Tables 1.1 and 1.2. Mitochondrial P450s, such as CYP11 in the adrenal cortex, retain some of the prokaryotic P450 characteristics in possessing an iron-sulphur redoxin (specifically adrenodoxin) as a redox partner rather than utilising an NADPH-dependent flavoprotein reductase.7–9 This finding suggests that the mitochondria and, indeed, other cell organelles may have had bacterial origins, as has been reported previously from RNA comparisons.10,11 Over 5000 P450 sequences have been reported (Osamu Gotoh, personal communication) and it appears that these enzymes are present in all five biological kingdoms. Consequently, P450s will have undergone modifications during the course of evolution in order to adapt to the changes in environmental and cellular conditions.

Figure 1.1

The fields of P450 research showing the various inter-relationships between them. A number of the above areas are inter-related and the scheme is not intended to be comprehensive but to provide a flavour of the breadth of the P450 field.

Table 1.1

Classification of P450-containing systems from various sources.9,42 

Components in electron transfer pathway | Type of system | Environment
 | Bacterial | Cytosolic
 | Mitochondrial | Membrane-bound
 | Microsomal | Membrane-bound
 | Bacillus megaterium-3 | Cytosolic

Notes: 1. In the Bacillus megaterium-3 system, the flavoprotein and haemoprotein domains are fused into a single polypeptide. 2. The FAD and FMN cofactors are generally bound into a single flavoprotein although exceptions exist where they can form separate proteins. 3. Systems involving fused redoxin and FMN domains are also known to exist, in addition to those where the redoxin and haemoprotein domains are fused. 4. CYP55 is an example of a P450 where there are in fact no redox partners involved in the catalytic system, which effects the reduction of nitric oxide to nitrous oxide.
Table 1.2

Regions of the P450 structure associated with binding various system components.1 

Component | Regions of the enzyme structure involved in binding component
Haem | I and L helices, two conserved basic residues and conserved tryptophan
Dioxygen | I helix ‘kink’ close to conserved distal threonine
Redox partner | Conserved basic residues close to the proximal haem face
Substrate | Substrate recognition site (SRS) regions involving the B′, F and I helices together with the β1(4) and β4(2) turn regions
Phospholipid | N-terminal peptide comprising the first 20–40 residues

Notes: 1. The invariant cysteine which ligates the haem iron is also involved with haem binding and oxygen activation. 2. Conserved acidic residue preceding the conserved distal threonine is involved in the catalytic cycle charge relay system. 3. A conserved tryptophan residue may also be associated with redox partner and electron transfer to the haem, together with a conserved phenylalanine residue proximal to the haem face. 4. A number of ion-pairs in the active site may be associated with proton transfer to the haem-bound oxygen, and a conserved acidic residue in the L helix could effect water ingress to the active site. 5. A number of interhelical loop regions may also be involved in the binding of P450 to the phospholipid membrane in the microsomal system.

P450 structures are known from X-ray crystallographic determinations12–32 and a list of these is provided in Table 1.3; surprisingly, there are relatively few differences across species ranging from bacteria to mammalia. Such differences include the orientations of the B′ and F helices, together with the extent of polypeptide in the loop regions between certain helical motifs, such as those of the F-G loop and the H-I loop. Also, there is an additional N-terminal peptide of some 40 residues which serves as a membrane anchor in eukaryotic P450s, and this is absent in prokaryotic sequences. Furthermore, it is not only in the primary sequence but also in the spatial orientation of secondary structural elements with respect to the haem moiety that P450s can vary. This may explain why homology models do not always correctly reproduce the active site topographies encountered in the actual crystal structures, although a relatively high homology (i.e. over 40%) should provide a fairly good match between homology models and crystal structures. For enzymes of the CYP2C family, for example, models of CYP2C8 and CYP2C9 based on the CYP2C5 template match closely with the actual crystal structures,33,34 and the relationship between degree of fit and sequence homology is explored further in a following section.
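The 40% figure quoted above can be checked for any template and target pair once the two sequences have been aligned. The following is a minimal Python sketch under stated assumptions: the function name `percent_identity` and the two toy fragments are hypothetical, the fragments are not real P450 sequences, and identity is counted only over columns in which both sequences carry a residue (other denominators are also in common use and give slightly different percentages).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical, pre-aligned toy fragments (NOT real P450 sequences).
template = "MKTAYIAK-QRLG"
target = "MKSAYLAKEQ-LG"

identity = percent_identity(template, target)
# 40% is the threshold quoted in the text for a "fairly good" model.
usable_template = identity > 40.0
print(f"identity = {identity:.1f}%, above 40% threshold: {usable_template}")
```

A percentage of this kind says nothing about where the differences lie, so a template that passes the threshold can still differ in the B′ and F helix orientations noted above.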

Table 1.3

Unique cytochrome P450 crystal structures (from Protein Data Bank, 2006)

CYP | Species | Resolution (Å) | Substrate | Protein Data Bank code
101A1 | Pseudomonas putida | 1.62 | camphor(a) | 2cpp
102A1 | Bacillus megaterium | 1.65 | N-palmitoylglycine | 1jpz
107A1 | Saccharopolyspora erythraea | 2.10 | 6-deoxyerythronolide B(a) | 1oxa
108A1 | Pseudomonas sp. | 2.30 | none present | 1cpt
119A1 | Sulfolobus solfataricus | 1.50 | none present(a) | 1io7
119A2 | Sulfolobus tokodaii | 3.00 | none present | 1ue8
121A1 | Mycobacterium tuberculosis | 1.06 | none present(a) | 1n4o
152A1 | Bacillus subtilis | 2.10 | myristic acid | 1izo
154A1 | Streptomyces coelicolor | 1.85 | none present | 1odo
154C1 | Streptomyces coelicolor | 1.92 | none present | 1gwi
158A2 | Streptomyces coelicolor | 1.50 | none present | 1slf
165A3 | Amycolatopsis orientalis | 1.70 | none present | 1lfk
167A1 | Sorangium cellulosum | 1.93 | epothilone B | 1q5d
175A1 | Thermus thermophilus | 1.80 | none present | 1n97
51A1 | Mycobacterium tuberculosis | 1.55 | estriol(a) | 1x8v
55A1 | Fusarium oxysporum | 1.00 | nitric oxide | 1jfb
2A6 | human | 1.90 | coumarin(a) | 1z10
2B4 | rabbit | 1.60 | none present(a) | 1po5
2C5 | rabbit | 2.10 | diclofenac(a) | 1nr6
2C8 | human | 2.70 | none present | 1pq2
2C9 | human | 2.00 | flurbiprofen | 1r9o
2D6 | human | 3.00 | none present | 2f9q
3A4 | human | 2.05 | none present(a) | 1tqn
(a) Another crystal structure has been reported for this enzyme which contains a bound inhibitor.

The tertiary structure of P450 possesses certain well-conserved ion-pairs (such as the ExxR motif) and other features (like the polyproline motif near the N-terminus) which determine its overall folding pattern, together with various hydrogen bonded, π-π stacking and hydrophobic contacts between amino acid residues, some of which are associated with haem and substrate binding. However, certain conserved basic residues on the surface of the enzyme tend to be utilised for the binding of redox partners. These charged surface residues are largely conserved across the superfamily for redox partner binding, and this can be illustrated by the example of putidaredoxin (an iron-sulphur redoxin) binding to P450cam,35  and also that of adrenodoxin binding to its reductase36  where it is apparent that the redoxin fits closely within a depression on the reductase surface. The structure of P450 resembles a triangular prism in overall shape with the haem moiety located approximately at the centre of one triangular face. The haem group lies in a depression, which is ideally suited, in terms of shape and complementary surface residues, for the binding of redox partners such as redoxins, cytochrome b5 and reductase.35  However, there are certain subtle distinctions between the entire binding regions of these various redox partners and the corresponding P450 enzymes8  which may explain why certain redox partners tend to bind at particular surface locations on the P450 enzymes.
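Short conserved signatures such as the ExxR motif mentioned above can be located in a sequence with a simple pattern search. The sketch below is illustrative only: `find_motif` and the toy sequence are hypothetical names and data introduced here, and a four-residue pattern will also occur by chance, so a hit is a candidate that still has to be confirmed against a multiple sequence alignment.

```python
import re


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched text) for every match, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets finditer report overlapping occurrences.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


# ExxR: glutamate, any two residues, then arginine.
EXXR = "E..R"

# Hypothetical toy sequence (NOT a real P450).
toy_sequence = "MAETLRLHPEVARLLG"

hits = find_motif(toy_sequence, EXXR)
for position, matched in hits:
    print(f"ExxR candidate at residue {position}: {matched}")
```

The same helper accepts any regular expression, so other signatures can be screened by changing the pattern string.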

It is generally thought that life emerged around 3.5 billion years ago, which is about a billion years after the Earth's formation (∼4.55 billion years ago) although it is possible that the development of life may have occurred even earlier.37 P450s are ubiquitous within living systems and the enzymes have been found in all biological kingdoms, although some prokaryotes, such as the bacterium Escherichia coli, do not appear to contain any P450 enzymes whatsoever. However, two P450s have been isolated and crystallised from the thermophilic prokaryotes Sulfolobus solfataricus (an archaeon) and Thermus thermophilus (a bacterium).35 These organisms are apparently able to exist in the hot sulphurous conditions encountered in the thermal vents of undersea volcanic fissures, which are thought to represent the likely environment for early life formation. Interestingly, the presence of iron and sulphur in such oceanic vents may have assisted in the generation of proteins requiring such elements including, for example, cytochrome P450 and its iron-sulphur ferredoxin redox partner.2 Figure 1.2 represents an abbreviated phylogenetic tree for certain P450s, and shows how this accords with the general development of terrestrial biota.

Figure 1.2

An abbreviated version of the P450 phylogenetic tree showing the parallel development of certain species of biota.


As the oxygen levels in the atmosphere began to increase around 2.5 billion years ago, protective systems would have developed to ensure species survival in a more aerobic environment, and eukaryotes are thought to have emerged about 2.1 billion years ago.38–40 Such systems would have afforded some protection from the deleterious effects of free oxygen, and it is also possible that an early role of P450 at this time may have been in the detoxification of O2 itself.41 In microbial species, P450s are involved in the biosynthesis of antibiotics and certain toxins, and in the generation of secondary metabolites.42 Such activities are currently being characterised, particularly with respect to the development of novel therapeutic agents via protein engineering of bacterial P450 enzymes, such as those from various Streptomyces species.42

With the development of metazoa,43  the endogenous roles of P450 changed to steroid biosynthesis, together with that of fatty acid and prostanoid/eicosanoid metabolism.44  When animal species started to colonise land areas in the Devonian period45  about 400 million years ago, plants developed toxins to deter animal predators, and many of these toxic compounds are known to be synthesised in part via P450-mediated pathways.44  However, it is thought that animal species began to develop new P450 enzymes specifically for the detoxification of these harmful plant products, and a phylogenetic analysis of P450s across the superfamily appears to support this viewpoint.46–52  Insect species also developed P450s with detoxifying roles. For example, the black swallowtail butterfly, Papilio polyxenes, possesses a specific P450 (CYP6B1) for metabolising the plant toxin methoxsalen (xanthotoxin), which itself may have been biosynthesised via the mediation of P450 enzymes.53  Furthermore, another insect-plant co-evolutionary role of P450 emerged when flowering plants appeared about 125 million years ago, as various P450s are known to be responsible for the biosynthesis of flower pigments such as the anthocyanins.54  It is thought, therefore, that the exogenous roles of P450s may have developed over a geological timescale via a co-evolutionary process, commonly termed plant-animal ‘warfare’ where animals developed certain xenobiotic-metabolising P450s to detoxify the deleterious effects of plant toxins which had been biosynthesised to deter animal predators.

The huge diversity of P450 functionality55 may, therefore, have arisen from the increase in atmospheric oxygen which was harnessed for oxidative metabolism and biosynthesis. P450 utilises the inherent chemical power of the dioxygen molecule and controls its activation via two consecutive reductions to peroxide, which probably represents the precursor for the active oxygen species that is inserted into carbon-based substrates.56 The development of P450's changing roles mirrors the evolution of terrestrial biota as indicated in Figure 1.2, and it should also be recognised that periodic mass extinctions57 have played their part in the rise of the mammalia and, eventually, of mankind itself. For example, the elaboration of the CYP2 family occurred after the major global extinction event at the end of the Permian period, approximately 250 million years ago.

As mentioned previously (vide supra), P450s are not found in E. coli and some other prokaryotes, although they are present in certain thermophiles.58 These may provide a possible clue to the origins of P450 in abyssal thermal vents where there would have been the relatively high concentrations of iron and sulphur required for the formation of haem-thiolate proteins, as well as the iron-sulphur redoxins which constitute the earliest form of P450 redox partner. The relatively high percentage and clustering of aromatic amino acid residues present in these thermophilic prokaryotic P450s (from Sulfolobus solfataricus and Thermus thermophilus) are thought to provide a likely explanation for the thermal and high-pressure stability of these enzymes,35 thus enabling them to tolerate such extreme environments as may have occurred extensively in early terrestrial prehistory. The substrates for these P450s remain to be determined but one can realistically assume that oxygen levels would have been relatively low during this stage of biological development, possibly at a fraction of 1%. When atmospheric oxygen levels eventually started to increase, perhaps P450 enzymes had some role in detoxifying oxygen before the development of other biological defence systems for dealing with reactive oxygen species (ROS).41 In evolutionary terms, it would appear that dioxygen was not in the abundance it is today for most of the development of biological systems.59 It is interesting to note that the only fungal P450 to have had its crystal structure determined is, in fact, a nitric oxide reductase (CYP55) from Fusarium oxysporum, which utilises only nitric oxide in a coupling reaction, without the use of oxygen, to form nitrous oxide.
However, the vast majority of P450 reactions (of which there are over 50 different types) involve splitting of the dioxygen molecule and subsequent monooxygenation of substrates,5,60,61 while the unusual P450 reactions include: reduction, desaturation, oxidative ester cleavage, ring expansion, ring formation, aldehyde scission, dehydration, isomerisation, ipso attack of aromatic rings, one-electron oxidation, coupling reactions, rearrangement of fatty acid and prostaglandin hydroperoxides, and oxidative deamination. The more common P450-catalysed reactions are: aliphatic and aromatic hydroxylation, N-hydroxylation, N- and S-oxidation, O-, S- and N-dealkylation, aromatisation, dehalogenation, dehydrohalogenation, epoxidation, deformylation and the reduction of nitro compounds, N-oxides, quinones, epoxides, azo compounds and certain halogen compounds. It is in the mammalian P450s that a wide diversity of reactions has been characterised and, to some extent, this is due to the relatively large number of P450s in such species, together with their various endogenous and exogenous roles.

In the human P450 complement, there is the well-documented situation known as genetic polymorphism, where genetic defects in individual P450s (mainly CYP2D6 and CYP2C19) can give rise to significant changes in metabolic capacity towards certain substrates, including a number of drugs in clinical use.62–73  Consequently, there is considerable current interest in the screening of novel compounds that are destined for human exposure, such that adverse drug reactions can be avoided.

The apoprotein in P450 performs a number of specific functions by virtue of its primary, secondary and tertiary structure; some of these will be described in the remaining sections of this chapter, and Table 1.2 provides a summary. It is evident that the unusual functionalities of P450 enzymes evolved during the course of biological development from prokaryotic to eukaryotic organisms. Being haemoproteins, P450s contain a haem prosthetic grouping which, in common with most haemoproteins, is able to bind dioxygen although, in haem-thiolate enzymes like P450, oxygen becomes activated via the unique properties of the thiolate ligand.74  In addition to the iron-sulphur covalent linkage, the haem moiety is bound by two conserved basic amino acid residues which form ion-pairs with the two haem propionate sidechains. A generally well-conserved tryptophan residue in the C helix, normally encountered in most mammalian P450s, is able to form a hydrogen bond with one of the haem propionate head-groups.1,2  Furthermore, the relatively planar haem group is bound in an essentially hydrophobic cleft within the P450 protein, formed by an intersection of the I and L helices, where a number of complementary residues enter into hydrophobic contacts with the haem structure. Also, a generally well-conserved serine residue in the C helix, located fairly close to the haem group, is thought to represent a phosphorylation site for initiating haem degradation (reviewed in reference 1).

The Fe2+/Fe3+ redox potential in P450 is modulated by substrate binding such that reduction of the enzyme by its redox partner is facilitated.75,76  In an ideal system like that of P450cam (CYP101), the binding of camphor makes the Fe2+/Fe3+ redox potential become less negative, such that P450cam lies on a potential gradient between those of NADH, putidaredoxin and dioxygen, as shown in Figure 1.3. This mechanism prevents reduction of P450cam in the absence of the camphor substrate and there is a degree of similarity to that encountered in other bacterial systems, together with most mammalian P450s, although the situation is not so clear-cut in the microsomal system.77  The binding of redox partners shows both similarities and differences between various P450s, depending on the type of system concerned.35  In P450cam (CYP101), a group of basic residues surrounding the proximal haem face appear to form electrostatic interactions with a complementary cluster of acidic sidechains in the iron-sulphur protein, putidaredoxin, which enables electron transfer from the Fe2S2 centre to the haem via the mediation of a C-terminal tryptophan residue in putidaredoxin.78  A well-conserved proximal phenylalanine in P450 is thought to facilitate electron transport to the haem group, possibly by aromatic π-π stacking interactions, and Phe350 in P450cam is an example of this type of grouping.35  This residue is present in both prokaryotic and eukaryotic P450s, whereas it is possible that a conserved tryptophan (vide supra) which is present in many, but not all, microsomal P450s could have undergone transference from the ferredoxin redox partner during the course of evolution. However, there is also evidence to suggest that this tryptophan plays a role in haem binding rather than solely acting as a conduit for electrons during the reduction process.36 

Figure 1.3

Redox potentials of various components in the P450cam system, including redox partners and oxygen species.


Interestingly, cytochrome b5 can also reduce P450cam (CYP101), and the way in which the b5 redox partner interacts with CYP101 probably resembles that of putidaredoxin. This finding is, therefore, suggestive of another evolutionary linkage between prokaryotic and eukaryotic P450 systems. Apparently, there is a coupling between the iron redox and spin-state equilibria in certain P450 systems75,76 which can, in certain instances, also exhibit correlations with both substrate binding affinity and metabolic rate of the P450-mediated reactions.79,80 This would indicate that the P450 system has developed over an evolutionary timescale with inbuilt mechanisms for coupling its various biophysical chemistry equilibria to the optimised metabolism of substrates, depending on the cellular environment.

Several basic residues on the surface of P450 are thought to be associated with the interaction of its reductase redox partner.8  Apart from those already discussed, which are situated in the proximal haem region, there is also a conserved surface lysine (situated between the K and K′ helices) that is present in both microsomal and some bacterial P450s (often forming a PKG motif) and which may serve as a point for ion-paired contact with flavoprotein reductase in the microsomal P450s and, possibly, for ferredoxin reductase in certain bacterial P450s.7,8  Similarly, adrenodoxin is able to bind to its reductase via charge-pair interactions,36  and there are certain similarities between the putidaredoxin binding interaction with P450cam and that of the FMN and haem domains in P450BM3,35  thus indicating that P450 redox partner interactions have adapted during the process of evolution by subtle ‘fine-tuning’ of key amino acids involved in these contact points.

One of the major differences between bacterial and microsomal P450s lies in the fact that the former are cytosolic whereas the latter are membrane-bound. There is an N-terminal ‘anchor’ peptide of between 20 and 40 residues in length, composed of essentially hydrophobic amino acids, which is thought to span the phospholipid bilayer in the microsomal P450 systems, and sequence comparisons indicate that this feature is absent in bacterial forms of the enzyme. A number of surface ‘loop’ regions that are present in microsomal P450s, as opposed to those from bacterial species, may also be associated with membrane phospholipid interactions.35 The binding of phospholipid is, therefore, achieved primarily by the additional stretch of N-terminal peptide relative to bacterial sequences, and this probably forms an essentially helical conformation which is thus able to span the entire thickness of a phospholipid bilayer.35 It is not clear to what extent the rest of the P450 is embedded in the membrane, although it is apparent that both lateral and transverse movement of the enzyme is possible, together with rotation of the P450 about an axis perpendicular to the plane of the membrane itself. Furthermore, there is evidence for either octameric or hexameric clusters of P450 subunits which may be orientated around a central reductase.81 The so-called ‘flip-flop’ or transverse motion of P450 within the microsomal membrane could be related to the conformational and/or flotation changes which may accompany substrate binding, thus leading to a preferred orientation of the enzyme for reduction by the flavoprotein reductase. It is thought that between 35 Å and 45 Å of the P450 structure is located above the microsomal membrane35,82 which implies that roughly 20% of the enzyme is embedded within the phospholipid itself.
Presumably, this situation would favour the binding of essentially lipophilic substrates via passage through the membrane bilayer, and it is clear that lipophilicity plays a role in substrate binding to microsomal P450s.83 The overall structure of a P450 such as CYP101 resembles a triangular prism with a side length of about 60 Å (ref. 15) and microsomal P450s are probably of a similar size. Consequently, one can imagine that around 10 Å of the P450 structure may be embedded within the smooth endoplasmic reticular membrane, and this would be consistent with the experimental findings of Schwarz and colleagues.81,84 Figure 1.4 compares the crystal structures of two bacterial P450s with one fungal and one mammalian form of the enzyme, thus showing the overall similarity between their tertiary folds.

Figure 1.4

Comparison between P450 crystal structures for CYP101, CYP102, CYP55 and CYP2C5, showing α-helical and β-sheet regions in red and blue, respectively. The haem groups are shown in yellow, whereas the substrates are magenta. The protein data base codes for these structures are given in Table 1.10.


Oxygen binding at the haem iron is effected via initial ingress at a ‘kink’ in the I helix, following an initial reduction of the substrate-bound complex, due to the relatively strong affinity of high-spin iron(II) for the electron-deficient dioxygen molecule, which is presumably in the triplet ground state. The distal haem face is ligand-free, following desolvation of the essentially hydrophobic active site region by ingress of the substrate molecule, and oxygen is known to bind to this position.19 The Fe(II)/Fe(III) redox potential of P450 is altered by the binding of substrate, becoming less negative, which thus facilitates reduction by its redox partner. Upon oxygen ligation to the reduced P450, it is thought that an electron is transferred from iron(II) to the dioxygen molecule which, presumably, then forms the superoxide anion, O2•−. The O2/O2•− redox potential is −160 mV under physiological conditions and, therefore, electron transfer from the reduced P450 is thermodynamically favourable when its redox potential is slightly more negative than that of the oxygen/superoxide couple (Figure 1.3). Residue changes in the oxygen-binding pocket of the distal I helix bring about uncoupling of the distal charge relay and this gives rise to P450s which exhibit unusual reactions, such as those encountered in allene oxide synthase (CYP74), thromboxane synthase (CYP5), prostacyclin synthase (CYP8), erythromycin synthase (CYP107A1) and nitric oxide reductase (CYP55) where the typical pattern of GG x D/ET (where x may be any residue) for the oxygen-binding pocket is fundamentally altered.85 This is, therefore, a further example by which P450 functionality can be ‘fine-tuned’ by subtle changes to the apoprotein.
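The thermodynamic argument above follows from the standard relation between free energy and the difference in redox potential, ΔG = −nFΔE, where ΔE is the acceptor potential minus the donor potential. The Python sketch below is a worked illustration, not a calculation from this chapter: the −160 mV value for the oxygen/superoxide couple is the one quoted in the text, whereas the −170 mV potential used for the reduced, substrate-bound P450 is an assumed illustrative number and is not read from Figure 1.3.

```python
FARADAY = 96485.332  # Faraday constant in C/mol


def electron_transfer_dg(e_donor_mv: float, e_acceptor_mv: float, n: int = 1) -> float:
    """Free energy change (kJ/mol) for moving n electrons from the donor to the acceptor couple."""
    delta_e_volts = (e_acceptor_mv - e_donor_mv) / 1000.0
    # Delta G = -n * F * Delta E; divide by 1000 to convert J/mol to kJ/mol.
    return -n * FARADAY * delta_e_volts / 1000.0


E_O2_SUPEROXIDE_MV = -160.0  # O2/superoxide couple, value quoted in the text
E_REDUCED_P450_MV = -170.0   # ASSUMED illustrative potential, slightly more negative

dg = electron_transfer_dg(E_REDUCED_P450_MV, E_O2_SUPEROXIDE_MV)
print(f"Delta G = {dg:.2f} kJ/mol")  # a negative value means transfer is favourable
```

Swapping the two arguments gives a positive ΔG, which is the unfavourable case in which the haem potential is less negative than that of the oxygen/superoxide couple.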

The ability to bind and metabolise a large number of substrates of diverse chemical class is a fundamental feature of P450 activity. However, it should be recognised that this functionality depends very much on the actual enzyme concerned. For example, the steroidogenic P450s which are involved in the biosynthesis of various steroid hormones (namely: CYP11, CYP17, CYP19 and CYP21, as shown in Figure 1.5) are exquisitely selective towards their specific substrates whereas CYP3A4, which constitutes a significant proportion (∼30%) of the human hepatic P450 complement, is able to metabolise structurally diverse xenobiotics of which over 1000 are currently known.86,87 In addition, there is the possibility of metabolic activation of chemicals in certain instances, depending upon the type of substrate and P450 enzyme involved, especially CYP1 and CYP2E.88 Endogenous roles of several drug-metabolising P450s have also been identified, such as CYP2C9 (refs 89 and 90), and it has been well established that the induction of many mammalian P450s is, in general, mediated by ligand interaction with various nuclear receptor proteins.91,92 The endogenous roles of P450, such as steroid biosynthesis, probably developed prior to exogenous functions although, in some cases, these may have occurred concurrently. The 57 human P450s exhibit a mixture of both exogenous and endogenous functionality, as summarised in Table 1.4, where the different substrate classes are shown.

Figure 1.5

Steroid hormone biosynthetic pathways in mammalia showing the distinction between conversions carried out in the mitochondria and endoplasmic reticulum.

Table 1.4

Human cytochrome P450 enzymes (57 in total) and their substrate classes.150 

Substrates | P450 enzymes involved
Fatty acids | 4A11, 4B1, 4F12, 2J2
Eicosanoids | 4F2, 4F3, 4F8, 5A1, 8A1
Vitamins | 24A1, 26A1, 26B1, 27B1
Steroids | 7A1, 7B1, 8B1, 11A1, 11B1, 11B2, 17A1, 19A1, 21A2, 27A1, 39A1, 46A1, 51A1
Xenobiotics | 1A1, 1A2, 1B1, 2A6, 2A13, 2B6, 2C8, 2C9, 2C18, 2C19, 2D6, 2E1, 2F1, 3A4, 3A5, 3A7

Notes: 1. Other known human P450s for which the functionalities remain to be established include: 2A7, 2R1, 2S1, 2U1, 2W1, 3A43, 4A22, 4F11, 4F22, 4V2, 4X1, 4Z1, 20A1, 26C1 and 27C1. 2. Some P450s (e.g. 1A2 and 1B1) could be placed in another class, such as steroid metabolism. 3. Over 75% of the human liver P450 complement is comprised of xenobiotic metabolising enzymes.
SubstratesP450 Enzymes involved

Notes: 1. Other known human P450s for which the functionalities remain to be established include: 2A7, 2R1, 2S1, 2U1, 2W1, 3A43, 4A22, 4F11, 4F22, 4V2, 4X1, 4Z1, 20A1, 26C1 and 27C1. 2. Some P450s (eg. 1A2 and 1B1) could be placed in another class, such as steroid metabolism. 3. Over 75% of the human liver P450 complement is comprised of xenobiotic metabolising enzymes.

 
Fatty Acids 4A11, 4B1, 4F12, 2J2 
Eicosanoids 4F2, 4F3, 4F8, 5A1, 8A1 
Vitamins 24A1, 26A1, 26B1, 27B1 
Steroids 7A1, 7B1, 8B1, 11A1, 11B1, 11B2, 17A1, 19A1, 21A2, 27A1, 39A1, 46A1, 51A1 
Xenobiotics 1A1, 1A2, 1B1, 2A6, 2A13, 2B6, 2C8, 2C9, 2C18, 2C19, 2D6, 2E1, 2F1, 3A4, 3A5, 3A7 

Essentially, the binding and selectivity of P450 substrates are both linked to the particular characteristics of the enzyme's active site region.93  It has been generally established that so-called Substrate Recognition Sites (SRSs) are present in all P450s, and these have been extensively mapped out using the techniques of site-directed mutagenesis94  following multiple sequence alignment studies for selected CYP2 family proteins;95 Table 1.5 shows an alignment of the six SRSs for the human CYP2 family. Furthermore, molecular modelling of various P450s, based on sequence homology with those for which the crystallographic coordinates are available, has proved to be extremely useful for understanding and exploring the structural determinants governing both selectivity and binding affinity towards their respective substrates and inhibitors.2,96–104  It is possible to estimate the binding affinity between P450s and their substrates using a variety of techniques, such as homology modelling (vide supra) and quantitative structure-activity relationships (QSARs). In addition, there are several docking programs available for orientating molecules within enzyme active sites and, as detailed later, we have found that AutoDock version 3.05 (references 105 and 106) gives satisfactory binding energies for human P450 substrates, together with reproducing the experimentally observed positions of metabolism in the majority of cases studied thus far.107 Table 1.6 shows that, for CYP2C9 substrates, there is a good correlation between AutoDock-calculated and experimental binding energies.
In addition, QSAR analysis of 90 human P450 substrates shows that the combination of substrate log P (where P is the octanol/water partition coefficient), the number of hydrogen bonds and the number of π-π stacking interactions formed on binding to the enzyme produces very good correlations with binding energy data, where the R values (correlation coefficients) obtained are about 0.98 (reference 100). Neural network analysis provides a useful means of examining the parametric contributions to P450 selectivity based on a number of substrate physicochemical quantities, and this technique has been found to give satisfactory results by correctly discriminating the enzyme involved for 95% of the human P450 substrates studied from a dataset of 64 compounds.108  There is evidence that the six SRS regions correspond with four areas of α-helical and two strands of β sheet structure in P450s, and protein sequence analysis from multiple alignment can indicate the likely amino acid residues for further examination via site-directed mutagenesis studies,94,109–112  such that enzyme selectivity determinants can thus be explored. Consequently, the combination of molecular modelling, site-directed mutagenesis and X-ray crystallography can be employed in the important exercise of understanding selectivity via building up substrate templates within the relevant enzymes’ active sites, which also tend to show good agreement with experimental findings on the known routes of P450-mediated metabolism.113,114 

Table 1.5

Alignment of SRS regions in CYP2 family enzymes showing substrate contact residues and SDMs

| CYP | Residues spanning SRS1 to SRS6 (in sequence order) |
| --- | --- |
| 2C5 | S99 V100 I102 L103 K108 A113 F114 N204 V205 L208 S209 P220 A221 A237 K241 D290 G293 A294 E297 T298 L359 N362 L363 N471 G472 F473 V474 |
| 2A6 | E103 Q104 T106 F107 K112 V117 F118 I208 F209 T212 S213 S224 S225 L241 E245 N297 I300 G301 E304 T305 I366 S369 L370 V478 G479 F480 A481 |
| 2B6 | K100 I101 M103 V104 R109 I114 F115 T205 F206 I209 S210 S221 G222 L238 N242 S294 F297 A298 E301 T302 L363 G366 V367 C475 G476 V477 G478 |
| 2C8 | N99 S100 I102 S103 K108 I113 S114 N204 F205 L208 N219 P220 L221 V237 R241 D293 V296 A297 E300 T301 V362 G365 V366 K474 G475 I476 V477 |
| 2C9 | I99 F100 L102 A103 R108 V113 F114 N204 I205 L208 S209 S220 P221 V237 K241 D293 G296 A297 E300 T301 L362 S365 L366 N474 G475 F476 A477 |
| 2C19 | H99 F100 L102 A103 R108 V113 F114 N204 I205 V208 S209 P220 T221 L237 E241 D293 G296 A297 E300 T301 I362 S365 L366 N474 G475 F476 A477 |
| 2D6 | P103 V104 I106 T107 F112 F120 L121 G212 L213 E216 S217 V227 P228 Q244 L248 D301 S304 A305 V308 T309 V370 G373 V374 F481 A482 F483 L484 |
| 2E1 | G101 L103 A105 F106 R110 I115 F116 N206 F207 L210 S211 P222 S223 V239 K243 D295 F298 A299 E302 T303 V364 N367 L368 I476 G477 F478 G479 |

Key: Italics = contact residue with substrates in either the crystal structure or model of the enzyme
Bold = SDM residue, including those performed on another enzyme in the same subfamily
* = corresponds to a change encountered in an allelic variant of the enzyme
SDM = site-directed mutagenesis

Notes: 1. CYP2C5, CYP2A6, CYP2C8, CYP2C9 and CYP2D6 have had their crystal structures determined. CYP2B6 and CYP2C19 have been modelled by homology with the CYP2B4 and CYP2C9 crystal structures, respectively. 2. The analogous residues in the CYP2C5 sequence are included for comparison as this enzyme has been used as a crystallographic template for modelling various human P450s from the CYP2 family, including CYP2E1. 3. The protein data bank codes for the relevant crystal structures investigated are as follows: 2C5 (1n6b and 1nr6), 2A6 (1z10), 2C8 (1pq2), 2C9 (1r9o) and 2D6 (2f9q). 4. The CYP2B6 enzyme was modelled from the 1suo (CYP2B4) crystal structure and CYP2C19 was constructed from the 1og5 (CYP2C9) crystallographic coordinates. 5. The CYP2E1 enzyme was produced via homology with the CYP2C5 crystal structure (1n6b) where the sequence identity is 59%.
Table 1.6

CYP2C9 substrates: comparison between experimental and calculated binding energies (kcal mol−1)

| Compound | Route of metabolism | Km (μM) | ΔGbind (expt) | ΔGbind (calc) |
| --- | --- | --- | --- | --- |
| 1. Diclofenac | 4′-hydroxylation | 13.11 (ave) | −6.9250 | −6.94 |
| 2. Warfarin (S) | 7-hydroxylation | 6.0 | −7.4070 | −7.22 |
| 3. Tienilic acid | 5-hydroxylation | 5.0 | −7.5193 | −7.57 |
| 4. Flurbiprofen (S) | 4′-hydroxylation | 1.9 | −8.1154 | −8.22 |
| 5. Ibuprofen (S) | 3-hydroxylation | 21 | −6.6353 | −7.14 |
| 6. Tolbutamide | 4-methyl hydroxylation | 82 | −5.7961 | −5.75 |
| 7. Phenytoin | 4-hydroxylation | 23.3 (ave) | −6.5712 | −6.47 |
| 8. Torasemide | methyl hydroxylation | 40 | −6.2383 | −6.69 |
| 9. Celecoxib | methyl hydroxylation | 3.5 | −7.7391 | −8.17 |
| 10. Mefenamic acid | 3′-methyl hydroxylation | | −7.3121 | −7.35 |

ave = average of several values available in the literature.
Correlation between experimental and calculated binding energies is as follows:
ΔGbind (expt) = 0.982 ΔGbind (calc); s = 0.243; R = 0.948; R2 = 0.90
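The figures in Table 1.6 can be checked numerically. The sketch below rests on two assumptions that are inferred from the numbers rather than stated in the table: that the experimental binding energies follow ΔG = RT ln Km, and that the temperature is 310 K. It also recomputes the slope of the no-intercept regression quoted beneath the table. The function and variable names are illustrative only.

```python
import math

R_KCAL_PER_MOL_K = 1.9872e-3  # gas constant, kcal mol^-1 K^-1
T_KELVIN = 310.0              # assumed temperature (37 degrees C); not stated in the table

# (compound, Km in micromolar, tabulated dG_expt, tabulated dG_calc) from Table 1.6.
TABLE_1_6 = [
    ("Diclofenac", 13.11, -6.9250, -6.94),
    ("Warfarin (S)", 6.0, -7.4070, -7.22),
    ("Tienilic acid", 5.0, -7.5193, -7.57),
    ("Flurbiprofen (S)", 1.9, -8.1154, -8.22),
    ("Ibuprofen (S)", 21.0, -6.6353, -7.14),
    ("Tolbutamide", 82.0, -5.7961, -5.75),
    ("Phenytoin", 23.3, -6.5712, -6.47),
    ("Torasemide", 40.0, -6.2383, -6.69),
    ("Celecoxib", 3.5, -7.7391, -8.17),
]
# No Km is listed for mefenamic acid, so only its energies are kept.
MEFENAMIC_ACID = (-7.3121, -7.35)


def dg_from_km(km_micromolar: float) -> float:
    """Assumed relation dG = RT ln(Km), with Km converted to mol/L."""
    return R_KCAL_PER_MOL_K * T_KELVIN * math.log(km_micromolar * 1e-6)


def slope_through_origin(x, y) -> float:
    """Least-squares slope b for the no-intercept model y = b * x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


if __name__ == "__main__":
    for name, km, dg_expt, _ in TABLE_1_6:
        print(f"{name:17s} RT ln Km = {dg_from_km(km):8.4f}   table = {dg_expt:8.4f}")
    expt = [row[2] for row in TABLE_1_6] + [MEFENAMIC_ACID[0]]
    calc = [row[3] for row in TABLE_1_6] + [MEFENAMIC_ACID[1]]
    print(f"slope (expt vs calc, through origin) = {slope_through_origin(calc, expt):.3f}")
```

Under these assumptions the recomputed energies agree with the tabulated experimental column to within about 0.001 kcal mol−1, which suggests, but does not prove, that this is how the column was derived.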

These various findings serve to build up an emerging picture of the P450 active site regions that are responsible for substrate binding and selectivity, particularly with regard to human drug metabolism. The quality of homology models can be evaluated by comparison with the recently determined crystal structures of CYP2C8, CYP2C9, CYP2A6, CYP2D6 and CYP3A4, where the average root-mean-square distance (rmsd) between matched α-carbons on the model and crystal structure displays a well-defined relationship (Figure 1.6) with primary sequence identity relative to the CYP2C5 template, as described in a following section; Table 1.7 provides the relevant data. Although at first sight the relationship appears to be approximately linear, careful analysis reveals that it is, in fact, exponential in nature, with a function which is similar to that reported by Lesk and Chothia115  for the core residues of various protein structures, including those of haemoproteins.

Figure 1.6

A plot of root-mean-square distance (rmsd) between matched α-carbon atoms versus percentage sequence identity for several mammalian P450s of known crystal structure, using the data presented in Table 1.7.

Table 1.7

Percentage sequence identity and Rmsd values for mammalian P450s

| CYP model | Sequence identity (%) | Rmsd (Å) | Crystallographic target | Crystallographic template |
| --- | --- | --- | --- | --- |
| 2A6 | 51.9481 | 2.4749 | 2A6 | 2C5 |
| 2B4 | 20.2299 | 4.7034 | 2B4 | 102 |
| 2C8 | 75.7576 | 1.4291 | 2C8 | 2C5 |
| 2C9 | 77.7056 | 1.5044 | 2C9 | 2C5 |
| 2D6 | 42.3913 | 2.7742 | 2D6 | 2C5 |
| 3A4 | 24.3478 | 4.4240 | 3A4 | 2C5 |

Rmsd = average root mean square distance between matched α-carbon atoms in the fitted structures.
Sequence Identity = the percentage primary sequence identity between target and template P450s based on the matched residues involved.
Equations relating Rmsd with Percentage Sequence Identity for the six structures shown above are as follows:

1. Rmsd = 0.056 (100 − % identity) (±0.005); R = 0.983
2. % identity = 98.166 − 17.135 Rmsd (±1.785); R = 0.979
3. log Rmsd = 0.009 (100 − % identity) − 0.045 (±0.0004); R = 0.996
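The linear and logarithmic equations given with Table 1.7 can be compared directly against the tabulated rmsd values. The sketch below assumes that "log" denotes the base-10 logarithm; this is an inference, made because base 10 reproduces the tabulated values reasonably well whereas the natural logarithm does not. The function names are illustrative.

```python
# (sequence identity in %, rmsd in angstroms) from Table 1.7.
TABLE_1_7 = {
    "2A6": (51.9481, 2.4749),
    "2B4": (20.2299, 4.7034),
    "2C8": (75.7576, 1.4291),
    "2C9": (77.7056, 1.5044),
    "2D6": (42.3913, 2.7742),
    "3A4": (24.3478, 4.4240),
}


def rmsd_linear(identity_percent: float) -> float:
    """Equation 1: Rmsd = 0.056 (100 - % identity)."""
    return 0.056 * (100.0 - identity_percent)


def rmsd_exponential(identity_percent: float) -> float:
    """Equation 3, assuming log10: log Rmsd = 0.009 (100 - % identity) - 0.045."""
    return 10.0 ** (0.009 * (100.0 - identity_percent) - 0.045)


def max_abs_error(model) -> float:
    """Largest absolute deviation of a model from the tabulated rmsd values."""
    return max(abs(model(ident) - rmsd) for ident, rmsd in TABLE_1_7.values())


if __name__ == "__main__":
    for cyp, (ident, rmsd) in TABLE_1_7.items():
        print(f"{cyp}: table {rmsd:.3f}  linear {rmsd_linear(ident):.3f}  "
              f"exponential {rmsd_exponential(ident):.3f}")
    print(f"max error, linear:      {max_abs_error(rmsd_linear):.3f}")
    print(f"max error, exponential: {max_abs_error(rmsd_exponential):.3f}")
```

On these six structures the exponential form gives the smaller worst-case deviation, which is in line with its higher correlation coefficient.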

Substrate selectivity (Table 1.8) for the human drug-metabolising enzymes is somewhat complex as some of these enzymes exhibit distinct, but overlapping, selectivities. However, one is now in a position to make certain definitive statements regarding this important area, especially given the current interest in high-throughput screening programmes for the development of new chemical entities (NCEs) that are destined for human exposure. In this respect, it is now well established that drug metabolism represents a pipeline ‘bottleneck’ for NCE development, and many of the challenges to those engaged in this endeavour appear to centre around pharmacokinetic issues related almost entirely to P450-mediated pathways. Table 1.9 provides a listing of selective human P450 substrates for the major drug-metabolising P450s.

Table 1.8

Characteristics of human drug-metabolising P450 substrates

| CYP | Types of substrates | Average log P | Typical substrate | log P |
| --- | --- | --- | --- | --- |
| 1A2 | Planar heterocyclic amines and amides | 2.01 | MeIQ | 1.98 |
| 2A6 | Fairly small-sized molecules | 1.44 | Losigamone | 1.46 |
| 2B6 | Basic, medium-sized molecules | 2.54 | Bupropion | 2.54 |
| 2C8 | Acidic, fairly large-sized molecules | 3.38 | Rosiglitazone | 3.20 |
| 2C9 | Acidic, medium-sized molecules | 3.20 | Naproxen | 3.18 |
| 2C19 | Medium-sized amines and amides | 2.56 | Proguanil | 2.53 |
| 2D6 | Basic, medium-sized molecules | 3.08 | Propranolol | 3.09 |
| 2E1 | Structurally diverse small molecules | 2.07 | 4-Nitrophenol | 2.04 |
| 3A4 | Structurally diverse large molecules | 3.10 | Nifedipine | 3.17 |

Notes: 1. Molecular size (Mr) distinguishes CYP2E1 (small) and CYP3A4 (large) from other substrate classes (medium). 2. Molecular planarity (area/depth²) distinguishes CYP1A2 and CYP2E1 (high) from CYP2B6 (low) relative to other substrate classes (medium). 3. Compound acid/base character (pKa) distinguishes CYP2C9 (acidic) and CYP2D6 (basic) from other substrate classes, such as CYP2A6 and CYP2E1 (neutral). 4. Number of fused aromatic rings distinguishes CYP1A2 substrates from other substrate classes, and this is related to molecular planarity. 5. The log P value is related to a combination of molecular size and compound polarity, which is a function of the number of hydrogen bond donor and acceptor atoms. The average log P for a relatively large number of substrates (∼16 for each enzyme) is close to that of a typical substrate in each case. P, octanol/water partition coefficient; MeIQ, 2-amino-3,4-dimethylimidazo[4,5-f]quinoline.
Table 1.9

Selective human CYP substrates.86,153 

CYP1A2 
Antipyrine 
Caffeine 
Clozapine 
Clomipramine 
Imipramine 
Paracetamol (acetaminophen) 
Phenacetin 
Propranolol 
Tacrine 
Theophylline 
 
CYP2A6 
Coumarin 
Nicotine 
Cotinine 
Fadrozole 
SM-12502 
Losigamone 
4-Nitroanisole 
2,6-Dichlorobenzonitrile 
Quinoline 
Indole 
 
CYP2B6 
7-Benzyloxyresorufin 
7-Ethoxy-4-trifluoromethylcoumarin 
Deprenyl 
Testosterone 
Benzphetamine 
Bupropion 
PNU 249173 
Cinnarizine 
7-Ethoxycoumarin 
Arteether 
 
CYP2C9 
Antipyrine 
Diclofenac 
Phenytoin 
S-Warfarin 
Tolbutamide 
R-Naproxen 
Tienilic acid 
Ibuprofen 
Flurbiprofen 
Mefenamic acid 
 
CYP2C19 
Amitriptyline 
Citalopram 
Clomipramine 
Diazepam 
Imipramine 
Mephobarbital 
Omeprazole 
Proguanil 
Propranolol 
S-Mephenytoin 
 
CYP2D6 
Aprindine 
Encainide 
Mexiletine 
Propafenone 
Metoprolol 
Propranolol 
Perphenazine 
Haloperidol 
Thioridazine 
Zuclopenthixol 
Codeine 
Dextromethorphan 
Dihydrocodeine 
Ethylmorphine 
Hydrocodone 
Tramadol 
Fluoxetine 
Paroxetine 
Amitriptyline 
Clomipramine 
Desipramine 
Imipramine 
N-Desmethylclomipramine 
Nortriptyline 
Trimipramine 
Maprotiline 
MDMA 
Debrisoquine 
Sparteine 
 
CYP2E1 
Chlorzoxazone 
Enflurane 
Halothane 
Paracetamol (acetaminophen) 
Salicylic acid 
Benzene 
Ethanol 
Dimethylnitrosamine 
4-Nitrophenol 
Dapsone 
 
CYP3A4 
Amiodarone 
Lidocaine 
Propafenone 
Quinidine 
Ifosfamide 
Tamoxifen 
Toremifene 
Vinblastine 
Alprazolam 
Diazepam 
Midazolam 
Triazolam 
Diltiazem 
Felodipine 
Nifedipine 
Verapamil 
Cortisol 
Ethynyloestradiol 
Testosterone 
Carbamazepine 
Clomipramine 
Cyclosporin A 
Erythromycin 
Imipramine 
Omeprazole 
Proguanil 
Terfenadine 

The substrate selectivities for the eight human P450s which comprise the major drug-metabolising enzymes appear to be largely determined by compound molecular size (Mr), acid/base characteristics (pKa) and the ability to form hydrogen bonds with the enzyme, which relates to the number of hydrogen bond acceptors and donors in the molecule (NHBA+D). These three structural/physicochemical parameters are able to provide a 95% concordance with enzyme selectivity for a dataset of 64 compounds using a neural network analysis approach.108  This is because pKa is sufficient to discriminate CYP2C9 (acidic) and CYP2D6 (basic) substrates, whereas Mr differentiates substrates of CYP3A4 (large) and CYP2E1 (small) from the remaining compound set. The utilisation of the third descriptor, NHBA+D, is more subtle and facilitates distinction between those P450s which exhibit a preference for polar molecules, such as CYP2A6. Figure 1.7 presents the structures of a number of CYP2C9 substrates orientated such that their sites of metabolism and specific binding groups would reinforce if the molecules were superimposed, and a template of such compounds is able to fit within the CYP2C9 active site such that there are common interactions with key amino acid residues. For more precise definition of selectivity determinants, it is necessary to employ some form of active site modelling or undertake an examination of superimposed substrate templates, and these two procedures can be readily combined by building up matched structures of typical substrates within the binding sites of crystal structures and good quality molecular models of human P450s.
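The qualitative discrimination described above (pKa separating CYP2C9 and CYP2D6 substrates, Mr separating CYP3A4 and CYP2E1 substrates) can be sketched as a simple decision rule. This is an illustration only and is not the neural network analysis cited: the size cut-offs are hypothetical parameters supplied by the caller, the order in which the rules are applied is an arbitrary choice, and the NHBA+D descriptor is deliberately left out because the text gives no rule for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    mr: float        # relative molecular mass of the compound
    acid_base: str   # "acidic", "basic" or "neutral", as judged from pKa


def suggest_enzyme(d: Descriptors, small_mr: float, large_mr: float) -> str:
    """Crude rule-of-thumb assignment; cut-offs are caller-supplied assumptions."""
    # Acid/base character is tested first (an arbitrary ordering choice).
    if d.acid_base == "acidic":
        return "CYP2C9"
    if d.acid_base == "basic":
        return "CYP2D6"
    # Molecular size then separates the largest and smallest neutral compounds.
    if d.mr >= large_mr:
        return "CYP3A4"
    if d.mr <= small_mr:
        return "CYP2E1"
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical cut-offs and hypothetical compounds, for illustration only.
    SMALL, LARGE = 200.0, 400.0
    examples = [
        Descriptors(300.0, "acidic"),
        Descriptors(300.0, "basic"),
        Descriptors(600.0, "neutral"),
        Descriptors(100.0, "neutral"),
        Descriptors(300.0, "neutral"),
    ]
    for d in examples:
        print(d, "->", suggest_enzyme(d, SMALL, LARGE))
```

A rule of this kind cannot separate enzymes with overlapping preferences (for instance, the acidic substrates of CYP2C8 and CYP2C9 in Table 1.8), which is where a third descriptor and a trained model become necessary.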
The so-called SRS regions95  of CYP2 family enzymes have remained a target for site-directed mutagenesis experiments on mammalian P450s for some time (reviewed by Domanski and Halpert94 ) and it is possible to show that there is a match-up between SDMs in the SRSs of CYP2 family P450s and known substrate contacts encountered in the crystallographically resolved three-dimensional structures. Table 1.5 presents a summary of this information, from which it can be appreciated that there are key positions for amino acids determining substrate selectivity in the CYP2 family.

Figure 1.7

A comparison between typical substrates of CYP2C9 showing structural similarities, sites of metabolism (↑) and key interacting groups. Hydrogen bond donor/acceptor atoms are also marked (*). Bu refers to the butyl group, with the n superscript representing normal butyl and t the tertiary butyl form.


In the P450 catalytic cycle, as shown in Figure 1.8, most intermediary stages have now been fairly well characterised,19,56,116–122  and crystal structures are now available for the substrate-free, substrate-bound, reduced and oxygen-bound forms and for the oxene intermediate, albeit only for the P450cam (CYP101) system. CYP101 from the bacterium Pseudomonas putida is probably one of the most extensively characterised P450 enzymes and, consequently, it is the one for which most biophysical chemistry is known, from the employment of a variety of biochemical, physical and theoretical techniques, including, for example, Mössbauer spectroscopy.78,116,123 

Figure 1.8

The P450 catalytic cycle showing key intermediates at the haem locus and the various changes in the iron redox- and spin-states. The two reducing equivalents (2H+, 2e−) are supplied by NADPH and mediated via P450 oxidoreductase (flavoprotein) containing FMN and FAD co-factors. There is considerable debate about the precise nature of the active oxygen species which actually performs the final stage of substrate oxygenation. An iron-bound peroxy species and an iron oxene intermediate represent the two main candidates at present, and there is evidence for both of these.


From a consideration of the bond lengths for the haem moiety ligands in the CYP101 crystal structures, it would appear that the dioxygen-bound state probably comprises superoxide coordinated to iron(III), whereas the iron-oxo intermediate (sometimes referred to as an iron-oxene) may be considered as a single oxygen atom bound to iron(III), although there are several ways of formulating this species,119  such as [Fe=O]3+, and some of these can also involve higher oxidation states of iron such as Fe(IV) or even Fe(V) species. However, it is clear that the cysteine ligand, present in the thiolate form, plays a crucial role in the P450 catalytic cycle, where it is thought to be important for activation of the dioxygen complex following the initial reduction of the substrate-bound enzyme.74  Standard tabulations of atomic and covalent radii agree well with the observed Fe–S and Fe–O distances encountered in the crystal structures of various P450s, although it is apparent that the Fe–S bond length changes somewhat during the course of the reaction cycle, this being largely dependent upon the electronic state of iron; it may also be affected by the presence of inhibitors which are able to ligate the haem iron, such as carbon monoxide.124  A summary of the P450 haem geometry is presented in Figure 1.9.

Figure 1.9

Structural geometry of the haem moiety in P450 showing Fe–S distances at key stages in the catalytic cycle, as obtained from crystallographic studies. 1. A, B, C and D designate the nomenclature of the four pyrrole rings as viewed from above the distal haem face. 2. The Fe–S bond length changes during the course of the catalytic cycle, this being dependent upon the redox and spin-states of the iron atom. 3. In general, the change from low-spin (LS) to high-spin (HS) increases the Fe–S distance slightly, and Fe(III) generally shows a shorter Fe–S bond than for Fe(II). 4. The effect of the changes in iron spin-state on the haem geometry usually involves a movement of the iron atom out of the porphyrin ring plane and towards the cysteine sulphur when iron is in the high-spin state due to its increased ionic radius. 5. The Fe–N distances tend to lie close to a value of 1.95–2.05Å with only slight variations between different P450s and within the reaction cycle itself. 6. The Fe–O bond lengths in the dioxygen-bound and ‘oxene’ complex structures indicate the presence of bound superoxide and a bare oxygen atom, respectively, on the basis of the relevant atomic and covalent radii. Moreover, the O–O bond distance of 1.253Å in the dioxygen complex is consistent with the formation of a superoxide moiety where the O–O bond length is 1.26Å. 7. Haem binding is achieved by an Fe–S covalent bond with the invariant cysteine1 sulphur atom, together with two ionic interactions with conserved basic residues for the propionate head groups, and various hydrophobic contacts with complementary residues within the I and L helices which effectively ‘sandwich’ the haem group.


As P450-mediated catalysis can be viewed as the consecutive two-stage reduction of dioxygen to, firstly, superoxide and then peroxide, there is an opportunity for the generation of deleterious reactive oxygen species (ROS) if uncoupling of the cycle occurs such that the substrate is not oxygenated fully. Although most P450 oxygenations can be mechanistically formulated in terms of the electrophilic [Fe=O]3+ species, there is evidence for iron-peroxo and/or iron hydroperoxo complexes as being the active oxygenating intermediates in certain cases, especially where nucleophilic attack is required to concord with experimental observations.56  The first reduction is generally regarded as the rate-determining step, although this stage is clearly dependent upon the initial binding of substrate to low-spin iron(III) in the resting state of the enzyme. Evidence is strong for the rapid, high-affinity binding of dioxygen following the first reduction, and it is likely that an equilibrium exists between iron(II)dioxygen and iron(III)superoxide, with protonation of superoxide representing a pathway favouring the latter state. The second reduction is then thought to occur, and formation of iron(III) as in the previously discussed equilibrium would favour this, with a subsequent equilibrium state emerging between iron(II)superoxide (possibly protonated) and iron(III)peroxide, for which evidence also exists. Probably the iron(III)peroxo species will readily become protonated to form the iron(III)hydroperoxy intermediate, and this latter state has been postulated as possessing both electrophilic and nucleophilic characteristics which would appear to be ideal for many P450 oxygenations.56  However, the loss of a water molecule following a second protonation step yields the iron(III)oxo species which is currently favoured as the prime candidate for the actual oxygenating species in most P450-mediated reactions and, mechanistically, this would appear to be most likely.118,119 
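The sequence of intermediates described in the paragraph above can be written down as an ordered data structure, which makes the bookkeeping of the two electrons and two protons explicit. This is a minimal sketch: the stage labels paraphrase the text, the class and field names are illustrative, and the equilibria and alternative oxygenating species discussed above are collapsed into a single linear path.

```python
from typing import NamedTuple


class Stage(NamedTuple):
    label: str        # name of the intermediate
    iron_state: str   # formal description of the haem iron
    next_event: str   # step that converts this stage into the next one


# Linearised cycle, following the order of events given in the text.
CYCLE = [
    Stage("resting enzyme", "Fe(III), low-spin", "substrate binding"),
    Stage("substrate-bound enzyme", "Fe(III)", "reduction (first electron)"),
    Stage("reduced enzyme", "Fe(II)", "dioxygen binding"),
    Stage("dioxygen complex", "Fe(II)-dioxygen / Fe(III)-superoxide",
          "reduction (second electron)"),
    Stage("peroxo complex", "Fe(III)-peroxide", "protonation (first proton)"),
    Stage("hydroperoxo complex", "Fe(III)-hydroperoxy",
          "protonation (second proton) with loss of water"),
    Stage("oxo complex", "iron-oxo", "substrate oxygenation and product release"),
]


def count_events(keyword: str) -> int:
    """Count the steps whose description starts with the given keyword."""
    return sum(1 for stage in CYCLE if stage.next_event.startswith(keyword))


if __name__ == "__main__":
    for number, stage in enumerate(CYCLE, start=1):
        print(f"{number}. {stage.label} [{stage.iron_state}] -> {stage.next_event}")
    print("electrons consumed:", count_events("reduction"))
    print("protons consumed:", count_events("protonation"))
```

If the cycle is interrupted after either reduction step without oxygenating the substrate, the partially reduced oxygen is released instead, which corresponds to the uncoupling route to reactive oxygen species mentioned above.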

In a series of impressive papers, Shaik and colleagues have shown via ab initio molecular orbital calculations that the postulated active oxygen intermediate, namely the iron(III)oxo species, is a remarkably stable entity that is also capable of performing an energetically favourable oxygenation of selected substrate molecules (reviewed in Shaik and De Visser121). In fact, the work of Sligar and colleagues19  on the crystallographic determination at atomic resolution of various stages in the P450cam (CYP101) catalytic cycle indicates the possible formation of an iron-oxene state, and these coworkers have also reported the isolation of the dioxygen-bound complex of reduced P450cam which may be regarded as the iron(III)superoxide, as discussed previously. However, others have demonstrated that an iron-peroxo species may be capable of carrying out P450-mediated oxygenations in some cases,125–127  although the consensus view is that the iron-oxo species is probably the most common active oxygen intermediate in P450 oxygenations, and this is certainly consistent with the likely reaction mechanisms of many P450-mediated conversions.61,128 

The redox potentials in P450 systems are important to the progress of the catalytic cycle, whereby substrate binding shifts the E° value of the iron(II)/iron(III) couple to a less negative value, thus enabling reduction and electron transfer by the redoxin, reductase or cytochrome b5 redox partner, depending on the type of system under consideration (reviewed in reference 1). In this way P450 becomes optimally positioned for the establishment of a smooth potential gradient from NADPH/NADP (or NADH/NAD) to oxygen/superoxide, irrespective of whether one examines the bacterial, mitochondrial or microsomal P450 systems (see Figure 1.3 for the P450cam redox pathway). Although the FAD/FMN combination in flavoprotein reductase is a better electron transfer mediator than the iron-sulphur cluster arrangement in redoxins, it is found that the bacterial systems are generally somewhat faster in terms of catalytic turnover than are the microsomal systems, and this may be due to the fact that the bacterial P450s are cytosolic whereas the microsomal P450s are membrane bound (reviewed in references 1 and 2). Presumably, the phospholipid in the endoplasmic reticulum system is able to modulate the rate of electron transfer which is likely to be rate-determining. However, the fused P450 system in P450bm3 (CYP102) which contains both a flavoprotein and haem domain is significantly faster (turnovers are in the order of 4000 min−1) in performing substrate oxygenations than, say, P450cam where the redox partner is the iron-sulphur protein putidaredoxin.36,129,130  The relatively rigid arrangement of this fused P450 system would ensure a facile electron transfer between the various cofactors involving the FAD→FMN→haem electron transport pathway in CYP102 relative to that encountered in the microsomal system.36 

The molecular electrostatic potential energy of P450 surface residues may change during the course of the catalytic cycle, especially at those points where substrate binding, redox partner interaction and electron transfers occur. This may well have an important bearing on the mechanism by which the actual reduction process itself is effected, including a rationale for the triggering of redox partner binding at the appropriate stage in the catalytic cycle. Desolvation of the P450 active site by incoming substrate tends to shift the redox potential of P450 to a less negative value, which facilitates the first reduction step, as already discussed. Apparently, there is a clear relationship between percentage haem exposure and iron(II)/iron(III) redox potential (E° value) in haemoproteins, including P450, and this also correlates with oxygen affinity in terms of values for other series of haemoproteins.131 

The initial binding of a substrate molecule will desolvate water from the active site region, thus altering the electrical characteristics of the assembly due to the exposure of ion-pairs and other charged residues from the electrically insulating effect of bound water molecules, which essentially act like a dielectric layer within the haem environment. There is a generally conserved acidic residue close to the invariant cysteine haem ligand which appears to control water access to the active site via a hydrogen-bonded network, so ‘activation’ of this residue by loss of water from the distal haem face possibly acts like a ‘switch’ to turn on the electric field gradient surrounding the proximal haem face, thus attracting either reductase or cytochrome b5 in the microsomal system for triggering reduction of the iron.12  It is feasible that the extent of desolvation at the haem locus, which accompanies substrate binding, may be proportional to the substrate-bound redox potential of the P450 and, hence, the rate of the first reduction process itself. This may help to explain the dependence of the overall rate of metabolism upon substrate lipophilicity which is found in certain series of P450-mediated oxygenations. Although there are other factors involved, when comparing rates for compounds in a congeneric series one can frequently find that compound lipophilic character, in the form of either log P or log D7.4, shows a reasonable correlation with the metabolic rate constant when expressed logarithmically. However, electronic factors are also likely to play key roles in determining the relative rate of P450-mediated reactions.

The enzyme changes its conformation upon substrate binding, as determined by comparison between bound and free P450s. A structural fit between substrate-bound and substrate-free CYP2C5 indicates which residues and general regions of the structure are mainly affected by substrate binding. Figure 1.10 shows the root-mean-square distance (rmsd) value plotted against residue number, with the various SRS regions95  indicated for completeness. It is apparent that only a few residues alter their position markedly upon the binding of a typical substrate, although it is likely that the accompanying conformational change triggers the binding of the redox partner such that reduction is effected, thus leading to dioxygen activation and insertion of a single oxygen atom into the substrate molecule. Orientation of the substrate within the active site and, specifically, relative to the haem will determine the site of metabolism. In this respect, the spatial disposition of a relatively small number (∼6) of key active site amino acid residues is crucial for determining the ultimate course of metabolism and some of these amino acids can also play a role in defining the substrate selectivity of the enzyme. Thus, although most residues are largely conserved across the superfamily, those within the SRSs may differ markedly between individual P450s where they operate in directing the nature of preferred substrates and, ultimately, determine their sites of oxygenation.
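The per-residue comparison described above can be sketched in a few lines of Python. This is only a minimal illustration: it assumes that the two structures have already been superimposed and that their α-carbon coordinates are supplied as matched lists. The function names and the toy coordinates are hypothetical and are not taken from the CYP2C5 data.

```python
import math

def per_residue_deviation(ca_free, ca_bound):
    """Distance (Å) between matched α-carbons of two superimposed structures."""
    if len(ca_free) != len(ca_bound):
        raise ValueError("coordinate lists must be matched residue by residue")
    return [math.dist(a, b) for a, b in zip(ca_free, ca_bound)]

def overall_rmsd(deviations):
    """Root-mean-square of the per-residue distances."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

def peak_residues(deviations, threshold, first_residue=1):
    """Residue numbers whose deviation exceeds the chosen threshold (Å)."""
    return [i + first_residue for i, d in enumerate(deviations) if d > threshold]

# Toy coordinates (hypothetical, for illustration only).
free = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound = [(0.0, 0.0, 0.0), (3.8, 3.0, 4.0), (7.6, 0.0, 0.0)]
dev = per_residue_deviation(free, bound)
print(dev)                      # per-residue distances
print(overall_rmsd(dev))        # single summary value for the whole chain
print(peak_residues(dev, 2.0))  # residues that moved by more than 2 Å
```

Plotting the list of deviations against residue number would give a profile of the kind described for substrate-bound versus substrate-free structures; the 2 Å threshold used here is an arbitrary choice.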

Figure 1.10

A plot of root-mean-square distance (rmsd) (Å) versus residue number for substrate- and inhibitor-bound CYP2C5 relative to that of the substrate-free structure. The locations of the substrate recognition site (SRS) regions are shown, together with amino acid residues which correspond to the peak rmsd positions showing movement on substrate binding. DMZ = dimethyl-2-phenyl-2H-pyrazol-3-ylbenzenesulphonamide.


Three-dimensional structural models of P450s can be obtained via X-ray crystallography and these are generally recognised to be of immense value in understanding the way in which P450s operate. Homology models may also be constructed from a suitable crystallographic template, and the likely level of accuracy in such experiments is essentially determined by the level of sequence identity between template and target sequences. Until the advent of the rabbit CYP2C5 crystal structure, the most promising template for mammalian microsomal P450s was the haemoprotein domain of the unique bacterial P450 from Bacillus megaterium (CYP102) which shares a degree of similarity with many membrane-based eukaryotic P450s in the possession of an FAD/FMN-containing flavoprotein reductase redox partner (reviewed in reference 1). However, the CYP2C5 crystallographic coordinates have proved to be a significant advance on CYP102 as a structural template for homology modelling of human P450s associated with drug metabolism, especially those of the CYP2 family where the sequence identities are greater than 40%.33,34  By and large, it is reasonable to assume that a molecular model, built via homology with a high-resolution crystal structure, can be regarded as a satisfactory paradigm comparable to a crystallographic determination, provided that the sequence identity is greater than about 30%. For example, a plot of root-mean-square distances (rmsd) versus primary sequence identity (%) for matched α-carbons between several mammalian P450 crystal structures and homology models (as shown in Figure 1.6) provides a clear indication of this situation, where there is a strong logarithmic relationship (R = 0.99) between rmsd and sequence homology.

There are now 24 unique P450 crystal structures of which 16 are from bacterial species, one is of a fungal form, and seven are from mammalian species, of which two are rabbit enzymes and five are of human P450s associated with drug metabolism. The overall tertiary structures of these enzymes show a generally well-conserved common core (as shown in Figure 1.4 for selected structures and Table 1.10) which is presumably essential for the basic functionality of P450, irrespective of the species and tissue source. However, there are clearly certain differences between prokaryotic and eukaryotic P450 structures which reflect the changes in redox partner and cellular location, namely: cytosolic, in the case of bacteria, and membrane-bound for mammalia. For example, there is an additional sequence of about 20–40 residues at the N-terminus of mammalian P450s which is not found in bacterial P450s, and this is thought to represent a membrane-anchor peptide that probably spans the phospholipid bilayer via an essentially α-helical structure.35  In general, the prokaryotic P450s require an iron-sulphur redoxin as a redox partner, whereas most mammalian P450s utilise an FAD- and FMN-containing flavoprotein reductase to supply reducing equivalents to P450. Exceptions exist, however, as mitochondrial P450s resemble those from bacterial sources in possessing redoxins as redox partners and, in addition, at least one prokaryotic P450 (i.e. CYP102 from Bacillus megaterium) is known to employ a reductase domain directly linked to its haemoprotein portion,36  and this is thought to be a main factor in explaining the exceptionally high turnover number exhibited by this enzyme.

CYP    Resolution (Å)    PDB code    Dimensions    Secondary structural content
101    1.63              2cpp        60Å × 30Å     45% α-helix, 15% β-sheet
102    2.70              1fag        54Å × 35Å     48% α-helix, 12% β-sheet
2C5    2.10              1nr6        60Å × 40Å     47% α-helix, 10% β-sheet
55     1.00              1jfb        63Å × 48Å     45% α-helix, 10% β-sheet

Rmsd and identity matrix for matched P450s

Identity % / Rmsd (Å)    101     102     55      2C5
101                      –       18.9    30.8    23.5
102                      17.2    –       27.6    28.7
55                       26.6    36.0    –       18.1
2C5                      30.6    33.1    15.9    –

PDB, Protein Data Bank; Rmsd, root-mean-square distance.

It is possible to construct three-dimensional molecular models of P450s based on homology with a suitable crystallographic template.108,132,133  The majority of the unique P450 crystal structures (24 in total) stem primarily from bacterial sources, whereas six are from mammalian species (including human) and one represents a fungal form (P450nor). In the latter instance, the P450-catalysed reaction is nitric oxide reduction, which employs no oxygen in the overall process whereby two NO molecules are combined to form nitrous oxide. Over 5000 P450s are currently known from sequence determinations, and it is apparent that considerable variety exists in the P450 complement of different species. For example, there are 18 P450s in Streptomyces coelicolor and 33 in Streptomyces avermitilis, whereas 57 human P450 enzymes are known; the exact functions of some of these still remain to be determined.134  The well-conserved structural core of P450 ensures that homology modelling is likely to yield useful results, particularly where sequence identity is relatively high (i.e. greater than 30%). In fact, a consideration of the overall three-dimensional structures of P450s from bacteria to mammalia (Figure 1.4) shows that the changes observed are not particularly large, as can be appreciated by an observed rmsd of 5.9 Å between CYP102 and CYP2C5 from their structural overlay based on the sequence homology of matched residues (Table 1.10).

For human P450 models constructed from CYP2C5, a rabbit form and the first mammalian P450 to have its X-ray structure determined, there is an approximately linear relationship between goodness-of-fit (measured as average rms distances between matched α-carbons in model and crystal structure) and primary sequence identity expressed as a percentage. The relevant data are provided in Table 1.7 and a graph of the corresponding relationship is shown in Figure 1.6. For example, the CYP2C8 and CYP2C9 models exhibit very good agreement with crystallographic coordinates, whereas that of CYP3A4 is relatively poor, and these findings are commensurate with the differences in sequence identity or homology. The details of the procedures utilised and the overall methodology for homology modelling are beyond the scope of this chapter and will only be outlined here, although the reader is referred to a recent publication by de Groot and coworkers108  for more information. Briefly, the modelling process involves the generation of a satisfactory sequence alignment, followed by residue replacement, deletion, loop insertion, poor residue contact alleviation and then full energy minimisation using a combination of molecular mechanics and molecular dynamics simulations to explore the conformational flexibility of the enzyme at physiologically relevant temperatures.108  The overall degree of confidence in X-ray crystal structures is shown in Table 1.11, where some of the known P450 structures are listed in certain categories to indicate the likely quality of these determinations.

Table 1.11

Confidence(a) in structural features of proteins determined by X-ray crystallography.154

Resolution                       3Å      2.5Å    2Å      1.5Å
Chain tracing                    Fair    Good    Good    Good
Secondary structure              Fair    Good    Good    Good
Sidechain conformations          –       Fair    Good    Good
Orientation of peptide planes    –       Fair    Good    Good
Protein hydrogens visible        –       –       –       Good
P450 crystal structures          2D6     2C9     2C5     101
conforming to each range(b)              2C8     2A6     119
                                         108     3A4     102

(a) The estimates of confidence in the structural features are approximate and it should be appreciated that they actually depend strongly on the crystallographic data.

(b) Corresponding P450 crystal structures which are representative of each category.

The fundamental questions facing P450 modellers and other researchers in, for example, the pharmaceutical industry are: what gives rise to substrate selectivity, what directs the course of a compound's metabolism and how can this be predicted, together with which methods represent ways of achieving a quantifiable estimate of binding, catalytic rate and overall clearance values for drug metabolism? In recent years, several of these key aspects of P450 structure, functionality and mechanism have been addressed with varying degrees of success. Homology modelling from P450 crystal structures, both bacterial and mammalian, has facilitated significant exploration of the molecular determinants for substrate selectivity, and has also enabled further insights into the recognition process for substrates and their orientation for metabolism by complementary active site amino acid residues. It is clear that certain key active site residues play important roles in the molecular recognition process and, in some cases, modulate the catalytic functionalities of P450 enzymes. Modelling has the powerful quality of conveying a large quantity of molecular structural information simultaneously into the visual field to provide considerable intuitive support for the observer engaged in the discovery process. It is often a challenge, however, to assimilate such information, and to decide on the appropriate procedures for achieving further understanding of the structure-function relationships involved.

The in silico techniques involved in constructing three-dimensional models of P450s by homology can be outlined108  as follows:

  1. Alignment of the template and target sequences

  2. Replacement of relevant amino acid residues in the crystallographic template

  3. Deletion and insertion of residues, where the latter task requires loop searching of the protein databank (PDB)

  4. Alleviation of unfavourable steric contacts produced by the residue replacement process, and

  5. Energy minimization of the raw structure via molecular mechanics and molecular dynamics.
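Since the reliability of a homology model is tied to the sequence identity between template and target, the quantity produced by step 1 can be illustrated with a short Python sketch. This is a hedged example rather than the procedure used in the chapter: the function name is hypothetical, the aligned fragments are invented, and identity is defined here as identical residues divided by aligned (non-gap) positions, which is only one of several conventions in use.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percentage of aligned (non-gap) positions with identical residues."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    matched = identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # positions that would need deletion or loop insertion
        matched += 1
        if a == b:
            identical += 1
    if matched == 0:
        raise ValueError("no aligned residue pairs")
    return 100.0 * identical / matched

# Hypothetical aligned fragments (not real P450 sequences).
target = "MKT-LLVAG"
template = "MRTALL-AG"
identity = percent_identity(target, template)
print(round(identity, 1))
```

Gapped positions are skipped in the count because they correspond to the residues handled by deletion or loop insertion in step 3 rather than by simple replacement in step 2.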

Although there are several variations on this general scheme, it is apparent that such an approach can yield useful material for further study, including the use of site-directed mutagenesis94  for analysing selected amino acids in order to test out various hypotheses surrounding their role in enzyme activity. Table 1.12 gives a summary of active site and substrate template volumes for a number of human P450s involved in drug metabolism, and there is a good level of concordance between these findings.

Table 1.12

Active site and substrate template volumes (Å³) for high-quality P450 structures

CYP     Active site         Substrate template    Minimised energy (kcal mol−1)    Comments
2A6     340                 345                   −1419.660                        Refined crystal structure
2B6     260 (2B4 value)     564                   −1308.117                        Model constructed from CYP2B4(a)
2C8     1386                1013                  −1204.880                        Refined crystal structure
2C9     1137                519                   −1285.690                        Refined crystal structure
2C19    1110                806                   −1479.431                        Model constructed from CYP2C9(b)
2D6     540                 N/A                   −1143.472                        Refined crystal structure
3A4     1438                1264                  −1431.405                        Refined crystal structure
3A5     1438 (3A4 value)    1282                  −1387.338                        Model constructed from CYP3A4(c)

Notes: 1. In general, the substrate template size is within the volume limitations of the active site. In the case of CYP2C9, however, there is a very tight overlay of substrates which are highly superimposable, thus leading to a significantly lower volume than that available in the haem environment. 2. The minimised energy values, produced after 200 iterative cycles of molecular mechanics, indicate that low energy stable conformations have been obtained from relaxation of the raw structure.

(a) 4-Chlorophenylimidazole-bound structure (pdb code: 1suo).

(b) Warfarin-bound structure (pdb code: 1og5).

(c) Metyrapone-bound structure (pdb code: 1wof).

It is now possible to make direct comparisons between homology models and actual crystal structures of human P450s, and the results show that the homology modelling process is indeed a valid method for constructing P450s.34  As might be expected, there is a clear correlation between goodness-of-fit (rmsd) and primary sequence identity when one compares models with the relevant crystallographic coordinates. Table 1.7 shows the results for a number of such determinations, from which it can be readily appreciated that the logarithm of rmsd is linearly related to sequence identity when this is expressed as 100% identity. The overall relationship bears some degree of similarity with that reported for other protein structures and their sequences,115,135  and it is thought that most amino acid substitutions tend to preserve the overall protein fold.136  The individual rmsd placements along the entire protein sequence for CYP2C9 are presented as Figure 1.11, which compares the actual crystallographic data with those of a homology model constructed from the CYP2C5 template. This indicates that there are certain peaks along the sequence where the rmsd value is relatively large, although the general rmsd values are reasonably low. The peaks in rmsd are probably due to the effects of substrate binding on the conformation of the enzyme, and it is interesting to note that the high rmsd values all occur in SRS regions. For the automated docking of substrates or inhibitors within the active sites of refined crystal structures and homology models, we have found that the AutoDock software105,106  gives satisfactory agreement with experimental binding energies137  obtained from literature values for Km, KS or Ki (Lewis and Ito, unpublished results).

Figure 1.11

Root-mean-square distance (rmsd) profile for the CYP2C9 crystal structure and a homology model of the enzyme based on the CYP2C5 crystallographic template, where rmsd (Å) is plotted against residue number. Rmsd is the distance between matched α-carbons in the superimposed structures. The substrate recognition sites (SRS) regions are shown, together with those residues corresponding to the peak positions on the graph.


It is important to emphasise that it is usually necessary to repair and refine the raw crystal structures prior to any further modelling studies. This is due to the fact that the original crystal structure requires initial relaxation using molecular mechanics, because various changes will have occurred during the crystal formation process as a result of packing forces. Furthermore, it is frequently necessary to add or remove certain amino acid residues from the structure such that the enzyme conforms to the wild-type, because residues are often changed to aid the crystallisation process. Thermal fluctuation of certain lengths of peptide may have resulted in an inability to resolve the entire structure satisfactorily, and these regions are thus omitted from the crystallographic coordinates. Occasionally, the backbone has been sufficiently resolved but some of the sidechain conformations could not be determined. Again, these need to be added and then the entire structure energy minimised to provide a refined version. Such good quality models and refined high-resolution crystal structures, where there is a good degree of confidence in the information, represent satisfactory starting points for further study and it is often particularly helpful to have a bound inhibitor or substrate present in the enzyme's active site, as this offers an insight into the likely mode of binding for other selective compounds. In the sphere of bacterial P450s,42  for example, where there is current interest in utilising the modified enzymes in the targeted biosynthesis of antibiotics, it has been possible to employ the same homology modelling procedures outlined above for constructing certain P450s sequenced from Streptomyces coelicolor. In particular, it has been shown that one can readily produce, from the CYP154C1 crystallographic template, the 3-D structures of CYP105D5 (present in S. coelicolor) and CYP105M1 (present in S. clavuligerus) that are involved in the biosynthesis of actinorhodin and clavulanic acid, respectively (Lewis and Avignone-Rossa, unpublished results).

Molecular dynamics (MD) is an important technique for exploring the conformational space of biologically important macromolecules, such as proteins and nucleic acids, at physiologically relevant temperatures. Although computationally intensive, it is possible to utilise Newton's equations of motion to describe the likely behaviour of proteins such as P450 over realistic timeframes (nanoseconds) and in the presence of solvent molecules. MD has been employed to compare the conformational fluctuations in human P450 crystal structures and homology models, to show that there are close analogies between these two forms of 3-D structure in that, essentially, both represent a different starting conformation of the same protein. MD is a useful procedure for testing the validity of newly constructed protein models in order to derive measures of confidence in the energy-minimised structures obtained from molecular mechanics. It has been found that the Amber Force Field is most appropriate for running molecular dynamics simulations on proteins, and this is the one used to investigate the quality of homology models by comparing MD simulations with those carried out on crystal structures. In fact, it is found that the behaviour of both models and X-ray structures of P450s under dynamics is essentially the same, thus providing some degree of confidence in the homology modelling technique.
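The way in which MD propagates Newton's equations of motion can be illustrated with a toy integrator. The sketch below uses the velocity Verlet scheme, a common choice in MD codes; the chapter does not state which integrator was employed, so this is an assumption. A single particle on a one-dimensional harmonic spring stands in for the protein, and the function name, units and parameters are hypothetical rather than taken from the Amber force field.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m*a = F(x) for one 1-D particle; returns [(x, v), ...]."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update from averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory

# Toy harmonic "bond" in arbitrary units (hypothetical parameters).
k, m = 1.0, 1.0
traj = velocity_verlet(x=1.0, v=0.0, force=lambda x: -k * x,
                       mass=m, dt=0.01, n_steps=1000)
energy = [0.5 * m * v * v + 0.5 * k * x * x for x, v in traj]
print(max(energy) - min(energy))  # spread in total energy along the trajectory
```

A small spread in total energy along the trajectory is one simple check that the time step is adequate. Production simulations of a solvated protein involve many thousands of atoms, a full force field and temperature control, none of which is represented here.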

In addition to MM and MD procedures, there are several types of quantum mechanical (QM) approach available involving molecular orbital (MO) calculations of electronic structure which can be applied to the P450 system. In particular, it has been reported that Density Functional Theory (DFT) represents the most appropriate method for performing MO calculations on systems containing transition elements, and several groups of coworkers have studied the various stages in the P450 catalytic cycle using such procedures.117,121  Although the calculations are computationally intensive, it has been possible to explain much of the catalysis of this system via DFT, albeit only on the haem-thiolate moiety.

It has long been known that the lipophilic character of a substrate has a bearing on P450 activity.138–145  This relationship has been quantified for many of the important drug-metabolising enzymes, and it is clear that there is a relatively straightforward relationship between compound lipophilicity, obtained from experimental log P values, and substrate binding affinity expressed in terms of either Km or KS values.83  For example, with substrates of CYP2B6, it is possible to discern a strong linear relationship between log P and −log Km (Figure 1.12), where conversion to the corresponding ΔG values enables estimation of the average binding energy component which is unrelated to compound lipophilicity. For these data, it appears that the average binding energy, presumably a combination of π-π stacking and hydrogen bond interactions, is −2.4 kcal mol−1, thus indicating that there is a common hydrogen-bonded and π-π stacked interaction with substrates in the active site region, and this is confirmed by molecular modelling studies based on the CYP2B4 crystal structure. Similar findings have been observed for other P450s, although there are often outliers and/or secondary parallel lines for different substrates which suggest that additional active site interactions are present, since the difference in intercepts indicates either an extra hydrogen bond (−2 kcal mol−1) or a π-π stacking (−1 kcal mol−1) interaction being formed between substrate and the enzyme's active site region. The combination of molecular modelling and site-directed mutagenesis techniques is particularly useful for investigating such possibilities. Consequently, theoretical studies represent important in silico probes for pursuing investigations of the structure and function of P450s, especially where they complement experimental findings and thus provide an explanation of the observed facts. However, many potential structural descriptors exist146,147  and, therefore, extensive QSAR analysis should lead to further insights into P450 structure and function towards specific substrates and inhibitors.
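The conversion from Km to ΔG mentioned above can be sketched as follows, using ΔG = RT ln Km. Several assumptions are made here that are not stated in the chapter: Km is treated as an approximation to a dissociation constant expressed in mol/L, the temperature is taken as 310 K, and the function names and the example Km are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def delta_g_from_km(km_molar, temperature=310.0):
    """Apparent binding free energy (kcal/mol), dG = RT ln(Km), Km in mol/L."""
    if km_molar <= 0:
        raise ValueError("Km must be positive")
    return R_KCAL * temperature * math.log(km_molar)

def km_fold_change(delta_delta_g, temperature=310.0):
    """Factor by which Km changes for a given change in dG (kcal/mol)."""
    return math.exp(delta_delta_g / (R_KCAL * temperature))

print(delta_g_from_km(1e-5))  # dG for a hypothetical Km of 10 micromolar
print(km_fold_change(-2.0))   # effect on Km of one extra -2 kcal/mol interaction
```

Under these assumptions each unit of −log Km corresponds to roughly 1.4 kcal/mol, so an intercept difference of the size attributed to one extra hydrogen bond would shift Km by a little more than one order of magnitude.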

Figure 1.12

Lipophilicity relationship for selected CYP2B6 substrates where −logKm is plotted against log P for 16 compounds. Km is the Michaelis constant for CYP2B6-mediated metabolism and P is the octanol/water partition coefficient.


One of the features of the P450 superfamily of enzymes is its wide diversity of substrates, possibly approaching a million separate compounds, and from most chemical classes of organic molecules. Moreover, P450 enzymes display altered substrate selectivities across the superfamily. In some cases, there is exquisite selectivity bordering on specificity towards a single compound, as in the example of the mammalian steroid hormone biosynthetic pathway (Figure 1.5) showing that many of the stages whereby androgens and oestrogens may be synthesised from cholesterol are catalysed by P450s of the CYP11, 17, 19 and 21 families. This highly regulated and well-coordinated pathway for the synthesis of steroids is an example of how tightly engineered enzyme systems have evolved during the early stages of metazoan development for physiologically important hormonal functions. In addition to steroid biosynthesis, endogenous P450 substrates include: fatty acids, eicosanoids, prostanoids and certain vitamins, such as vitamin D3, together with some of their precursors. The known substrates of human P450 enzymes are presented in Table 1.9, which shows that certain forms are primarily associated with xenobiotic metabolism although some of these are also known to metabolise endogenous compounds such as the steroid hormones (Figure 1.5). It is thought that these may have originally evolved to provide homeostatic mechanisms for maintaining hormone levels at certain stages of the organism's development, such as for hormonal imprinting in the neonate.148  The substrate selectivity of the xenobiotic-metabolising P450s, however, represents a broad and complex area, especially with respect to species differences, even those between rat, mouse and human, for example.149 

It is possible to rationalise human P450 substrate selectivity and routes of metabolism based on the evidence from mammalian crystal structures and the three-dimensional models derived from them. The current homology models show very satisfactory correspondences with the recently reported human P450 crystal structures, and it is now possible to derive expressions for the goodness-of-fit in terms of rmsd values (where rmsd is the root mean square distance between matched α-carbons) and percentage sequence identity for the fitted structures, as discussed previously. The results imply that an overall homology of 40% and above between target and template sequences should produce models which can be expected to compare favourably with crystallographic findings.115,135  The relationship between rmsd and percentage sequence homology can be formulated as follows for six known mammalian P450 crystal structures:

[Equation image not reproduced: linear expression relating rmsd to percentage sequence homology]
where the correlation coefficient of 0.98 indicates the strength of this linear relationship. Careful analysis of the data reveals that the relation is actually exponential and, consequently, one can derive a logarithmic expression which gives slightly better statistical significance, with a correlation coefficient of 0.99 for six points, corresponding to CYP2A6, CYP2C8, CYP2C9, CYP2D6, CYP3A4 and CYP2B4 (see Figure 1.6).
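The comparison between a linear and a logarithmic fit can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names (`rmsd`, `pearson_r`, `compare_fits`) are hypothetical, the coordinates are assumed to be already superimposed, and the logarithmic form is assumed to mean regressing ln(rmsd) on percentage sequence identity, which the text does not specify. The numbers in the demonstration are synthetic placeholders, not the values for the six crystal structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square distance between matched alpha-carbon positions.

    Assumes the two structures are already superimposed and that the
    coordinate lists are in matching residue order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally long coordinate lists")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def compare_fits(identity_percent, rmsd_values):
    """Return (r_linear, r_log) for rmsd vs identity and ln(rmsd) vs identity."""
    r_linear = pearson_r(identity_percent, rmsd_values)
    r_log = pearson_r(identity_percent, [math.log(v) for v in rmsd_values])
    return r_linear, r_log


if __name__ == "__main__":
    # Synthetic placeholder data following an exact exponential decay;
    # these are NOT the values reported for the P450 crystal structures.
    identity = [40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
    distances = [3.0 * math.exp(-0.02 * x) for x in identity]
    r_lin, r_log = compare_fits(identity, distances)
    print(f"linear r = {r_lin:.4f}, logarithmic r = {r_log:.4f}")
```

When the underlying relation is exponential, the log-transformed regression gives a correlation coefficient of larger magnitude than the straight-line fit, which is the pattern described above for the six mammalian structures.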

The structure of P450s has remained essentially the same over the time course of biological evolution, with only minor variations to enable membrane binding for accommodating the move from prokaryotes to eukaryotes, and to allow for the occupancy of extreme environments such as those encountered in thermal oceanic vents. The many classes of reaction types, and huge chemical diversity of substrates, have been engineered over the evolutionary timescale by alteration of certain key amino acid residues whilst maintaining a generally conserved tertiary fold. Inherent conformational flexibility in certain P450s has provided the xenobiotic-metabolising forms with the possibility of accepting differently sized substrate molecules (e.g. in CYP3A4 and CYP2C8), whereas those P450s with specific endogenous functions exhibit extremely narrow substrate selectivities (e.g. CYP11, CYP17, CYP19 and CYP21). Over twenty unique P450 structures have been resolved crystallographically, mostly from bacteria, although several from mammalian species have recently become available, thus reflecting our anthropocentric viewpoint. Examination of further evolutionary relationships would necessitate the X-ray crystal structures of P450s from other animalia such as birds, reptiles, fish and amphibia, although much can be deduced from protein sequence comparisons. P450 has probably been the most extensively studied enzyme system known, as a considerable battery of spectroscopic and other physical techniques has been applied in an effort to understand its structure and function.1,2  There have also been extensive biochemical and molecular biological studies which, when complemented by robust in silico tools such as QM/MM/MD calculations, provide an intriguing insight into the unique biophysical chemistry of this most versatile of Nature's catalysts.

Although the crystal structures of only 24 unique P450s are now available out of over 5000 enzymes known from sequencing, it is apparent that, in general, there is a retention of secondary and tertiary structure across species as diverse as bacteria and mammalia. The common structural core of P450 has probably been largely conserved during the course of evolution, although modifications have occurred in order to convert the initially cytosolic enzyme of prokaryotes into a membrane-bound one in eukaryotic species. In addition, the variations in substrate selectivity have been ‘tailored’ over the course of biological species development by key amino acid residue changes, primarily in the SRS regions. Furthermore, redox partners have altered over evolutionary periods, changing from the iron-sulphur redoxins in most bacterial systems and in mitochondrial P450s to flavoprotein reductase in the microsomal system, together with the unusual Bacillus megaterium P450 (CYP102) which comprises fused reductase and haemoprotein domains. There is considerable potential for harnessing the important catalytic properties of P450s, such as CYP102, for biodegradation and other commercial applications150,151  including the design of industrial catalysts for processes which would otherwise be extremely protracted and difficult. However, the main interest in P450s lies in their major role in the metabolism of drugs and other xenobiotics where safety assessment of novel chemical entities is particularly relevant to the pharmaceutical industry.152  In conclusion, it should be recognised that much is understood about P450 but there are likely to be more surprises in our quest for further knowledge of this extraordinary enzyme family.

CYP is the standard abbreviation for cytochrome P450 when referring to the particular gene, enzyme family, subfamily or individual enzyme. However, P450 (plural: P450s) is employed in reference to cytochromes P450 in general. CYP is italicised with reference to specific P450 genes.

ER

endoplasmic reticulum

QSAR

quantitative structure-activity relationship

bya

billion years ago

mya

million years ago

P

octanol/water partition coefficient

D

distribution coefficient at pH 7.4

SRS

substrate recognition site

rmsd

root mean square distance

ROS

reactive oxygen species

Eo

redox potential

-log K for oxygen binding where K is the equilibrium constant

NCE

new chemical entity

SDM

site-directed mutagenesis

PDB

protein databank

Mr

relative molecular mass

pKa

−log Ka where Ka is the acid-base dissociation constant

The financial support of GlaxoSmithKline Research & Development Limited, Merck Sharp & Dohme Limited, ExxonMobil Biomedical Sciences Incorporated, British Technology Group Limited and the University of Surrey is gratefully acknowledged by DFVL. Yuko Ito would like to thank the Japanese Foundation for funding a Visiting Scientist Scholarship, and the Daiwa Anglo-Japanese Foundation for a travel grant.

1

The cysteine is present within a well-conserved decapeptide that constitutes the P450 signature motif by which the presence of the enzyme can be identified via sequence analysis.
