Foreword: Phenotypic Drug Discovery: A Personal Perspective
Published: 09 Dec 2020
Special Collection: 2020 ebook collection
Series: Drug Discovery
J. A. Lee, in Phenotypic Drug Discovery, ed. B. Isherwood and A. Augustin, The Royal Society of Chemistry, 2020, pp. P009-P020.
Foreword
Contemporary drug discovery utilizes a spectrum of lead generation/optimization approaches ranging from hypothesis-driven strategies, which associate specific molecular targets with disease states (target-based drug discovery, or TDD), to biology-/compound-driven approaches, which are mechanism-agnostic and based on empirical observations (phenotypic drug discovery, or PDD). In the past 10 years the use of modern PDD has increased dramatically from an estimated <10%1 to 25–40% of the project portfolio of select companies.2,3
PDD Comes of Age
Prior to 2010, target-agnostic phenotypic approaches were (1) utilized in lead generation to identify molecules which led to novel therapeutics;4–7 (2) incorporated into a mechanistic systems biology assay platform;8 and (3) used as the experimental foundation for the first open innovation program in pharma.9 In 2011, David Swinney published a landmark paper, “How Were New Medicines Discovered?”,10 which analyzed the strategies used to discover new molecular entities (NMEs) approved by the US Food and Drug Administration (FDA) between 1999 and 2008. The principal findings of the study were a surprise to pharma: (1) the majority of first-in-class (FIC) small-molecule drugs were discovered empirically through PDD, whereas the majority of followers were discovered using TDD; and (2) the overall rate of FIC drug launches from PDD was two to three times higher than that from TDD strategies,10 during a period when less than 10% of pharma lead generation was estimated to use PDD.1
Experimental work from early adopters addressed tactical issues related to PDD operations, such as assay robustness/statistical validation, compatibility of PDD with contemporary pharma workflows (structure–activity relationship support, computational hit expansion), and PDD's ability to identify lead molecules which were differentiated from standard of care and utilized novel mechanisms of action.9,11 Because PDD practitioners originate from diverse scientific fields and collaborate across multiple therapeutic areas, an informal global community of practitioners assembled with the help of an online PDD special interest group (PDD SIG) on LinkedIn, PDD SIG meetings and PDD classes at the annual Society for Laboratory Automation and Screening meeting, special journal issues focused on PDD,1,12 and the initiation of a Keystone meeting dedicated to PDD: “Modern Phenotypic Drug Discovery: Defining the Path Forward”.13 These various forums provided a means for scientists and biopharmaceutical managers to broadly discuss and debate the pros and cons of PDD, best practices for PDD, and its role in drug discovery in the post-genomic world.
Managing Risk
Although difficult to quantitate, utilization of PDD does not appear to be uniformly embraced in biopharma. Publications from several pharmaceutical companies have suggested an integration of phenotypic approaches into their discovery process,2,3,14–17 with AstraZeneca and Novartis indicating that PDD, broadly defined, contributes 25%3 and up to 40%2 of their respective discovery portfolios. In contrast, other pharmaceutical companies utilize “PDD-lite” for focused investigation of new biology for established targets, have discontinued their use of PDD, or have yet to utilize PDD. This heterogeneous uptake of PDD is not unexpected, based on discussions among members of the PDD community. PDD was recognized to face intellectual, emotional, and business hurdles, as most practicing scientists today, both chemists and biologists, were trained when TDD strategies were prevalent and molecular cloning, ready access to purified proteins, and protein three-dimensional structure determination were commonplace.1 PDD is not a magic bullet or an easy path forward. Tactical considerations include the front-loading of resources (especially in vivo pharmacology; absorption, distribution, metabolism and excretion; and toxicology); the need for data-informed, dynamic flow schemes; and hurdles to the development and qualification of assay systems that mimic disease biology and build the chain of translatability.18,19
Questions related to target identification/deconvolution (TID) are inherent to any discussion of PDD.1,18 Technical advances in single-cell RNAseq, chemical biology, protein affinity tools, CRISPR/Cas-9 functional genomics and cellular profiling have greatly expanded the toolbox for TID studies, as discussed in several of the following chapters and reviewed recently.20 Although progress in these methodologies represents a technical breakthrough, TID studies rarely have a “one size fits all” solution; details related to the specific biology, the experimental system and attributes of lead compounds inform the initial choice of TID methods/strategies, but unfortunately do not guarantee success.
Multiple business, scientific, and managerial factors influence decision points related to PDD and TID.1,18 Considerations include organizational risk tolerance, unmet medical need, competitive landscape, potential to improve/differentiate from standard of care, translatability of discovery phase models, discovery of predictive biomarkers, and balancing the advantages of TID against the absence of an FDA requirement to identify a molecular drug target before clinical trials or final drug approval.21 These elements form a complex decision matrix for analysis of potential cost/benefits of PDD, including risk-tolerance of an organization, in formulating next steps.1,18,19
Added Value
Although the multidimensional cost/benefit/risk tolerance matrix of each potential therapeutic is unique, the development and FDA approval of lenalidomide for multiple myeloma and myelodysplastic syndromes is a notable case study. Inspired by observations that thalidomide effectively treated leprosy, modulated multiple anti-inflammatory cytokines, inhibited angiogenesis and showed activity in multiple myeloma,22 higher potency thalidomide derivatives were synthesized and gained initial FDA approval for multiple myeloma in Q4 2005 followed by additional oncology and auto-immune indications (for review see ref. 23 and 24). In accordance with the FDA's guidance from the Center for Drug Evaluation and Research and the Center for Biologics Evaluation and Research,21 product launch preceded the identification of the molecular drug target by 9 years when the unprecedented molecular target and mechanism of lenalidomide was elucidated (alteration of E3 ubiquitin ligase (cereblon) substrate specificity to redirect the chemical modification and degradation of target proteins).25 The return on investment for utilizing this somewhat risky development strategy included first-year product revenues of $773.9 million,26 which have steadily increased to projected revenues of >$15 billion per year in 2020.27
Drivers for PDD center around the molecular target-agnostic, biology-first philosophy underlying the strategy and the resulting disproportionate number of FIC medicines derived from the approach.10 The discovery and development of therapeutics for hepatitis C and cystic fibrosis illustrate how empirical PDD identified FIC drugs using novel molecular mechanisms with consequent therapeutic impact. Hepatitis C is a liver disease caused by the hepatitis C virus (HCV) leading to >100 000 fatalities annually. HCV replication inhibitors, direct-acting anti-virals (DAAs), successfully treat hepatitis C and clear virus in >90% of infected patients. DAAs are composed of drug combinations with a common drug component that was initially discovered by PDD and which binds to NS5A, a protein essential for HCV replication with no known enzymatic activity.6,28,29 Clinical studies with DAAs indicate that with appropriate infrastructure and high levels of HCV screening, Australia is on course to eliminate hepatitis C in 10 years.30 Cystic fibrosis is a progressive disease which decreases the average life expectancy of patients to 38 years. Cystic fibrosis is caused by mutations of the cystic fibrosis transmembrane conductance regulator (CFTR) gene which, in general, decrease CFTR channel function or interrupt CFTR intracellular folding and plasma membrane insertion.31 Target-agnostic screens using cell lines expressing wildtype or disease-associated CFTR variants identified compound classes which improved CFTR channel gating properties (potentiators) and an unexpected mechanism class which enhanced the folding and plasma membrane insertion of CFTR (correctors).6,28,29 Ivacaftor, a CFTR potentiator, was approved by the FDA in 2012 and provided treatment for 5% of cystic fibrosis patients. Clinical trials using ivacaftor in combination with CFTR correctors led to the FDA approval of tezacaftor–ivacaftor in 2018 and the approval of elexacaftor–tezacaftor–ivacaftor (two correctors, one potentiator) in 2019, therapeutics which are predicted to treat 46% and 90% of cystic fibrosis patients, respectively.32,33
Consideration of several PDD-derived therapeutics tested in recent clinical trials buttresses the notion that empirical, target-agnostic drug discovery strategies often reveal novel mechanisms of action, and, intriguingly, may challenge our traditional notions of a “drug target”. Rather than working through specific molecular targets, many clinical-phase PDD therapeutics modulate specific cellular processes through ill-defined molecular mechanisms or by engaging multicomponent “cellular machines”, which are poorly defined as molecular targets. These include NS5A modulators, which inhibit HCV replication through a specific target which functions by an unknown mechanism;7 drugs which enhance CFTR folding and intracellular processing;6,29 therapeutics which redirect the degradation of intracellular proteins;25 translation inhibitors specific to PCSK9;17 and modulators of specific pre-mRNA splicing.14,15,34 Perhaps it is time to expand the concept of “drug target” to include specific signaling pathways and cellular processes in addition to specific molecular entities.
Long-term Innovation
I previously suggested that the lack of sustained NME productivity across the industry (prior to 2012) did not appear to reflect technology advances created by the genomics revolution and that PDD may provide a means to enhance pharma productivity and potentially minimize the number of “me-too” TDD drug discovery projects.1 Fortunately, the NME production trend has shown an overall steady improvement since 2011 with a current 5-year rolling approval average of 44 new drugs per year, double the low point of 22 in 2009.35 In contrast, overall pharma innovation, as measured by the FDA approval of FIC medicines (expressed as a percentage of total approvals), has shown a slight downward trend since 2014.36 A detailed breakdown of FIC medicines reveals that therapeutics based on recombinant protein production and other novel therapeutic approaches (antisense, antibody conjugates, cell and gene therapies) have trended upward, whereas the number of small-molecule FIC medicines has been flat, contributing approximately 5.8 NMEs per year37 in alignment with previous estimates of five per year.38,39 This trend suggests that the increase in FIC productivity and biopharmaceutical innovation in recent years is due to the maturation of technologies associated with biologics and new therapeutic modalities. Although it is heartening to realize a delayed impact of the genomics revolution on pharma innovation, it is important to note that the majority of new therapeutic modalities such as oligonucleotides/RNAi, PROTACs (proteolysis targeting chimeras), next-generation peptides, antibody drug conjugates, and gene therapy40 are based on a therapeutic hypothesis which links a specific target protein, mRNA, or gene to the disease biology, all TDD strategies.
In general, the uniformity of a TDD process provides easily defined metrics for project support and portfolio prioritization decisions which mitigates perceived risk and which casts TDD as the “low-hanging fruit” with a higher probability of success.18,19 However, with a long-term goal of developing innovative FIC therapeutics for unmet or underserved medical needs, the choice to utilize PDD involves more than project metrics and risk profiles. As highlighted in this article, PDD has uncovered unexpected disease-relevant mechanisms and revealed novel molecular targets and target classes. In contrast to TDD, PDD should be viewed as a drug discovery strategy which potentially leverages the vast biological diversity of novel drug target classes and “drug target space”.
How vast is drug target space? Historically, drug targets are proteins; integrated database mining of the human genome estimates that only 9% of the human proteome is known to engage drugs and tool compounds.41 Of the remaining 91% of the human proteome which is unliganded, 31% represents known drug target families (G protein-coupled receptors, nuclear hormone receptors (NHRs), kinases, channels, transporters, transcription factors and enzymes), suggesting that there may be up to three times more drug targets than currently utilized.41 The remaining 60% of the unliganded proteome belongs to protein families which are of current unknown significance to the biopharmaceutical sector.41 But, as illustrated by the novel protein classes found to be “druggable” by PDD, it is possible that additional protein drug targets may exist within this “proteomic dark matter”.
Therapeutic molecules targeting RNA have been of historical importance in infectious disease research (reviewed in ref. 42). Clinical-phase compounds derived from PDD highlight the importance of RNA splicing as a therapeutic mechanism of action in diverse human disease states.4,14,34 RNA as a clinically relevant drug target has been demonstrated by the development of SMN2 pre-mRNA splice modulators43,44 (reviewed in chapter 8). In addition to RNA splice modulators, the diverse functions associated with non-coding RNAs45 and the occurrence of potential binding sites within the tertiary structures of folded RNA have stimulated interest in structure-based drug discovery on RNA drug targets,46,47 although to date, existing RNA-targeted therapeutics have been derived exclusively from phenotypic approaches.42,46,47
Although speculative, recent advances in cell biology may expand the traditional definitions of protein and RNA-based drug target space. For example, recent work suggests that microproteins encoded by “non-coding RNAs” may have broad biological functions.48 Generally, a protein is inferred from genomic sequence data by bioinformatics analysis where open reading frames (ORFs) starting with an AUG are associated with canonical sequence motifs and are predicted to encode a protein ≥100 amino acids, considered a minimum size for folded protein domains. Analysis of RNA fragments that are protected from nuclease digestion by ribosomes, and hence may be translated into polypeptides, identified more than 5000 previously unannotated ORFs which utilize non-canonical start codons and are less than 100 residues.48 More than 200 of these unannotated ORFs were confirmed by proteomics analysis; in addition, CRISPR-Cas9 disruption of 2353 selected unannotated ORFs identified more than 400 which stimulated the growth of human K562 leukemia cells and induced pluripotent stem cells. Subsequent work with a handful of non-canonical ORFs encoded by long noncoding RNAs revealed that these microproteins can interact with other cellular proteins and exhibit distinct subcellular localizations.48 Previous work detected microproteins in cardiac tissue49 and is consistent with a physiological role for small unannotated ORFs. It is tempting to speculate that a subset of the >5000 unannotated ORFs or microproteins may be disease-relevant and potential molecular targets for future therapeutic intervention.
Biomolecular condensates (BCs) are an emerging area of cell biology which has sparked initial interest in the biopharmaceutical community.50 BCs are micron-scale subcellular structures found in the nucleus and cytoplasm of eukaryotic cells which function to concentrate and segregate diverse proteins, cellular machines, and RNA into various supramolecular complexes (for review, see ref. 51). Examples include nucleoli (ribosome biogenesis), Cajal bodies (assembly of RNA spliceosomes), promyelocytic leukemia nuclear bodies (potential roles in apoptotic signaling, antiviral defense and transcriptional regulation), processing bodies (mRNA and microRNA processing and mRNA silencing), stress granules (RNA processing and storage during cellular stress), and Lewy bodies (associated with Parkinson's disease). BCs are not typical organelles bounded by a lipid bilayer, but form through liquid–liquid phase separations, driven by interactions between various classes of multivalent molecules. BC research is an emerging area of cell biology51 with a nascent but potentially intriguing connectivity to drug discovery.50,52
Drug target space may therefore be vast, and its exploration is likely to expand the number of molecular target families which are currently considered “druggable”. Moreover, translational validation of a molecular target to a disease state is risky, difficult, and without guarantees of success.53–56 This notion is supported by a bibliographic database-mining study which demonstrated that research publications tend to focus on molecular targets that were the subject of previous published work.57 Barriers to widespread target validation studies may include uncertainty and time constraints imposed by grant support of academic research and cycle-time constraints and risk aversion in the for-profit pharma sector. Regardless, exploring the potential therapeutic linkage of novel targets and disease states would be greatly aided by the development of disease-selective pharmacological tool compounds and advanced leads identified by target-agnostic PDD.
Closing Remarks
The genomics revolution has enhanced innovation in the biopharmaceutical industry through the development of new therapeutic modalities such as antisense, antibody conjugates, cell and gene therapies. However, these therapeutic modalities require information linking the molecular target to the disease state. Obtaining this translational knowledge, termed target validation, can be problematic.53–56 Bibliographic database mining of almost 20 million papers published between 1950 and 2009 revealed a publication bias for two well-documented drug target families. Specifically, the analysis of Edwards et al.57 found that 65% of the kinase manuscripts and 75% of NHR publications focused on 50 kinases (out of 550) and six NHRs (out of 48) that were most studied in the 1990s. This suggests that the majority of basic research focuses on extending the characterization of known drug targets rather than exploratory work on less characterized family members. Difficulties in target validation may be reflected by the high percentage of failures in phase 2 and 3 clinical trials due to lack of efficacy: 64–84% of selected phase 2 trials3 and >50% of phase 3 trials industry-wide.58 Notably, these late-stage clinical failures occur after the clinical candidate is typically known to engage the molecular target in vivo.
The sheer size and molecular diversity of potential drug target space further confound efforts to translationally link specific molecular targets to disease states. Conservatively, the potential drug target space is reflected by the unliganded human proteome of known drug target families (31%);41 in addition, potential, perhaps speculative, drug targets associated with the 60% of the human proteome not currently recognized as drug targets,41 non-coding RNAs,45 and microproteins encoded by “non-coding RNAs”48 may significantly expand potential drug target space.
Finally, the concept of target validation implies that a single molecular target underlies the disease state biology, a concept which has been called into question by discussions of network pharmacology, multitargeted drugs, and poly-pharmacology.59–62 Difficulties in target validation and the confounding effects of poly-pharmacology are highlighted by a study where 10 cancer drugs (seven of which have been used in >29 clinical trials) were tested after their six putative molecular targets were systematically knocked out by CRISPR.63 Although more than 180 publications linked these various molecular targets with proliferation of various cancer cells, CRISPR knock-out of each alleged molecular drug target did not affect either cell proliferation or drug efficacy, suggesting that these advanced oncology compounds act via off-target effects rather than through their “validated” molecular targets.63 Taken together, it is likely that multiple factors contribute to difficulties in translational target validation and to the high failure rates of phase 2 and 3 clinical trials3,58 due to lack of therapeutic efficacy.
Unlike TDD, PDD is independent of the validation state of the molecular target. One of the strengths of an empirical PDD approach is that it is a biology-first, target-agnostic drug discovery strategy. This perspective has provided examples of FIC medicines derived from PDD which have revealed unexpected mechanisms of action, identified unanticipated molecular targets, substantially improved patient quality of care, and provided annual product revenues >US$1 billion to organizations which successfully balanced the multidimensional cost/benefit/risk tolerance matrix associated with the use of PDD.
PDD is not a magic bullet or an easy path forward and multiple considerations such as business, scientific, managerial, and organizational strategy/risk need to be carefully weighed.18,19 Nevertheless, it is my opinion that PDD approaches are, in the long term, necessary to understand the vast complexities of biology and to provide therapeutics for underserved and unmet medical needs.
It has been my personal pleasure and the highlight of my career to be a part of the PDD community. Special thanks to Ellen Berg, Fabien Vincent, and Dave Swinney for hours of discussion and fun. I wish current and future PDD investigators the best of luck and good hunting in the second decade of “neo-classic”1 drug discovery.