Chapter 1: HTS Methods: Assay Design and Optimisation
Published: 05 Dec 2016
Special Collection: 2016 ebook collection
Series: Chemical Biology
D. Murray and M. Wigglesworth, in High Throughput Screening Methods: Evolution and Refinement, ed. J. A. Bittker and N. T. Ross, The Royal Society of Chemistry, 2016, ch. 1, pp. 1-15.
High throughput screening (HTS) remains the key methodology for finding hit and lead compounds within the pharmaceutical industry and also, recently, in the academic drug discovery community. HTS has changed significantly in AstraZeneca over the last 15–20 years with a massive expansion of the number of compounds available to screen, increasing industrialisation and automation of the process in order to cope with larger numbers of compounds, and more recently, the running of screens from external collaborators via open innovation initiatives. In this chapter we will discuss how this approach to HTS has been developed within AstraZeneca and pay particular attention to how the optimisation and validation of the wide spectrum of assays that we have to deal with in our group is done. We will discuss how we accept assays into HTS from our assay development groups and describe how these assays are validated and optimised for use as an HTS screen.
1.1 Introduction
High throughput screening (HTS) remains the key methodology for finding hit and lead compounds within the pharmaceutical industry1 and also, recently, in the academic drug discovery community. HTS has changed significantly in AstraZeneca over the last 15–20 years with a massive expansion of the number of compounds available to screen, increasing industrialisation and automation of the process to cope with larger numbers of compounds, and more recently, the running of screens from external collaborators via open innovation initiatives. It has also become evident over the last 5 or so years (at least in AstraZeneca) that a more nuanced approach to HTS is required, where a large repertoire of assays is needed that spans from very high throughput “industrial” biochemical assays for targets such as kinases to highly complex cell based phenotypic assays2 on hard to source cells such as primary cells, genetically engineered human cell lines and induced pluripotent stem cells. We are also using a wider range of detection methods, from standard plate reader assays through to technologies such as flow cytometry, imaging and high throughput mass spectrometry. This presents very significant challenges in designing and developing complex cell and biochemical assays for the assay development teams, and huge challenges to the HTS group to run hundreds of thousands to millions of compounds through these assays.
There are perhaps two core models of how to run HTS in drug discovery. The simplest and arguably most efficient model is to limit the repertoire of assays to very few detection technologies, and if the assay cannot run in this mode it will not be run. This allows both operational and cost efficiencies, and increases the productivity of a limited team. However, this model can also limit the impact of HTS on drug discovery by narrowing the targets that undergo HTS. Within AstraZeneca we run a model of HTS where we will try to run complex biochemical or cell based assays as high throughput screens. This promises to find a wider range of hits against a wider range of targets but it does require very sophisticated and costly automation platforms and considerable effort is needed to develop assays robust enough to screen large compound libraries. This requires staff with a wide range of experience and expertise. We need people with experience in running large scale assays and the management of the logistics of such assays. We also require experts in automation, informatics and statistics plus more specialised technologies such as flow cytometry and mass spectrometry. We also need to mirror some, but not all, of this expertise in the assay development teams. This makes staffing of such an HTS department more difficult, and with a need for more specialisation, departments can become less flexible.
In this chapter we will discuss how this approach to HTS has been developed within AstraZeneca and pay particular attention to how the optimisation and validation of the wide spectrum of assays that we have to deal with in our group is done. We will discuss how we accept assays into HTS from our assay development groups and describe how these assays are validated and optimised for use as an HTS screen.
1.2 HTS at AstraZeneca
Within AstraZeneca we have a single global HTS centre that provides high throughput screening for all AstraZeneca disease areas as well as our collaborators who have taken advantage of the various open innovation initiatives that AstraZeneca has launched. The HTS centre sits in an organisation within AstraZeneca called Discovery Sciences. Discovery Sciences supplies a large set of scientific and technical services to AstraZeneca, allowing for consolidation of the expertise and infrastructure to supply these vital components of the drug discovery value chain. This results in the HTS group interacting widely across the business as well as outside of it. In terms of reagent supply and assay development of high throughput screens, this is carried out by a separate group within Discovery Sciences called Reagents and Assay Development (RAD). Although introducing a handover step into the HTS process, this again allows the consolidation of expertise and infrastructure to both save cost and increase quality. However, the handover does present challenges to both the assay development and HTS groups, who must make sure that all assays required for HTS are of the quality that is required to support the costly undertaking of a screen. This organisational structure has led to a considered process of defining the criteria for an acceptable screen, accepting the screen and validating the screening assay to ensure that it is indeed suitable for an HTS campaign without incurring a large bureaucratic burden. Although some have questioned the need for these criteria, it is our experience that the standards defined within them are vital to facilitate the transfer and deployment of successful screening assays. Without this foundation we have found that standards inevitably slip and different practices spring up within and across groups, leading to issues with assays of varying quality being prepared for HTS.
1.2.1 Criteria and Acceptance
HTS is both costly to set up, with a high initial capital outlay, and a demanding process to maintain and run, yet it remains a good return on investment by being the most productive hit finding strategy we employ. To be able to screen millions of compounds and get a set of reliable data is difficult. Equally, HTS is the main method for finding novel chemistry for projects within AstraZeneca and beyond, and it is critical for keeping the pipeline of drug discovery projects filled with high quality chemical equity. Within HTS we have developed a set of criteria that will result in assays that are fit for the task of finding chemical leads.
However, it is worth reiterating that these are not hard and fast rules. What is important is that these guide the scientists to have a conversation regarding what risks are acceptable, where the problems lie and how they can be overcome. It would be our advice to anyone looking at these criteria to assess the quality of assays as early as possible as this will minimise the possibility of re-work later in assay development. These “mini” validation experiments combined with the recommended statistics can really help to define why and how an assay needs to be modified to become a good HTS assay.
The overriding aim is the development of robust assays. In many respects HTS is an anomaly in drug discovery in that the vast majority of data are generated by taking a single concentration of a compound and testing it just once in an attempt to see if it is active against a biological target. Of course there is an element of replicate testing in large HTS collections as there are clusters of compounds of similar structures, but an HTS assay needs to be sensitive enough to detect relatively weak compounds and robust enough that the false positive and negative rates are low. Much focus is on false negatives, as of course we do not like to think we have missed something. However, managing false positives is, arguably, a greater challenge and can lead to compounds being missed as teams try to separate true hits from many hundreds or thousands of false hits. Additionally, the assay has to be of a form that can actually be run on the automation platforms we have or be run on manual workstations at a throughput that allows the assay to complete in the time frame required to allow the flow of projects through a portfolio of assays. An assay also has to be reliable in that if it is run twice it will find the majority of active compounds both times, confirming that the hits are not due to random events. It quickly becomes apparent that there are some key criteria that an HTS assay has to fulfil to maximise its utility in finding hit compounds:
Robustness
Reliability
As simple to run as possible
Affordable
Relevant
In this discussion we will focus on the first three bullet points in explaining how we have generated a set of criteria to help design good HTS assays. Affordable is a given in many respects in that an assay has to fit within a budget. We do run assays with quite a range of different costs but there always has to be a balance between the cost and maximising both how easy the assay is to run and the ability to find hit compounds. Relevant is a key criterion and may seem obvious but is worth stating. The assay has to be relevant to the biological or disease process that we are wishing to disrupt or stimulate. Anything other than this wastes the investment in the screen. Robustness and reliability in many respects overlap, and in fact, robustness should lead to reliability.
1.2.2 Robustness/Reliability
Within HTS at AstraZeneca robustness of an assay is key. In our experience, a lack of robustness is the key reason we will struggle with an assay or in extreme circumstances stop the assay running. Determining robustness is a large topic with many differing opinions. We will discuss what works for us and it should be noted that in many of these topics another criterion we use is to keep things simple and understandable for the scientists doing the screening (and indeed the assay developers) whilst having a fit for purpose set of criteria. In Figure 1.1 we give the criteria that we use when setting out to develop an HTS assay. These are an attempt to generate robust and reliable assays that will pass assay validation. They are derived from our experience across AstraZeneca and other Pharma companies in HTS over the last 15 years and are there to guide the user to make informed decisions rather than being a simplistic check list. They are by no means an exhaustive list but are what we consider to be key. Below we look at some of these criteria one by one.
Z′-factor is a widely used parameter to help determine the robustness of an assay and its use for single shot screening. It is simple to understand and is popular across assay development and screening groups due to its proven utility. We use the robust Z′-factor to determine how sensitive the assay will be in finding hits and as a measure of the robustness of the assay from the performance of the control wells. We do not use Z-factor routinely (although it is calculated in our data analysis package) because we also screen focussed libraries of compounds, whose very high hit rates result in a compound activity distribution that does not define the true central reference for the assay, leading to an artificially low Z-factor. Additionally, sticking to the Z′-factor gives consistency across the assay development and HTS groups when comparing data. The reason why we have adopted the robust Z′-factor {where standard deviation is replaced by robust standard deviation [median absolute deviation (MAD)×1.483] and mean by median in the equation derived by Zhang et al.3} is to remove the influence of outliers on the Z′-factor and to remove the need for human intervention, which can result in people chasing a target Z′-factor value with subjective removal of “outlier” data. Although Zhang et al.3 state that assays can be used with a Z/Z′-factor as low as 0 to give a yes/no answer for an HTS primary screen, our experience has shown us that assays need to have a robust Z′-factor of at least 0.5 to perform robustly and reliably. We of course remain pragmatic and will take assays with a lower Z′-factor when the target is very high value and there is no alternative assay and/or nothing more can be done to improve an assay. 
In these circumstances we will look at other approaches to improve robustness such as replication in the assay, which almost certainly will reduce the number of compounds screened, or perform a quantitative HTS (qHTS) where concentration responses are run as a primary screen, again on a significantly reduced number of compounds.
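The robust Z′-factor described above can be sketched in a few lines. This is a minimal illustration, assuming the standard Zhang et al. form Z′ = 1 − 3(σmax + σmin)/|μmax − μmin| with the substitutions stated in the text (median for mean, MAD×1.483 for standard deviation). The function names and the control well values are hypothetical and are not taken from our data analysis package.

```python
from statistics import median


def robust_sd(values):
    """Robust standard deviation: median absolute deviation (MAD) x 1.483."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    return 1.483 * mad


def robust_z_prime(max_controls, min_controls):
    """Z'-factor with median in place of mean and robust SD in place of SD."""
    separation = abs(median(max_controls) - median(min_controls))
    if separation == 0:
        raise ValueError("control medians are identical; Z' is undefined")
    spread = robust_sd(max_controls) + robust_sd(min_controls)
    return 1.0 - 3.0 * spread / separation


# Hypothetical raw signals for the control wells of one plate (illustrative only).
max_wells = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 140.0]  # includes one outlier
min_wells = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0]

z_prime = robust_z_prime(max_wells, min_wells)
print(f"robust Z' = {z_prime:.2f}, passes 0.5 guideline: {z_prime >= 0.5}")
```

Because the median and MAD are used, the single outlying maximum control well has almost no influence on the result, so no manual removal of "outlier" wells is needed.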
A signal to background ratio (S : B) of >3 is used to ensure robustness and our experience again shows us that a relatively poor robust Z′-factor and a small S : B most likely will result in a poor assay unsuitable for HTS. This may again seem obvious but there is pressure from project teams to run assays, after all not running an assay guarantees not finding hits, and without a clear set of criteria clear decisions are harder to achieve. It is important that assay developers do not try and configure assay parameters solely to ensure the measurement of very high potency compounds to the detriment of a good S : B, especially as HTS most likely will not find such high potency compounds, and even if there were such compounds, an accurate measurement of potency at the HTS stage is not important; detection of active compounds is what we need.
Measuring the percentage coefficient of variation (%CV) across whole plates ensures that the dispensers and readers are functioning correctly, and if they are available, running known pharmacological standards as concentration responses gives confidence that the assay will find active chemistry in a screen, displays the same rank order of potency expected and can reliably estimate the potency of compounds. This in itself does not test the reliability of the primary single shot assay but is the foundation of a reliable assay. It is also important during screening to give confidence that assay sensitivity remains acceptable throughout an extended screening run.
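As a rough sketch of the two simpler statistics mentioned above, the snippet below computes S : B and %CV. The text does not give formulas, so the definitions used here are assumptions: S : B as the mean of the maximum-signal controls divided by the mean of the background controls, and %CV as 100 × sample standard deviation / mean across the wells of a uniform plate. All well values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_background(signal_wells, background_wells):
    """S:B, assumed here to be mean signal divided by mean background."""
    return mean(signal_wells) / mean(background_wells)


def percent_cv(wells):
    """Percentage coefficient of variation across a set of wells."""
    return 100.0 * stdev(wells) / mean(wells)


# Hypothetical control and whole-plate readings (illustrative only).
signal = [100.0, 102.0, 98.0, 100.0]
background = [20.0, 21.0, 19.0, 20.0]

sb = signal_to_background(signal, background)
cv = percent_cv(signal)
print(f"S:B = {sb:.1f} (guideline > 3), %CV = {cv:.2f}")
```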
Our assay development groups also run what we call a mini-validation set to test the reliability of the assay in detecting hit compounds. The mini-validation set is 1408 compounds from our main validation set (see Section 1.2.5) on both 1536 and 384 plates. Although it does not always contain hit compounds against all targets it is a useful set to run in that it does not take much effort, does not use too many precious reagents and will quickly flag issues such as high hit rates or poor reproducibility. The full validation set could of course be used at this stage but as we move to more complex screens with expensive and sometimes hard to source reagents it is usually prudent to use the mini-validation set to preserve these reagents. With these data we can determine some simple parameters and assess the data to determine whether the assay is suitable for hit finding and can be moved to the HTS group for full validation and subsequent transfer. In Table 1.1 we show how the mini-validation data are used to determine, in this case, the screening concentration to be used. In the case of this epigenetic target, we expect a low real hit rate and a high artefact hit rate and the mini-validation data nicely show how we can determine key parameters for the screen at an early stage in the assay transfer. Additionally, we can use the output of the mini-validation exercise to determine the efficacy of any downstream assay to successfully remove false hits and allow the identification of true hits.
Criteria | 10 µM Screening concentration | 30 µM Screening concentration |
---|---|---|
Robust Z′ of each plate | 0.6, 0.6 | 0.6, 0.6 |
Shape of distribution | Pass: tight central peak with long left hand tail | Fail: broad central peak with heavy left hand tail |
Median of compound wells (% effect) <10% | −1.93, 0.94 | −8.46, −14.85 |
Robust standard deviation of compound wells (% effect) <15% | 9.56, 9.89 | 20.36, 17.00 |
Hit rate at Q1−1.5×IQR <5% | 6.7%, 6.4% | 18.8%, 21.8% |
<5% of maximum/DMSO control wells show >50% effect | Pass | Pass |
<5% of minimum control wells show <50% effect | Pass | Pass |
No obvious plate patterns | Pass | Pass |
Predicted confirmation rate >50% ([#confirmed hits/(((#hits run=1)+(#hits run=2))/2)]×100) | 84% | 75% |
A decision regarding the screening concentration needed to be made and mini-validation data were generated at both concentrations. This table summarises the criteria and, as can be seen, the data indicate that the screen should be done at 10 µM, whereas screening at 30 µM results in an assay that is not fit for transfer into HTS. As can be seen, even screening at 10 µM does not pass all criteria (hit rate >5% in all cases) but with a triage strategy we can deal with the relatively high hit rate at 10 µM. DMSO: dimethyl sulfoxide; IQR: interquartile range; Q1: first quartile.
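Two of the criteria in the table, the hit rate at Q1−1.5×IQR and the predicted confirmation rate, lend themselves to a short worked sketch. Several things here are assumptions rather than statements of our actual procedure: hits are taken to be compounds whose % effect falls below the cut-off (the left hand tail), a "confirmed" hit is taken to be a compound called in both runs, and quartiles are computed with the "inclusive" method of the Python standard library. Compound names and values are invented for illustration.

```python
from statistics import quantiles


def hit_cutoff(effects):
    """Cut-off at Q1 - 1.5 x IQR of the compound-well % effect values."""
    q1, _, q3 = quantiles(effects, n=4, method="inclusive")
    return q1 - 1.5 * (q3 - q1)


def call_hits(effect_by_compound):
    """Compounds whose % effect lies below the cut-off (assumed hit direction)."""
    cutoff = hit_cutoff(list(effect_by_compound.values()))
    return {name for name, effect in effect_by_compound.items() if effect < cutoff}


def predicted_confirmation_rate(hits_run1, hits_run2):
    """[#confirmed hits / ((#hits run 1 + #hits run 2) / 2)] x 100."""
    mean_hits = (len(hits_run1) + len(hits_run2)) / 2
    if mean_hits == 0:
        raise ValueError("no hits in either run; rate is undefined")
    return 100.0 * len(hits_run1 & hits_run2) / mean_hits


# Hypothetical % effect values for the same 12 compounds in two runs.
run1 = {"c01": -4, "c02": -3, "c03": -2, "c04": -1, "c05": 0, "c06": 1,
        "c07": 2, "c08": 3, "c09": 4, "c10": 5, "c11": -60, "c12": -70}
run2 = {"c01": -3, "c02": -4, "c03": -40, "c04": -2, "c05": 1, "c06": 0,
        "c07": 3, "c08": 2, "c09": 5, "c10": 4, "c11": -55, "c12": -5}

hits1, hits2 = call_hits(run1), call_hits(run2)
hit_rate1 = 100.0 * len(hits1) / len(run1)
rate = predicted_confirmation_rate(hits1, hits2)
print(f"run 1 hit rate = {hit_rate1:.1f}%, predicted confirmation rate = {rate:.0f}%")
```

In this toy data set only one compound is called in both runs, so the predicted confirmation rate sits on the 50% boundary and would not pass the >50% criterion.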
It is important that both the HTS and assay development groups use similar (ideally the same) equipment to remove any issues that can arise during assay transfer with different equipment that perform differently. We have experienced transfers taking longer than necessary when we have used different equipment across the groups and this leads to whitespace (a term we use to describe downtime) in the project whilst the differences are investigated and corrected. The criteria we apply to the mini-validation assessment are essentially the same as for a full validation and are listed in Figure 1.1. The hit rate and reproducibility are key at this stage. A high hit rate can be particularly problematic and we try to ensure that the full screening cascade is in place prior to transfer so that we can test the ability of the assays to remove false hits and confirm real hits. We also have a small library of problematic compound classes such as redox compounds, aggregators and thiol-reactives to probe the sensitivity of a target and its assay to such compounds, which are commonly referred to as pan-assay interference compounds (PAINS).4 This is a key step in checking the robustness of an assay and also helps us to understand what assays are needed to remove a high hit rate associated with a class or classes of PAINS. This early information allows us to test the validity of the screening cascade. Having this view early on is key to ensure that we can successfully transfer an assay, validate and run the HTS and successfully prosecute its output.
1.2.3 Analysing Data to Define Robustness/Reliability
Another issue that can have a significant effect on the robustness and reliability of an assay is that of plate patterns. Plate patterns are often seen as we move to higher density plate formats and lower assay volumes. Reasons for plate patterns are many and varied, and certainly not a topic for detailed discussion here. One of our criteria is the absence of plate patterns, but unfortunately all too often we have plate patterns and they cannot be removed by taking practical steps such as incubating cell assay plates at room temperature for a period of time before placing them in an incubator. We will accept assays with a plate pattern but only if it can be corrected by the algorithms in our data analysis software, Genedata Screener®.5 Genedata Screener contains sophisticated and proprietary statistical algorithms designed to remove plate patterns whilst preserving genuine actives by looking for consistent patterns across multiple plates. In an HTS group, having access to such correction algorithms is essential. Genedata Screener® is a commercial software package likely out of reach of small and/or academic HTS groups due to its relatively high cost, but there are alternatives available for lower or no cost, such as the B Score algorithm or the R Score algorithm.6–8 Indeed, many different normalisation and pattern correction methods have been described in the literature,9 each with their advantages and disadvantages. These would need to be implemented as some form of software tool and are often too complex to be implemented in Excel, for example, but anyone skilled in the art of programming in R statistics should be able to build a simple application. 
Alternatively, there are a range of freely available software tools, some of which are described by Makarenkov et al.10,11 In our experience, the choice comes down to assessing acceptable performance on your data and ensuring that the scientists running the screen and data analysis at least understand the resulting output. The data analysis package used in HTS is also an important choice and it is worth writing a few words on why we chose Genedata Screener®.
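To give a flavour of what a simple, freely implementable pattern correction looks like, the sketch below follows the general idea of the B Score: a two-way median polish removes row and column effects from a plate, and the residuals are scaled by a robust measure of spread. It is an illustrative sketch only and not the proprietary Genedata Screener® algorithms. The fixed number of polish iterations, the 1.483 scaling of the MAD (scaling conventions vary between implementations) and the simulated 96-well plate are all assumptions made for the example.

```python
import random
from statistics import median


def median_polish(plate, iterations=10):
    """Return residuals after alternately subtracting row and column medians."""
    residuals = [list(row) for row in plate]
    n_cols = len(residuals[0])
    for _ in range(iterations):
        for row in residuals:
            row_median = median(row)
            for c in range(n_cols):
                row[c] -= row_median
        for c in range(n_cols):
            col_median = median(row[c] for row in residuals)
            for row in residuals:
                row[c] -= col_median
    return residuals


def b_scores(plate):
    """Median-polish residuals divided by a robust spread (MAD x 1.483)."""
    residuals = median_polish(plate)
    flat = [value for row in residuals for value in row]
    centre = median(flat)
    mad = median(abs(value - centre) for value in flat)
    if mad == 0:
        raise ValueError("MAD of residuals is zero; scores are undefined")
    scale = 1.483 * mad
    return [[value / scale for value in row] for row in residuals]


# Simulated 8 x 12 plate: row and column gradients plus noise (illustrative only).
rng = random.Random(0)
plate = [[100.0 + 1.5 * r + 0.8 * c + rng.gauss(0.0, 2.0) for c in range(12)]
         for r in range(8)]
plate[3][5] -= 60.0  # one genuinely active well

scores = b_scores(plate)
lowest = min((scores[r][c], r, c) for r in range(8) for c in range(12))
print(f"most negative B score {lowest[0]:.1f} at row {lowest[1]}, column {lowest[2]}")
```

The row and column gradients are absorbed by the polish, so the active well stands out in the scaled residuals even though the raw plate has a systematic pattern. Unlike the multi-plate approach described above, this sketch corrects each plate in isolation.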
Up until 2013, AstraZeneca tended to develop most of its data analysis software in-house, which resulted in very functional software designed to tightly integrate with our processes. We would try to incorporate current thinking around data analysis into our software but the problem we faced with in-house developed software was that once the software was finished the development team was disbanded and the software was not developed any further. We also suffered from many different software packages being used across the business, leading to poor interoperability and increased costs. In standardising our software package we have reduced overall costs, improved interoperability, but more importantly, have invested in software that incorporates the latest thinking in data analysis techniques. With Genedata Screener® being a commercial product it is also regularly updated to keep pace with new screening technologies, such as combination screening and Biacore. As the pace of change in HTS gets faster it is important to be able to respond to this change in both the experimental science you carry out and also in your data analysis approach.
1.2.4 As Simple to Run as Possible
Again, the importance of making an HTS assay as simple to run as possible seems like a very obvious statement to make, but it is easy to overlook when designing HTS assays or thinking about running a million compounds or more. It can seem like the best option is to utilise the investments made in automation at every step. However, it is our experience that automation is not always the best answer. If an assay comes to HTS in a 1536 format and has a low number of additions we will almost always try to run it manually. This is because of the time taken to validate the automation system and/or set up complex automation such that it can run a new screen. Although we are constantly looking for more efficient ways of doing this and have recently invested a significant amount in redesigning our future automation with this in mind, our historic experience tells us that removing the optimisation steps for automation by starting a suitable screen manually can mean faster progression and fewer assay failures. Hence, we are always asking ourselves, what is the best approach to screening for this assay, and can it be simplified to remove processes and increase screening efficiency? It is also worth noting that with some recent phenotypic screens we have run assays over very long time periods and with a large number of steps, which we simply could not have envisaged running manually. By selecting the correct automation and investing in making these systems work efficiently, they are indispensable to our process.
1.2.5 Assay Validation
Once an assay has fulfilled the key criteria and is assessed as suitable to transfer into HTS then the next step is to optimise and validate the assay. Optimisation is the process whereby we check the assay runs in our HTS laboratory and start to optimise it for running at the scale that is required for the screen. Again, having the same dispensers and readers as the assay development teams makes this process much faster. In this phase we determine the throughput of the assay (i.e. how many plates can be run per day or batch) and whether it will run on the chosen automation platform, and once those conditions have been set, whether the assay detects active chemistry reliably. This process we call validation and is the final check before deciding whether to commit to a full HTS. In order to validate assays we have designed a library of compounds selected from the full screening deck that represents the diversity of the compound collection that will be screened. This is important as we can then use this set to assess hit rates and plan the steps needed to mitigate high hit rates. The set consists of approximately 7000 compounds available in both 384 and 1536 plate formats. Furthermore, we have two sets or picks that position the compounds on different plates and wells across the two picks. This is designed to assess any positional effects on the data such as plate patterns or compound carry-over and any consequences this may have for detection of hits. As a minimum we have found running pick 1 and pick 2 on separate days is sufficient to assess the reliability and validity of the assay, but we often run both picks on both days to assess intra- and inter-day variability. Additionally, we also investigate the paradigm of running low molecular weight (LMW) compounds at a higher concentration at this point, having previously defined the concentration at which the main body of compounds will be run during the mini-validation steps described above. 
This gives us an opportunity to differentiate across our compound collection and ensure that the assay is capable of running the LMW subset at a high concentration with the aim of finding all progressable chemistries, as described elsewhere.12
In addition to running the validation plates, we also run a batch of compound plates from the compound collection that is the same size as the daily batch size we plan to run in the screen. We can then place a set of validation plates at the start and end of this batch of plates, allowing us to test, at the same time, the repeatability/reliability of the hit calling and the stability of the assay over the time it takes to run a batch of plates. Although this should confirm the stability of the assay and reagents that have already been determined in the assay development phase, we do find assays where this does not hold up when being automated. Assumptions can be expensive and time consuming to resolve in HTS. Hence, full batch sizes in the format that the HTS laboratory will run the screen are required to de-risk these assumptions.
The validation data are only useful if we can analyse and extract information from them to help with our decision regarding whether to proceed with the full HTS screen. We have made various attempts at analysing the data. In one incarnation of analysing the validation data we had a close collaboration with our statistics colleagues and came up with a complex mixture modelling algorithm looking at patterns of how close the individual replicates were and breaking these down into different populations by their variability. Using these data we could model assay cut-offs and make predictions of false positive and negative results by looking at the frequency of replicates falling either side of the cut-off compared with the other replicates and the average of the replicates. This gave us great insight into the nature of the data we were generating, but the data summaries were difficult for non-statisticians to interpret and understand. Furthermore, with the screeners struggling to understand the data, it made it very difficult to explain them to the drug project teams and so we decided to discontinue using the tool. This in no way says anything about the use of expert statistical input in HTS data analysis, but does say to us that the people running the screens and the teams receiving the output need to understand the methodology to accept the decisions made based on the data. Using the knowledge gained from the mixture modelling analysis of our validation data we have established a series of data plots in Tibco Spotfire™ to present and analyse the data to ensure that the key criterion of repeatability can be assessed. In designing this set of visualisations alongside the criteria in Figure 1.1, we wanted to ensure that the data were accessible and understandable and aided decision making. 
In the example plots in Figure 1.2 we show a Bland–Altman plot13 to compare two of the picks of the validation set, which helps to clearly display any shift in activity between runs and how repeatable the data are, assessment of hit rates at various cut-offs and box plots showing plate to plate variability. These and the other plots in the Spotfire template (two of which are also shown in Figure 1.2) provide a very visual assessment of the data and are great for allowing cross-group discussion and decisions to be made, led by the HTS screener, regarding the validation data. We do not quantify false negative or positive rates but use the visualisations to look for compounds exhibiting this activity and use the plots alongside the criteria values in Figure 1.1. In general, we find false negative and positive rates to be low and not of major concern at the validation stages. False hits due to interference with the assay technology or compound toxicity in cell assays are more of an issue and these are dealt with by well thought out counter assays in a well-designed and validated screening cascade.
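The numbers behind a Bland–Altman comparison of two picks are straightforward to compute. The sketch below is an illustration only and does not reproduce our Spotfire template: for each compound it takes the mean and the difference of the two % effect values, then reports the mean difference (the shift in activity between runs) and the conventional limits of agreement at ±1.96 standard deviations of the differences. The paired values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(run_a, run_b):
    """Per-compound means and differences, bias, and 95% limits of agreement."""
    if len(run_a) != len(run_b):
        raise ValueError("both runs must cover the same compounds")
    means = [(a + b) / 2 for a, b in zip(run_a, run_b)]
    diffs = [a - b for a, b in zip(run_a, run_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return means, diffs, bias, (bias - spread, bias + spread)


# Hypothetical % effect values for the same six compounds in pick 1 and pick 2.
pick1 = [2.0, -5.0, 40.0, 10.0, 0.0, 75.0]
pick2 = [4.0, -3.0, 38.0, 14.0, 2.0, 71.0]

means, diffs, bias, (lower, upper) = bland_altman(pick1, pick2)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

Plotting `diffs` against `means` with horizontal lines at the bias and the two limits gives the familiar plot; a bias far from zero indicates a shift in activity between runs, and wide limits indicate poor repeatability.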
1.3 Summary
The HTS group supports all disease areas in AstraZeneca and also screens from outside of AstraZeneca via our open innovation initiatives. We as a group do not apply any restrictions on the assays we will run as long as they are of a quality that can run at the scale of throughput that is required, we have the equipment needed or can source it, and they fit into our budget. This is what makes HTS such a fundamental and productive hit finding technology within AstraZeneca, but equally this gives us a great challenge in understanding and validating a wide range of assay types and technologies, and illustrates why clear criteria around assay performance are so important. Looking to the future, we perceive that we will be performing more complex cell based assays on rare cell types such as primary human cells. Our experience today tells us that phenotypic cell assays can be challenging in HTS and require a different approach to enable screening, and quite often we have to accept screens that are less robust than those we have accepted for a simpler assay. When considering the range of assays we see in HTS, it is only reasonable that our expectation of performance will be different and we expect assay performance criteria to evolve over time. However, we believe that the process we have in place both helps to define what is acceptable and what is normal for any particular screen; for example, we may accept a cell based assay with borderline acceptance criteria and an average robust Z′-factor of 0.4, and we may set performance expectations that we will fail plates in screening if their robust Z′-factor is less than 0.3. We often set greater expectations for biochemical assays where we commonly see an average robust Z′-factor of 0.7 and will fail plates with robust Z′-factors of less than 0.5. This may seem like a double standard. However, the data we have generated tell us the normal behaviours of an assay and we commit to screening once we have assessed this. 
Additionally, where a screening plate deviates from the normal behaviour this gives us reason to suspect that this plate is different and should not be included in the analysis.
As we move towards using rarer cell types and more complex biological read outs we will need different approaches such as screening smaller numbers of compounds using either replicates in single shot or dose response screening. As technologies progress and we start to look at single cells, our understanding of what a robust assay is and how to define such assays will have to evolve markedly as will the methods used to validate such screens. It is our belief, however, that similar criteria and advice will result in the continued use of HTS libraries and screening processes such that robust assays and valuable data can be generated to progress drug discovery projects.