The development of the European Union required common toxicological standards to remove barriers to inter-community trade, and it has also been thought desirable to have common standards of worker safety and common environmental standards. There are various types of regulatory regimes, including premarketing approval systems and notification schemes. A further type of regulation applies to existing situations, such as air pollution; here, the main role of the regulatory regime is setting standards. Regulation depends on good data, which come from two main sources: proprietary data and studies in the peer-reviewed literature. Both sources have strengths and weaknesses. Proprietary data are generally generated according to good laboratory practice, and guidelines are available for the conduct of many regulatory studies. There are a number of instances of retracted papers in the peer-reviewed literature that have had regulatory impact. For many substances, reference doses are calculated from toxicological data, most often obtained from experimental animals, generally by application of an uncertainty factor, or factors, to the lowest relevant no-effect level in the most sensitive species. Air quality guidelines are predominantly derived from epidemiological data.

Regulatory toxicology is to toxicology as military music is to music (Sir Colin Berry, 2014).

The European Union (EU) has its origins in the European Coal and Steel Community (ECSC), created in 1951 under the Treaty of Paris and comprising the German Federal Republic, France, Italy and the three Benelux countries (the Netherlands, Belgium and Luxemburg). The ECSC essentially created a common market in respect of coal and steel. The European Economic Community (EEC), formed by the same countries under the Treaty of Rome in 1958, extended the common market to other goods and, to some extent, to services. Since then, the EEC has been steadily enlarged, and the Maastricht Treaty effectively created the EU from its predecessor, the EEC, in 1993. The need for common standards in traded goods, largely to remove barriers to inter-community trade, required the development of common toxicological standards, and it has also been thought desirable to have common standards of worker safety (including toxicological aspects thereof) and common environmental standards.1 

Before the EU and its predecessors existed, the individual member states had their own regulations relating to toxicology, but such regulations varied from country to country. Legislation regulating chemicals started in the 19th century: the 1863 United Kingdom (UK) Alkali Act2  is often said to be one of the first pieces of such legislation, although there had been various earlier attempts at smoke abatement (see also Brimblecomb).3  The Pharmacy Act,4  which was the first attempt in the UK to control poisons, was a result of the Bradford sweets poisoning: this was the accidental arsenic poisoning of more than 200 people in Bradford, England, in 1858, with 20 fatalities. Another important UK statute, the Sale of Food and Drugs Act5  was intended inter alia to prevent adulteration of food. At this time, methods of analysis were primitive compared with those available in the 20th century, and only gross contamination could be detected. In many countries, the thalidomide disaster was the impetus towards modern regulation of pharmaceuticals and, in the UK, resulted in the Medicines Act 1968.6  Thalidomide, which was developed in Germany, also had adverse effects there: in 1964, the Bundestag made the testing of new drugs compulsory by amendment7  of the West German Drug Law of 1961,8  and this was followed by the West German Drug Law of 1978.9  Also, many countries, notably France, established systems for pharmacovigilance. Thus, in France, a Centre National de Pharmacovigilance was created by the Conseils de l’ordre des pharmaciens et des médecins (pharmacists and doctors) in 1973. 
In 1976, the pharmacovigilance system became more official with a decree (arrêté du 2 décembre 1976) to establish pharmacovigilance regulations.10  In 1982, the decree n°82-682 (30th July 1982) established the structures and organisation of pharmacovigilance in France.11  Twenty-eight centres were created in the pharmacology/toxicology departments of university hospitals (the number has since been increased). In 2005, a new decree (arrêté du 28 avril 2005) created the practice of Good Pharmacovigilance Practice in France.12 

Many factors seem to influence people’s attitude to risk, including familiarity, control of their own exposure to risk and novelty of the risk. Some of these are discussed in Living with risk: the British Medical Association guide.13  These factors have to be taken into account by any regulatory regime, although the aim is always the same; namely, to protect the public, workers, consumers and the environment.

Diggle14  discussed the various types of regulatory regimes that exist. Premarketing approval systems (also called authorisation or licensing systems) require the organisation wishing to market a substance to first gain approval from the regulatory authority. This is the system under which pharmaceuticals, both those for humans and for other animals, are regulated. A similar system is used for pesticides and biocides. Here, the role of the regulatory authority is to decide whether a substance is sufficiently safe to be marketed or that the benefits outweigh the risks; although, with some substances, e.g. pesticides, a working assumption is made that no individual benefit accrues to people. This is further discussed below. Another type of system is a notification scheme whereby the regulatory authority must be notified of the use, marketing or trading of a substance. Often, the requirements of this type of scheme become more rigorous the more the substance is produced or traded. Yet a further type of regulation applies to existing situations, such as air pollution. Here, the main roles of the regulatory regime are setting standards and risk mitigation.

Robust regulation depends on good data. Evidence relating to the toxicological effects, or lack of such effects, of compounds or products subject to regulation comes from two main sources:

  • (1) Proprietary data, comprising reports of work undertaken by, or on behalf of, the producers of the compounds.

  • (2) Studies in the peer-reviewed literature in which studies are reported by (in most cases) independent workers.

Some take the view that proprietary data should be ignored and only studies in the peer-reviewed literature evaluated; others consider that only Good Laboratory Practice (GLP)-compliant studies, conducted according to regulatory guidelines, should be evaluated. Both types of study have potential strengths and weaknesses,15,16  but the present authors are strongly of the opinion that both types of data should be evaluated if of sufficient quality. Failure to do so could result in the ludicrous situation where, if one insisted only on data from GLP-compliant facilities, data on human poisonings would have to be ignored, as would many animal studies from reputable university departments or other research institutes. Conversely, poorly conducted studies in the peer-reviewed literature would be taken as valid, but a proprietary study, which was GLP-compliant and undertaken to internationally-agreed guidelines, looking at the same endpoint would have to be ignored. Science is a search for the truth, difficult enough in all circumstances, but particularly difficult when an absolutist view is taken of particular sorts of data. As Charles Darwin said, “scientific man ought to have no wishes, no affections, a mere heart of stone”.17  In fact, the greatest enemy of prejudice in science is facts from well-conducted studies.

Attempts have been made to grade scientific information by quality; for example, that of Klimisch and colleagues.18 

An EU Regulation that takes an absolutist approach is regulation 1107/2009, which firmly states that “In relation to human health, no collected data on humans should be used to lower the safety margins resulting from tests or studies on animals.” This regulation concerns pesticides, an area where the use of human experimental data has been particularly controversial (see below).19 

Although studies in the peer-reviewed literature are usually not paid for by industry, journals increasingly require declarations of conflicts of interest; those declared are generally financial. But other, non-financial, conflicts of interest may occur (see discussion by Purchase20 ). Moreover, non-governmental organisations (NGOs) that take the view that only studies in the peer-reviewed literature can be trusted may have their own interests, which they may wish to protect. Leonard21  points out that many NGOs, which purport to represent the general population, are highly dependent on national government or EU money; though, of course, this no more implies an inability to take a detached and disinterested view of issues than it does in the case of research workers (see also Pigeon22 ). Moreover, NGOs may have agendas other than public health (for example, anti-globalisation, anti-capitalism, opposition to intensive farming, dislike for certain companies, etc.) but tend to project their concerns as concerns for public health. The habit of certain NGOs of attacking authors ad hominem and/or their source of funding, rather than addressing the science, is to be deplored.

In fact, few people are truly disinterested, and we all have our prejudices. Thus, with publications in the peer-reviewed literature, it should be reflected that benefits, including recognition and promotion, might accrue to the workers as a result of publication. It should also be recalled that peer-review is not a fool-proof, error-free, process in that reviewers are, to a large extent, dependent on the honesty of those submitting work for review. If such honesty is lacking, then peer-reviewers can be misled and the process of quality assurance fails (instances of retraction of papers in the peer-reviewed literature are discussed below). The glaring problem with peer-review, as used by scientific journals, is the general inability of reviewers to access raw data. Further, when comparing the two sources of data, it should be remembered that the producers of compounds stand to lose a great deal, in terms of reputation and finance, should their submissions turn out to be flawed. No ethical producer stands to gain by marketing a compound which has not been rigorously tested or for which the results of such tests have been falsified. Quite apart from anything else, clearance of regulatory hurdles using false data would be unlikely to protect producers against lawsuits, at least in common-law countries.

As discussed above, the value of work published in the peer-reviewed literature is sometimes stressed, to the disadvantage of the work provided to regulators in the proprietary literature. Such comparisons are based on the perception that the proprietary literature is likely to be biased in favour of the products studied, in that industry, in turn, is paying for the study; however, many studies are carried out by contract toxicology houses, which have no financial interest in the regulatory consequences of the results of their studies (as long as they get paid!). Whether such criticism of the proprietary literature is justified is thus open to question. It is also perceived, perhaps wrongly, that proprietary data are more difficult to access than those published in the open literature. This has led to such data being described as grey literature, although they can often be accessed through national and international regulatory bodies.

In the 1970s, there were scandals involving scientific misdeeds and fraud at toxicology laboratories, the best known case involving the firm Industrial Bio-Test (IBT), and a result was the establishment of GLP regulations by the US Food and Drug Administration (FDA), finalized in 1979. In 1983, the US Environmental Protection Agency (EPA) established similar guidelines for pesticide toxicology studies and, in 1989, extended them to cover all research data submitted for the purposes of pesticide registration. Because studies are designed to support marketing in multiple jurisdictions, GLP was widely adopted throughout the world. GLP makes falsification of data very difficult and provides a paper-trail, as do other requirements of GLP such as the retention of data, samples and specimens. However, the efficacy of GLP depends crucially on a good quality incorruptible GLP inspectorate. See also Marshall23  and Seiler.24 

In addition to the requirement for GLP, with many regulatory regimes, unless there is a good scientific justification for deviation, there is an obligation that studies be carried out in accordance with guidelines such as those produced by the Organisation for Economic Co-operation and Development (OECD) or, for pharmaceutical toxicology, the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH). In addition to the latter, there is a veterinary equivalent, the International Cooperation on Harmonisation of Technical Requirements for Registration of Veterinary Medicinal Products (VICH). Also, most EU regulatory bodies have their own test guidelines. The OECD is based in Paris, whereas ICH and VICH are nomadic, although they have permanent secretariats. ICH is located in Switzerland; VICH is based at Health for Animals, Brussels.

At the meetings of these bodies which produce guidelines, most attendees are from regulatory bodies in various countries or from international regulatory bodies such as the European Medicines Agency. In producing guidelines, organisations have to consider animal numbers, a major consideration being adequate statistical power to detect the effects of interest while minimizing the possibility of type I or type II error (false positives and false negatives in statistical hypothesis testing). Choice of species is based upon availability of animals and background knowledge of the species. Thirteen-week studies are usual in rodents and in one non-rodent species, usually dogs. Long-term/carcinogenicity studies are normally done in rats and mice, developmental toxicity studies in rats and rabbits and multigeneration studies of reproduction in rats, occasionally in mice. Testing for eye and skin irritancy is carried out in rabbits; sensitising potential is assessed in guinea pigs. Occasionally, particularly for insecticides also used in veterinary medicine, there may be data on farm and/or pet animals: these data are of limited use in risk assessment of pesticides as, with the exception of dogs and rabbits, such animals are not well validated for human risk assessment, there being comparatively few data on appropriate uncertainty factors (UFs) for extrapolation to man. Most studies are designed so as to elicit no-observed-adverse-effect levels (NOAELs) or values from which benchmark doses (BMDs) can be estimated. A BMD is an estimate of the dose that results in a predetermined level of change, based on regression analysis of the response against dose, and is an alternative to the NOAEL (see also Section 1.6.1). The NOAEL or BMD can then be used for the calculation of reference doses/health-based guidance values (RfDs). For mechanistic studies, a major consideration is the likely similarity of the response to that of humans. 
A special case is organophosphate-induced delayed polyneuropathy (OPIDP), where the standard test is carried out in hens (Gallus gallus domesticus). In most cases, the guidelines suggest a study designed to elicit NOAELs to facilitate risk assessment. This is not the case with the hen test for OPIDP, where the basis for quantitative extrapolation from the hen to humans is too uncertain for NOAELs/BMDs to be of regulatory use; nor are NOAELs obtained from typical tests of genotoxicity.

The rigidity and narrowness of guideline test protocols have been criticized, as have the statistical tools used and the criteria for using toxicity data in regulation.21  The expense of the studies has also been criticized; this is partly a function of the number of animals used. In this respect, it should be noted that increasing the number of animals in a study diminishes the probability of both type I and II errors. Certainly, tough regulatory regimes weigh particularly heavily on small companies and may therefore inhibit innovation and competition: this is true of all chemical regulatory regimes, although attempts have been made to remedy the problem, particularly with pharmaceuticals. But it was accepted after thalidomide that tight regulation of pharmaceuticals was a price worth paying, even if it involved loss of some potentially beneficial products.
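The trade-off between animal numbers and statistical error can be illustrated with a standard two-sample power calculation (a textbook normal approximation, not a formula taken from any regulatory guideline; the effect sizes below are purely hypothetical):

```python
import math
from statistics import NormalDist

def n_per_group(delta_sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate animals per group needed to detect a mean difference of
    `delta_sd` standard deviations in a two-sided two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type I error rate
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error rate
    return math.ceil(2 * ((z_alpha + z_beta) / delta_sd) ** 2)

print(n_per_group(1.0))  # 16 per group for a large (1 SD) effect
print(n_per_group(0.5))  # 63 per group for a moderate (0.5 SD) effect
```

Halving the size of the effect to be detected roughly quadruples the required group size, which is why statistical power dominates both the cost of studies and the animal-number criticism.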

In general, it is true that a very well-argued case is necessary in order to justify departing from the applicable guidelines.

Many university departments and other (non-commercial) research institutions lack the facilities to be GLP compliant. Furthermore, studies in the peer-reviewed literature may lack sufficient detail for independent replication, and important details about the test substance may be missing, especially about its purity. This may not be the fault of the authors as editors and/or reviewers often ask for papers to be shortened. The key defence against impropriety in research establishments is the provision of sufficient information to allow replication of the study and retention of raw data. There is, in addition, often a fundamental difference between work done in research laboratories and that done specifically for regulation. For example, research workers tend to be interested in mechanisms of action of compounds and in the use of compounds to explore physiological processes. The standards by which such work should be judged are those of research rather than those of compound regulation. Nevertheless, there is a not inconsiderable body of evidence suggesting that a proportion of scientific papers cannot be reproduced.25,26  Indeed, in the area of social sciences and physics, there is a hilarious example of a spoof article having been published under the impression that the article was genuine science.27  One lesson to be learned is the need for more replication studies.26 

A number of articles in the peer-reviewed literature on toxicological matters with regulatory significance have been withdrawn. Several notable examples exist. In the realm of endocrine disrupting chemicals, a paper by Arnold et al.28  was retracted a year later.29  In brief, this paper reported that certain organochlorine insecticides and hydroxylated polychlorinated biphenyls (PCBs), which have a weak estrogenic activity when acting alone, were up to 1000 times more potent in mimicking estrogen when tested in combination. Another pesticide-related retraction concerned work on amitraz published in 2005 by Rodriguez et al.,30  retracted in Environmental Health Perspectives in 2012,31  and a study on maneb, a fungicide, and paraquat, a herbicide, by Thiruchelvam et al., published in 2005,32  was also retracted in 2012 (see also Office of Research Integrity33  and Buckley34 ). Both of the last two articles concerned Parkinson’s disease and pesticide exposure and were much cited; the long period between publication of the studies and retraction is extremely worrying. A study by Albanito and colleagues35  on atrazine, a herbicide, was retracted in 2014.36  Another retraction with regulatory implications was a paper by Séralini and his colleagues published in 2012.37  This concerned Roundup, another herbicide, whose active ingredient is glyphosate, and a Roundup-tolerant genetically-modified maize; it was not a matter of misconduct but rather of experimental design (see also Arjó38  and Resnik39 ). In pharmaceutical toxicology, an instance of impropriety related to Debendox (a combination of pyridoxine [vitamin B6] and doxylamine succinate).40  Unfortunately, journals do not always make it easy to ascertain the reason(s) for retraction, some simply publishing one-sentence retractions with no further explanation, while in others the retraction is uninformative, e.g. the study on lung and nervous system toxicity of manganese by Hałatek et al., published in 2005,41  with retraction in 2012,42  because of “inappropriate use of previously published work.” More information can sometimes be gleaned through websites such as Retraction Watch. Many of these examples of article retractions come from the USA: this is almost certainly a reflection of the fact that that country possesses a powerful organisation to investigate research malpractice. This is regrettably not true of many other countries.

A further problem with the peer-reviewed literature is that quality varies from journal to journal. The quality of a journal can, to some extent, be assessed by its impact factor, by whether it is indexed in abstracting databases and by examining its website to see whether it operates a robust peer-review system.

Occasionally, the same study may be available to regulators both as proprietary data and as a published paper. The published paper will inevitably be much shorter and probably easier to comprehend. However, the regulator must resist the temptation to ignore the longer proprietary study: it may contain important detail not presented in the published version and there may even be discrepancies between the two. GLP rules mean that the proprietary version is likely to be the more accurate when there are discrepancies: GLP makes any “improving” of data, for example by removal of outliers from data sets, easy to trace. Most importantly, the proprietary version will also contain the raw data in the form of comprehensive tables and graphical representations, thereby allowing regulators to examine the study results in great detail.

A technical difficulty with using many non-guideline studies for risk assessment is that they may not be designed to elicit NOAELs or to define BMDLs and cannot therefore be used for quantitative risk assessment.

Human data may be of proprietary origin, for example, population and workforce epidemiology studies and human experimental studies. Studies may appear in the peer-reviewed literature and most of these will be epidemiological in nature. Human experimental studies decrease the amount of uncertainty in deciding the effect of chemicals on human populations because there is not the need to consider uncertainty created by extrapolation from animals to humans. As discussed above, EU legislation prohibits the use of NOAELs from human experimental studies to relax acceptable daily intakes (ADIs) and Acceptable Operator Exposure Levels (AOELs) based upon animal studies in respect of pesticides. There are clearly ethical dilemmas when exposing humans to chemicals when no individual benefit can accrue to those taking part, and such dilemmas in relation to pesticidal studies have been extensively discussed (see, for example, London et al.43 ). But progress in pharmacology, for example, would be extremely difficult without human safety studies, where, again, those taking part can, in general, expect no personal benefit. The results of studies in humans can be used in the evaluation of drugs used in veterinary medicines. For example, a dose determined to be a no-effect level (NOEL) for analgesia in human subjects may also be used to establish a pharmacological NOEL to be used in the elaboration of maximum residue limits or levels (MRLs) for veterinary medicinal use. Moreover, adverse effects noted with clinical use of a particular drug in humans can be taken into account when assessing the safety of the same drug for veterinary use, either to consider whether the same effects might occur in animal patients or when considering consumer safety.

Epidemiology studies on human populations may be available. These may be workforce cohort studies, but these are often too small and/or of too short a duration to be of much use, and exposure of workers nowadays is usually low. Population studies may be cohort or case–control studies, but exposure data are often incomplete and there may be mixed exposures; furthermore, recall bias and confounding factors may be a problem. Crucially, epidemiology shows associations, not causation. The Sir Austin Bradford Hill features of causal associations (often called the Bradford Hill criteria) are helpful in looking at causation,44  but it should be noted that none of the features is absolute except temporality and that there is no unequivocal test of causality. Where there are a large number of studies, meta-analysis may help resolve conflict between the results of individual studies, and there are a number of organisations that specialize in this area, notably the Cochrane Collaboration.

In the air pollution field, reliance is placed on epidemiological studies rather than on predictions of effects in man based on work in animal models (see below). Better understanding of the effects of air pollutants on health has stemmed from the increased application of epidemiological methods. Thus epidemiology, not toxicology, has been the key discipline in revealing effects of low concentrations of air pollutants. Estimates of risks associated with exposure to air pollutants are based on two major types of epidemiological study: time-series studies and cohort studies. Intervention studies have also been carried out. Time series studies are less labour-intensive than cohort ones in that time series studies do not require knowledge of individuals. Cohort studies, however, look at the effects of long-term exposure to air pollutants and have revealed, specifically for particulate matter, a much larger effect than that predicted by time-series work. Currently available epidemiological techniques have been effective in detecting effects on health at concentrations of air pollutants which would, in earlier days and in industrial settings, have been regarded as harmless or, at least, likely to be associated with only minor effects. This has led to a revolution in thinking and has set difficult questions for toxicologists: some of these questions remain unanswered. Many toxicologists who have worked in the air pollution field over the past thirty years have made what might be described as a personal journey from guarded disbelief in, to cautious acceptance of and attempts to explain the findings of epidemiological studies of the effects of air pollutants on health.

Expert panels, such as those of the European Food Safety Authority (EFSA), have a problem in that many of the experts in a particular field may have been in receipt of financial support from industry and might therefore be perceived as not being disinterested. The opposing danger is that exclusion of such people may result in an “expert body” comprising only the uninformed. In our view, the presumption that an individual who has received financial support for his or her research from a company producing compound C will, ipso facto, be biased in favour of allowing the marketing of compound C is unwarranted. It is based on the depressing and, in our opinion, unsustainable view that experts are more likely to be dishonest than honest. It should not be forgotten that an expert’s credibility and reputation, as judged by his or her peers, depends on the quality of his or her work and advice, and the publication of shoddy work or the provision of biased advice is likely to lead to ostracism. No expert wishes to be ostracised. Frank declaration of financial interests is, of course, necessary but inevitable exclusion on the basis of such interests is, at best, Draconian and, at worst, folly. One way around this is to use specialists who contribute their knowledge but do not take part in drafting the final opinion. Unfortunately, in our view, this is not much used by EFSA.22 

With respect to food additives and contaminants (which include pesticide and veterinary drug residues), substances in water supplies and exposure to industrial chemicals, the general approach to risk assessment is similar, although there are some differences in detail. Risk assessment assumes that no individual benefit arises from exposure or use of the substance and is based primarily on animal studies, although occasionally human experimental data are available. Risk assessment for human pharmaceuticals is somewhat different as it is based upon risk–benefit analysis. Risk–benefit analysis may also be used with respect to public health use of other chemicals, for example the use of insecticides in vector control. Before veterinary products can be authorised in the EU, they must have a positive risk–benefit outcome with respect to patient, consumer, user and environmental safety.

In the air pollution field, reliance is, in general, placed on epidemiological studies rather than on predictions of effects in man based on work in animal models.

Risk assessment has been divided into four stages that concern toxicologists and is associated with a fifth stage that concerns policy-makers (Table 1.1). Some would say a sixth exists – risk communication. This last is particularly important for medicines for human use and for veterinary medicines, where the risk communication is via the product literature and label, but also true, for the same reasons, for biocides, pesticides and industrial chemicals.

Table 1.1

Risk assessment

Hazard identification: Identification of adverse effect(s)
Hazard characterisation: Quantitative evaluation of adverse effect(s)
Exposure assessment: Measurement or estimation of exposure
Risk characterisation: Prediction of the likelihood of effects
(Risk management): Doing something about it: limiting risk

Hazard identification is the identification of adverse health effects associated with exposure to a substance from animal studies, human studies, studies in vitro and studies of structure–activity relationships (SARs). Hazard characterisation includes quantitative evaluation of the adverse effects, by dose–response evaluation, evaluation of mechanisms of action and of species differences in response. Exposure assessment involves consideration of measured or estimated exposure for the population or subgroups thereof (toddlers, children, adults, pregnant women, ethnic groups). Risk characterisation involves consideration of hazard identification, hazard characterisation and exposure assessment in combination to predict whether effects in the species of interest (usually humans) are likely and the severity and nature of such effects. Furthermore, the proportion of the population likely to be affected and the possible existence of vulnerable sub-populations should be considered. Associated with risk assessment is risk management, which comprises the development of policies to mitigate risk.

For many substances, RfDs are calculated from toxicology data, most often obtained from experimental animals. These may be the tolerable daily intake (TDI) for environmental chemicals and, for food chemicals, the acceptable daily intake (ADI), acceptable operator exposure level (AOEL) and sometimes an acute reference dose (ARfD). The TDI is the quantity of a chemical that has been assessed as safe on a daily basis for humans over a lifetime. The ADI is the amount of a substance that can be consumed every day for a lifetime in the practical certainty that, on the basis of all known facts, no harm will result. Often the fact that there are multiple pathways of exposure has been ignored in the calculation of TDIs and ADIs: this is really only scientifically defensible where one pathway of exposure predominates and, increasingly, multiple pathways of exposure are considered, sometimes known as aggregate exposure, a term introduced by the US Food Quality Protection Act.45  Typically, the TDI and ADI are calculated from the critical NOAEL or BMD lower confidence limit (BMDL). This is divided by a UF and, in most cases, that NOAEL/BMDL will be the lowest reference point (RP) in the most sensitive species. However, the RP is chosen on a weight-of-evidence basis and, in some cases, may not be the lowest in the data package, as all relevant information is considered. For example, if the critical RP is from a 90 day rat study, effects in other species (e.g. mouse, dog), other study durations (e.g. 28 d, 2 year) and other comparable studies (e.g. multigeneration) would be considered for consistency and plausibility: plausibility can also be assessed by comparing histopathology and other findings, for example, liver pathology and levels of enzymes reflecting liver function in plasma, or renal pathology and blood urea nitrogen. For pesticides, another concept is also often used: the ARfD. 
This is defined as the amount of a pesticide that can be consumed in a day or in a meal in the practical certainty that, on the basis of all known facts, no harm will result; it is calculated as the critical RP from those studies appropriate for acute risk assessment divided by a UF. The ARfD results from the realisation that ADIs represent a mean intake limit over time. However, some pesticides, e.g. some anticholinesterases, have appreciable acute toxicity, and it would be possible for mean daily intake to be below the ADI while intake on individual days caused acute toxicity. It should be noted that the ARfD can also be used in certain circumstances in the risk assessment of veterinary products (see Chapter 5).
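The arithmetic underlying these reference doses can be sketched as follows. The function name and all numerical values below are purely illustrative and are not taken from any actual assessment.

```python
# Illustrative sketch of reference-dose arithmetic (hypothetical values).
# A reference dose (TDI, ADI or ARfD) is the critical NOAEL (or BMDL)
# divided by a composite uncertainty factor (UF).

def reference_dose(noael_mg_per_kg_bw: float, uncertainty_factor: float) -> float:
    """Return a reference dose in mg per kg body weight (per day for chronic values)."""
    return noael_mg_per_kg_bw / uncertainty_factor

# Hypothetical chronic NOAEL of 50 mg/kg bw/day with the default 100-fold UF:
adi = reference_dose(50.0, 100.0)   # 0.5 mg/kg bw/day

# Hypothetical single-dose NOAEL of 10 mg/kg bw for an ARfD:
arfd = reference_dose(10.0, 100.0)  # 0.1 mg/kg bw
print(adi, arfd)
```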

The NOAEL is the highest dose in a study at which no adverse effect on the animals (or humans) was observed. Some effects observed in animals are not considered adverse/relevant for human risk assessment. The BMDL makes use of all of the dose–response data for a particular toxicological endpoint. The BMDL usually used is the BMDL10 (benchmark dose lower 95% confidence limit, 10%), which is an estimate of the lowest dose that is 95% certain to cause no more than a 10% incidence of the effect in question (there are other definitions: see Kendall and Buckland46 ). The use of the BMDL approach has been recommended by the EFSA47  for substances which are both genotoxic and carcinogenic48  and, inter alia, for other food chemicals. The AOEL is calculated in a similar way to the ADI, but makes allowance for the length of the working day and any use of protective equipment.
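The benchmark-dose idea can be illustrated with a deliberately simple example. Real BMD software fits several candidate models to the full dose–response data set and reports a lower 95% confidence bound (the BMDL10); the sketch below, which assumes a one-hit quantal model with a hypothetical slope parameter, computes only the central BMD10 estimate.

```python
import math

# Minimal sketch of the benchmark dose for quantal data, assuming a one-hit
# model P(d) = p0 + (1 - p0) * (1 - exp(-b * d)), where p0 is background
# incidence and b is a fitted slope.  Extra risk over background is then
# 1 - exp(-b * d), so the BMD10 solves 1 - exp(-b * d) = 0.10 for d.
# This gives the central estimate only, not the lower confidence bound.

def bmd10(b: float) -> float:
    """Dose giving 10% extra risk under the assumed one-hit model."""
    return -math.log(1.0 - 0.10) / b

print(bmd10(0.05))  # hypothetical slope of 0.05 per mg/kg bw/day
```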

UFs have their origin in 19th century engineering. In toxicological terms, the application of the typical 100-fold UF can be attributed to Lehman and Fitzhugh,49  who, in the 1950s, were toxicologists at the US FDA (see also Chapter 5). Of the 100, a factor of 10 was introduced to account for intra-species variability and another of 10 to account for extrapolation from animals to humans. In the case of human experimental studies, only the former factor is necessary. These are default factors intended to take account of the fact that the intra-human variability in response to a substance is rarely known, nor is the relative sensitivity of humans and experimental animals. The problems with extrapolation from animals to humans have been reviewed by, for example, Brown et al.50  and Oesch and Diener.51  Additional data may allow refinement of the UFs: thus, Renwick and colleagues52–54  proposed dividing the two 10-fold UFs into their pharmacokinetic and pharmacodynamic components, envisaging the use of actual data where possible and of the default factors where not (see also Dorne and Renwick55 ). There may be situations where additional factors are required. For example, if no long-term study is available, or where no NOAEL is found and risk assessment has to be based on a lowest-observed-adverse-effect-level (LOAEL), an extra factor of 3–10 is often considered necessary. Certain endpoints in animal studies, such as tumorigenicity not of genotoxic origin, teratogenicity and fetotoxicity unaccompanied by maternal toxicity, may require UFs greater than 100. With non-genotoxic carcinogens, where there is a full genotoxicity data package and the tests are negative, it is usually possible to assume a threshold dose. The more that is known about the mechanism and the commoner the tumour type, the greater the reassurance.
Otherwise, an extra safety factor may be required on the NOEL for the tumours or for their underlying mechanism, if known (not necessarily the same as the overall NOAEL for the study). If that is the case, one calculates both the NOEL for the tumours divided by the enlarged UF and the overall NOAEL for the study divided by the normal UF, and then uses the lower figure. Thus, the methodology assumes that the observed effects have a threshold. For non-threshold effects, such as genotoxic carcinogenicity, with food additives and pesticides, it is simply possible not to authorise their use. Where that is not the case, a decision has to be made on what level of ill-health is to be accepted. It should be noted that the safety factor approach is not generally used in setting air quality guidelines and is often unsuitable for regulating essential components of the diet, particularly trace elements and vitamins. The reason in the latter case is that insistence on a UF of 100 could result in deficiency if the ratio of the lowest toxic dose in an animal study to the minimum daily requirement is less than 100 (this is further discussed below).
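The "take the lower figure" comparison described above can be sketched as follows, using hypothetical doses and factors chosen only for illustration.

```python
# Sketch of the comparison between a tumour-specific NOEL with an enlarged
# UF and the overall study NOAEL with the normal UF: the reference dose is
# the lower of the two quotients.  All values are hypothetical.

def rfd_with_tumour_endpoint(noel_tumour: float, uf_tumour: float,
                             noael_overall: float, uf_normal: float) -> float:
    """Return the more conservative of the two candidate reference doses."""
    return min(noel_tumour / uf_tumour, noael_overall / uf_normal)

# e.g. tumour NOEL of 100 mg/kg bw/day with a 1000-fold UF, versus an
# overall NOAEL of 20 mg/kg bw/day with the default 100-fold UF:
# min(0.1, 0.2) -> the tumour endpoint drives the reference dose.
print(rfd_with_tumour_endpoint(100.0, 1000.0, 20.0, 100.0))
```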

In recent years, there have been a number of developments in risk assessment. In hazard identification/characterisation, there has been, for example, the adoption of protocols for tests for developmental neurotoxicity (DNT) and for endocrine disruptors. In hazard characterisation, there has been consideration of the toxicology of chemical mixtures with respect to food additives and contaminants, particularly pesticides, and the use of BMDLs instead of NOAELs. In exposure assessment, probabilistic measurement of exposure has been used, particularly when undertaking risk assessments of mixtures. Major changes in pesticide regulation have taken place with Plant Protection Products Regulation 1107/2009, which established a combined risk- and hazard-based system.19  There is little experience of such a system in Europe (or indeed the rest of the world), and it is completely out of line with risk assessment as carried out by the Joint Meeting on Pesticide Residues of the Food and Agriculture Organization of the United Nations and the World Health Organization Core Assessment Group (JMPR), or within the North American Free Trade Area (NAFTA).

With substances deliberately applied or added to food, such as genotoxic pesticides or proposed food additives for which it is considered that there is no threshold for effects, a simple method of risk management is to prohibit use. The same approach may also be taken in other situations, e.g. occupational exposure.

The ultimate aim of risk assessment is an acceptable level of safety. This is often ensured by setting standards. In food toxicology, these may be MRLs. The aim of these is to ensure that RfDs are not exceeded, and MRLs are normally safety-based limits. It is noteworthy that this is not the case with pesticides, where the primary function of MRLs is to ensure that pesticides are being applied to crops in the authorised manner. However, pesticidal MRLs must be compatible with consumption not exceeding the RfDs (usually there is a considerable margin of safety). In the case of veterinary medicines used in food producing animals, MRLs are set for food products, consistent with human safety, but a risk–benefit analysis is used with respect to the animals concerned for other aspects of regulation, while user safety and environmental safety are also considered (see Chapter 5).

As discussed above, it is sometimes possible to prohibit use. However, with contaminants, this is not always possible. With such substances, an as-low-as-reasonably-achievable (ALARA) approach is generally adopted. For example, the EFSA Panel on Contaminants in the Food Chain (CONTAM) stated that this approach should be adopted for aflatoxins, but the panel also concluded that public health would not be adversely affected by increasing the levels for total aflatoxins from 4 µg kg−1 to 10 µg kg−1 for all tree nuts.56  The difficulty with aflatoxins is that, short of eliminating many much-loved items from the diet, it is not possible to eliminate aflatoxin intake completely. Other approaches, such as calculation of a margin of exposure (MOE) from the BMDL10, have been proposed by EFSA.47,57  The MOE enables comparison of the risks posed by different genotoxic and carcinogenic substances. Sometimes, it is adjudged that there are simply insufficient data to produce a TDI. This is the case with amnesic shellfish poisoning, where EFSA decided that there were insufficient data to establish a TDI for domoic acid.58  Canada, where the initial outbreak occurred, has established an action limit of 20 µg g−1 of wet weight tissue.59  In such cases, ALARA is adopted faute de mieux.
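The MOE calculation itself is simple division; the sketch below uses hypothetical values. For context, EFSA has indicated that an MOE of 10 000 or more, based on a BMDL10 from animal data, would generally be of low concern for a genotoxic carcinogen, though the interpretation always depends on the case at hand.

```python
# Margin of exposure (MOE) sketch: the BMDL10 divided by the estimated
# human exposure, both in the same units (e.g. mg/kg bw/day).
# The values below are hypothetical.

def margin_of_exposure(bmdl10: float, exposure: float) -> float:
    """Return the dimensionless margin of exposure."""
    return bmdl10 / exposure

# Hypothetical BMDL10 of 0.5 mg/kg bw/day and estimated dietary exposure
# of 0.00001 mg/kg bw/day gives an MOE of about 50 000.
print(margin_of_exposure(0.5, 0.00001))
```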

In some cases, standards are set at the limit of quantification (LOQ) of the analytical assay used for their determination. This is the case with pesticides in the water supply, where the European Community Drinking Water Directive (80/778/EEC)60  established maximum allowable concentrations (MACs) in drinking water of 0.5 µg l−1 for total pesticides and 0.1 µg l−1 for each individual pesticide, the latter being equivalent to the detection limits of the analytical methodology available at the time. Although neither of these MACs is in any sense toxicologically based, they have been retained in subsequent EU legislation.61 

For essential components of the diet that are toxic in excess, e.g. many vitamins, metals such as iron, copper and cobalt, ranges of daily intake that are within acceptable limits are used rather than TDIs/ADIs. The addition of vitamins and some other substances to food in the EU is controlled under Regulation 1925/2006 of the European Parliament and of the Council.62 

One of the most remarkable findings in the air pollution field is that the effects on health of many air pollutants, as revealed by epidemiological studies, are not characterised by thresholds. Of course, it is not possible to be sure of effects at concentrations lower than those recorded in the areas studied, but associations with very low concentrations have been reported, and the majority of statistical models do not suggest the presence of a threshold of effect. Non-threshold effects are well known to toxicologists but have, in general, been thought to be limited to genotoxic effects. It is important to remember that epidemiological studies in the air pollution field relate ambient concentrations to effects. The concentrations tend to be monitored at one or perhaps a few monitoring sites in an area perhaps as large as a city. Such measures of concentration may well represent a sort of average of the concentrations to which individuals are exposed, but some individuals may well be exposed to concentrations higher than those monitored. This, combined with what may be a wide range of sensitivities across a large population, may explain the lack of observed thresholds at a population scale.

The World Health Organization, in the Air Quality Guidelines for Europe of 1987,63  adopted a no-threshold assumption only when dealing with genotoxic carcinogens: for these, risks were expressed as Unit Risk Factors, these being the excess risk associated with lifetime exposure to a unit concentration of the pollutant in question. For other air pollutants, guidelines based on apparent thresholds were recommended; these included what was perceived to be a margin of safety. Few of the current mass of time-series studies, and none of the cohort studies referred to above, had been published by 1987. By the time the Guidelines were revised for their second edition (2000),64  a move away from conventional guidelines towards the use of coefficients (gradients or slope factors) as guidelines was occurring. In the EU, WHO Air Quality Guidelines have been adopted as standards (expressed as Limit Values), and Member States run the risk of penalties being imposed if these are not met (see also Chapter 12). Setting standards for air pollutants presents difficult problems. Current thinking suggests that effects occur at current ambient concentrations, indeed at concentrations lower than ambient. Thus, there is no possibility of setting a standard which includes a generous margin of safety. Standards are inevitably interpreted by the public as safe levels: this is clearly inappropriate with regard to air pollutants.
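The Unit Risk Factor approach amounts to a linear, no-threshold extrapolation, which can be sketched as below. The unit risk value used is hypothetical, not taken from the WHO Guidelines.

```python
# Sketch of the unit-risk calculation: under a linear no-threshold
# assumption, excess lifetime risk is approximated as the Unit Risk Factor
# (excess risk per unit concentration, per µg/m3) multiplied by the
# lifetime average ambient concentration.  The URF here is hypothetical.

def excess_lifetime_risk(unit_risk_per_ug_m3: float,
                         concentration_ug_m3: float) -> float:
    """Return the estimated excess lifetime risk (dimensionless)."""
    return unit_risk_per_ug_m3 * concentration_ug_m3

# Hypothetical URF of 1e-6 per µg/m3 at a lifetime average of 5 µg/m3:
# roughly 5 excess cases per million people exposed.
print(excess_lifetime_risk(1e-6, 5.0))
```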

This may be separate from risk assessment or done by the same organisation, but the trend is to separate the processes. For substances regulated under prior approval systems, the product may simply be withdrawn if considered unsafe; with pesticides, the conditions of use may be changed, or the product may be forbidden for amateur use. With food and drinking water contaminants, maximum allowable concentrations may be set. With drugs, which are divided into general sales list medicines (GSL), pharmacy medicines (P) and prescription only medicines (POM), the product can be recategorised.

It should be noted that regulatory toxicology is protective not predictive. The UFs used are large because the relative sensitivity of animals and humans and the range of sensitivity of humans are rarely known. Where safety standards are exceeded, it is rarely possible to say what the effect will be: indeed, often the effect will be nothing.

We live in an age of public concern about the effects of chemicals on health. Research shows that the public are right to be concerned. It follows that the public have a right to expect their elected representatives to put in place measures that will limit deliberate and inadvertent exposure to chemicals to levels that are unlikely to be of harm to health. This has been done in most countries: indeed, an industry concerned with the regulation of chemicals has developed. The objective of this regulatory industry can be summed up in one word: safety. But safety is more difficult to define than might be thought, and all levels of safety carry costs. These costs include the obvious costs of testing chemicals and operating regulatory agencies and their committees, the costs to industry of implementing decisions and the hidden costs of compounds of potential value being discarded at an early stage of development due to safety concerns. Also to be considered are the costs that industry has to bear; to these must be added regulatory fees and the costs of life-cycle management of the products in question, e.g. pharmacovigilance. These costs represent the price the public pay for safety.

As regulations proliferate and tighten, as higher standards of safety are demanded, we should examine, ever more carefully, the scientific basis of these regulations and standards. This examination involves more than an exploration of the methods used in toxicity testing; it involves considerations of the philosophy of testing and, indeed, of the concept of safety. This exploration is aided by examination of the different approaches taken by regulators working in different fields. We should ask for their reasons, for the arguments that underlie their decisions. It is to be hoped that their reasons are good ones and based firmly on science, and not on habit or tradition or developed as a “knee-jerk” response to concern or disaster. In this book, a number of experts have explored these issues with regard to their own fields of expertise. As such, it is, we think, a useful contribution to the debate. But this is not a field in which a final answer should be expected: science can help only to a limited extent, and the perceptions of the public, be they informed or uninformed, will play a large part in the development of regulatory policies, both within the European Union, the focus of this book, and more widely.

Our thanks are due to Dr Lisa Passot and Dr Martin Wilks for information on regulation in France and Germany, respectively.
