Chapter 1: Computers as Scientists
Published: 15 Jul 2020
The role of computers in science has changed dramatically because of the increase in computational power, accessible platforms for data storage and use, and the development of artificial intelligence and machine learning. This chapter addresses a number of important questions regarding the role of computers in science and presents some relevant examples.
1.1 What Is Computational Science?
Computers excel at many tasks that humans are very bad at (see Figure 1.1). For example, if you have a smartphone with a voice assistant, try asking it what the square root of 32 761 is. For the phone, this problem is straightforward, and it almost immediately returns the correct answer of 181. If instead we asked a person the same question, they would probably find it very challenging; if they got the correct answer as quickly as the phone did, we would think them a genius (or, perhaps more likely, that they were cheating!). Now, for comparison, try to engage your voice assistant in a knock-knock joke: it doesn't work. In contrast to the mathematical challenge, the machine finds this social interaction impossibly complicated, yet we would think a person a simpleton if they were unable to take part in it. Children learn how to make these jokes before they learn to add. This illustrates why computers make such good scientific companions: they help scientists to be better at the things they are naturally much worse at.
Computational science is a scientific field that uses computational methods to answer scientific questions. Computational scientists are most often trying to answer questions in fields such as chemistry, physics, or biology using algorithms developed by computer scientists. This is, in some respects, analogous to an experimental chemist who uses a reaction to make a molecule in the lab, as opposed to the one who first proposed that reaction, or to a scientist who uses an NMR machine to determine the chemical composition of a biological sample following a procedure devised by an NMR expert. This is not to say that the scientists deploying these techniques, whether algorithmic, experimental, or otherwise, are any less important than their discoverers. The most impactful scientific techniques are those with long-term applications across multiple fields.
The use of artificial intelligence (AI) and machine learning (ML) algorithms, as described later in this book, could be considered computational science, but chemistry has a long track record in other computational methods too. Numerous computational techniques do not require AI but rather use the power of computers to solve mathematical equations. Many of the most complex problems are in quantum mechanics.
The field of quantum mechanics is just over 100 years old. In 1900, Max Planck1 proposed that energy is absorbed and radiated in discrete quanta (or packets of energy). Albert Einstein2 used quantum theory to explain the photoelectric effect in 1905, which led to the theory that light acts as both a wave and a particle. Louis de Broglie3 extended wave–particle duality to include all types of matter, including, importantly, electrons. Once it was accepted that electrons behave as waves, the hunt began for a way of describing these waves mathematically. A mathematical model was proposed by Erwin Schrödinger4 in 1925, and it is shown below in its time-independent form:

ĤΨ = EΨ
where Ĥ is the Hamiltonian operator, Ψ is the wavefunction, and E is a constant equal to the energy of the system. The mathematics of this equation is far beyond the scope of this chapter, but it is worth noting that finding solutions to the Schrödinger equation is not trivial for any system. At present, the equation cannot be solved exactly for species with multiple electrons. However, with suitable approximations and high-powered computers, approximate probability distributions can be found, which tell us where electrons are (approximately, since for quantum systems it is impossible to know exactly where an electron is) and their associated energy levels. Such details are useful in chemistry to help explain how, and why, reactions happen.
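To make this concrete, here is a toy numerical illustration (not from the chapter): for a single particle in a one-dimensional box, the time-independent Schrödinger equation can be discretised on a grid, turning ĤΨ = EΨ into a matrix eigenvalue problem that a computer solves directly. Atomic units (ħ = m = 1) and the grid size are assumptions made for the sketch.

```python
import numpy as np

# Discretise H = -(1/2) d^2/dx^2 for a particle in a 1D box of length L,
# using finite differences; the eigenvalues approximate the energy levels.
n = 200                       # number of interior grid points
L = 1.0                       # box length (atomic units assumed)
dx = L / (n + 1)

# Tridiagonal matrix: -(1/2)(psi[i+1] - 2 psi[i] + psi[i-1]) / dx^2
H = (np.diag(np.full(n, 1.0 / dx**2))
     - np.diag(np.full(n - 1, 0.5 / dx**2), k=1)
     - np.diag(np.full(n - 1, 0.5 / dx**2), k=-1))

energies = np.linalg.eigvalsh(H)[:3]   # three lowest energy levels
# The analytic levels are E_k = k^2 * pi^2 / 2, so the grid result for the
# ground state should be close to pi^2 / 2 ~ 4.93
print(energies)
```

Even this simplest of systems already needs a 200×200 matrix; for real molecules with many interacting electrons, the approximations and the computational cost grow enormously, which is why high-powered computers are essential.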
Because of the wealth of detail that quantum calculations provide, the use of computational calculations to explain, predict and even mimic chemical reactions is an extremely important field. Ken Houk's group at UCLA, for example, has been exploring asymmetric chemical catalysis for many years using computational calculations.5 Such reactions are crucial in the manufacture of molecules with chiral centres, many of which have important biological properties. Chemicals can have handedness, or chirality, which can have important biological consequences, as some molecules fit into spaces that others cannot. To manufacture such molecules, we must consider which product(s) might be generated by a chemical reaction that could follow several different pathways. Computers can help us predict why certain reactions are more favourable. To do this, we need, among other things, to calculate the energy barrier between the reactants and the products of a reaction. Computers can consider the shapes of molecules in 3D, their interactions with one another, and the energies of these systems, allowing the calculation of these so-called activation energies (see Figure 1.2).
From among all the chemically reasonable routes from reactants to products, how can we select the one with the lowest activation energy? This low-energy pathway will often dominate the reaction, and its product should therefore be the one that is experimentally observed. Calculated activation barriers can be considered alongside the experimental observations to evaluate the hypothesis and gain insight: does what the calculations predict match what is observed? If so, the model can provide useful insight into the chemistry by supplying information about the structure and dynamics of the transition state as the system moves from reactants to products. Is there, perhaps, an unfavourable interaction between two large groups in the unfavoured pathway that raises its energy? Or a favourable electronic interaction in the favoured pathway that lowers the barrier? Such insights can subsequently be used to design new and more efficient reactions.
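Why does the lowest barrier dominate? Reaction rates depend exponentially on the activation energy, as captured by the Arrhenius equation, k = A·exp(−Ea/RT). The following sketch (with invented barrier heights, not data from any real reaction) shows how even a modest difference in Ea translates into a large difference in rate.

```python
import math

R = 8.314          # gas constant, J mol^-1 K^-1
T = 298.0          # room temperature, K

def rate_constant(ea_kj_mol, a=1.0):
    """Relative Arrhenius rate constant for a barrier of ea_kj_mol (kJ/mol)."""
    return a * math.exp(-ea_kj_mol * 1000.0 / (R * T))

# Two hypothetical pathways differing by only 10 kJ/mol in activation energy
pathways = {"pathway A": 75.0, "pathway B": 85.0}
k = {name: rate_constant(ea) for name, ea in pathways.items()}

ratio = k["pathway A"] / k["pathway B"]
# The 10 kJ/mol lower barrier makes pathway A roughly 50-60x faster,
# so its product dominates the observed outcome
print(ratio)
```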
1.2 What Is Artificial Intelligence?
Defining AI is surprisingly challenging. When trying to do so, I often think of the idea of a generally intelligent machine (a machine able to do a wide range of different tasks, as a human does) and would therefore define an intelligent machine as “a machine capable of doing things that humans consider to be intelligent”.
Although this may not be entirely satisfactory, it sums up the goal of AI science quite well, as humans are the most intelligent species on Earth. It also aligns with the well-known Turing test, developed by Alan Turing in 1950. The standard version of this test requires a human interrogator to converse with, and subsequently judge, two other hidden players, one of whom is another human and one a machine. If, after having conversations with both players, the interrogator cannot distinguish between the machine and the human, the machine is deemed to have passed the test and can therefore be considered “intelligent” (within the boundaries of Turing's test, of course) (see Figure 1.3).6 In this context, the test is often referred to as “the imitation game”.
Some tasks that we might consider require a degree of intelligence include speech and vision, scientific and medical decision making, and strategic gameplay. We will consider some examples of these tasks in the rest of this section.
AI has been a pursuit of scientists since at least the 1950s. During that decade, there were three important meetings: a “Session on Learning Machines” in Los Angeles in 1955, a “Summer Research Project on Artificial Intelligence” at Dartmouth College in 1956, and a symposium on the “Mechanization of Thought Processes” at the National Physical Laboratory in the United Kingdom.6 These early meetings tackled many of the problems that still occupy AI researchers today, including pattern recognition,7 understanding of language,8 and playing chess.9 Other, more mechanistic, issues were also discussed, including the imitation of the human brain and central nervous system as the basis for learning machines,7 the use of iterative learning to improve computer performance in learning tasks,10 and the realisation that powerful machines would be required to address many of these challenging topics.
Let us consider again the definition of AI given earlier. What is it about humans that makes them intelligent? And does answering this help us understand what kinds of things machines need to be able to do to also be considered intelligent?
Humans sense the world around them through vision. They explore the world by moving around it. They can speak to one another and recognise incoming speech. They can also recognise patterns in objects they encounter and events they experience. These goals are all currently being explored in AI, in the fields of computer vision, robotics, speech recognition, natural language processing, and pattern recognition. Let's cover some examples from the history of AI in these different research areas.
In the late 1950s and early 1960s, a number of projects attempted to use pattern recognition algorithms to aid in the identification of target objects in aerial photographs (aerial reconnaissance). Laveen N. Kanal, Neil C. Randall, and Thomas Harley at the Philco Corporation, for example, attempted to screen aerial photographs for military tanks.6,11 A small section of the film was processed to enhance any edges, and the result was presented to the target detection system as a 32×32 array of 1's and 0's. This array was segmented into 24 overlapping 8×8 “feature blocks”, each of which was then evaluated using a statistical test to establish whether the block contained part of a tank.
These tests assessed 50 photos containing tanks and 50 without. This allowed a statistical model to be prepared that discriminates tank-containing feature blocks from non-tank-containing ones using a linear boundary in 64 dimensions. A score was then calculated for each image by counting how many of its 24 feature blocks suggested the presence of a tank. This can be considered analogous to how a random forest12 model works (see Figure 1.4).
The performance of the model on the test images was excellent. Every image containing tanks had a score of at least 11, while all images without a tank had a score of 7 or less. In addition, half of the test images were perfectly classified, i.e. a score of 24 for an image containing a tank or a score of 0 for a non-tank image. Given the technological limits of the era (this work was not programmed on a computer but used analogue circuitry), this is quite remarkable.
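The block-and-vote scheme is easy to sketch in modern code. The following is a schematic reconstruction only: the weights are random stand-ins for a trained model, the block layout is a simplification of the original 24-block arrangement, and the real system was built from analogue circuitry rather than software.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_blocks(image, size=8, stride=4):
    """Yield overlapping size x size blocks (a stand-in for the 24-block layout)."""
    for i in range(0, image.shape[0] - size + 1, stride):
        for j in range(0, image.shape[1] - size + 1, stride):
            yield image[i:i + size, j:j + size]

def image_score(image, weights, bias):
    """Count how many blocks a linear test in 64 dimensions flags as 'tank-like'."""
    votes = 0
    for block in feature_blocks(image):
        if block.ravel() @ weights + bias > 0:   # linear boundary in 64-D
            votes += 1
    return votes

weights = rng.normal(size=64)                    # hypothetical trained weights
image = rng.integers(0, 2, size=(32, 32))        # a 32x32 array of 1's and 0's
score = image_score(image, weights, bias=0.0)
print(score)
```

Each block casts an independent vote, and the image-level decision comes from the vote count, which is the sense in which the scheme anticipates ensemble methods like random forests.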
Another early project in pattern recognition and computer vision concerned human face recognition. Woodrow W. Bledsoe, Charles Bisson, and Helen Chan developed techniques for facial recognition at Panoramic Research in the 1960s.6,13 In this early work, a human operator extracted the coordinates of facial features, such as the position of the pupils, from photographs. From these coordinates, a set of 20 distances was calculated, such as the width of eyes, and these distances were then stored on a computer to be compared to the same distances in previously unseen photos during the recognition phase. The closest records were returned, using a nearest neighbour14 calculation to make a prediction (see Figure 1.5).
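The matching step in this early system amounts to a nearest-neighbour search over vectors of inter-feature distances. Here is a toy sketch of the idea, with invented names and only three measurements instead of the original twenty.

```python
import math

# Hypothetical database: each stored face is a vector of inter-feature
# distances (e.g. eye width, pupil separation), measured from a photograph.
database = {
    "person_a": [3.1, 5.0, 2.2],
    "person_b": [4.0, 4.1, 3.3],
    "person_c": [2.5, 6.2, 1.9],
}

def nearest_face(query):
    """Return the name of the stored face closest in Euclidean distance."""
    return min(database, key=lambda name: math.dist(database[name], query))

# A new photo yields measurements closest to person_a's stored record
print(nearest_face([3.0, 5.1, 2.0]))   # -> person_a
```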
In 1970, this work was taken further by M.D. Kelly during his PhD at Stanford.6,15 Kelly wrote a computer program that could detect facial features by itself for the identification of people, enormously reducing the manual workload required.
Subsequent work on facial recognition software progressed rapidly. A 2007 National Institute of Standards and Technology report showed the results of testing seven state-of-the-art face recognition algorithms against humans under different lighting conditions.6,16 While all the algorithms performed admirably in a number of cases, three of them outperformed the humans in all the test cases. The best facial recognition methods use ML algorithms trained on extremely large data sets.
One goal of true AI is to produce a machine that can understand written and spoken language. Such a machine would approach, and perhaps pass, the Turing test. Garnering meaning from words and sentences, as opposed to merely being able to recognise their occurrence, is an extreme challenge – as those who are talented enough to have learned a second language will attest. Consider the simple sentence:
“I never said he stole my money”
Depending on the emphasis, this sentence can have seven different meanings:
I never said he stole my money – Somebody else said it.
I never said he stole my money – I didn't say it.
I never said he stole my money – I implied it.
I never said he stole my money – Somebody did, maybe him.
I never said he stole my money – It was only borrowed.
I never said he stole my money – It was stolen from someone else.
I never said he stole my money – He stole something else.
And this is even before we get to truly crazy sentences such as “Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo” using three meanings of the word buffalo to construct a grammatically allowed sentence from eight repeats of the same word!
To assist computers in this task, sentences are broken into parse trees – diagrams that represent sentences and their structure.6 As some sentences have more than one meaning, they also have more than one parse tree. A key breakthrough here was to use probabilities to identify which meaning is most likely in the given context. This, in turn, allows rules to be established as to which combinations of words are allowed in which orders, and hence which meanings are most likely for a given sentence. The huge number of rules that would need coding makes writing such a rulebook by hand impossible. Luckily, the vast array of written words catalogued through human history makes this a perfect task for the algorithms covered in the next section – ML.
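The probabilistic idea can be illustrated with a deliberately tiny example (the sentence, readings, and probabilities are all invented for illustration): each parse tree is scored by multiplying the probabilities of the grammar rules it uses, and the highest-scoring parse is chosen as the most likely meaning.

```python
# "I saw the man with the telescope" has two parse trees, depending on
# whether "with the telescope" attaches to the verb (I used a telescope)
# or to "the man" (the man has a telescope). Each reading is listed with
# the probabilities of the rules its parse tree uses.
parses = {
    "instrument reading (saw using a telescope)": [0.6, 0.5, 0.7],
    "attachment reading (man who has a telescope)": [0.6, 0.3, 0.7],
}

def parse_probability(rule_probs):
    """Multiply the rule probabilities along one parse tree."""
    p = 1.0
    for prob in rule_probs:
        p *= prob
    return p

best = max(parses, key=lambda reading: parse_probability(parses[reading]))
print(best)   # the higher-probability reading wins
```

In a real system, those rule probabilities are not hand-written: they are learned from large text corpora, which is exactly the kind of task ML handles well.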
1.3 What Is Machine Learning?
ML is a subcategory within the field of AI and specifically pattern recognition. ML algorithms are computer programs that, through the inspection of a data bank, and without explicit instruction from a human programmer, learn to carry out specific tasks. What kinds of questions can these algorithms answer?
Many modern AIs, including ML algorithms, require a large amount of data for their training. They also require well-defined questions, as they do not currently have the capacity for independent thought. This will be further explored in the next section, but I want to delve into a couple of examples from the world of ML to set the scene for this.
Firstly, I will discuss some of my own research, which includes the development of ML algorithms. Chemicals can have a range of effects in the human body. Some of these are desirable (such as the therapeutic effect of a medicine), whereas others are not. Undesirable effects are investigated in toxicology studies, which are required before a new drug can be prescribed, a new hair shampoo released, or a new pesticide used on crops. The primary aim of these studies is to ensure that the new chemicals are safe for humans and the environment, taking into account the amount of chemical that is used (the dose or exposure). Traditionally, toxicology studies were conducted using animal experiments, and an estimated safe dose for humans was extrapolated from these tests. However, animal testing is expensive, time-consuming, and unethical, and humans and animals have biological differences that can affect the results. Consequently, there is now a major drive in toxicology towards methods that do not require animals; computational toxicology is one of these areas.
In my research, I aim to understand how the interaction of chemicals with proteins and enzymes in the human body might lead to a toxic outcome. Using computer models, new chemicals can be assessed before any experimentation is carried out. This mechanistic toxicology approach, based on understanding if and how interactions happen that lead to toxic outcomes, is an effective approach to toxicological problems. These rapid and cost-effective computational results can be followed up with targeted in vitro experimental assays on cells to provide further evidence. By making use of advanced new computational techniques, including ML, toxicology can become a more efficient, ethical, and reliable science.
To construct any kind of computational model, experimental data are required. Luckily, numerous biological experiments on cells have been conducted, and their results widely disseminated. Databases such as ChEMBL (https://www.ebi.ac.uk/chembl/)17 and ToxCast (https://www.epa.gov/chemical-research/toxicity-forecasting)18 collate much of these data and are openly accessible. These data can provide the basis for models.
Initial modelling approaches included the use of structural alerts: coded fragments of molecules that are found more commonly in experimentally active molecules than in inactive ones.19,20 While this procedure is relatively simple, it is popular among toxicologists as it allows them to easily identify the parts of a molecule that have caused it to be considered active by the computer. The procedure is extremely transparent. Furthermore, toxicologists can also look at other molecules containing the fragment in a process known as read-across, in which the experimental results for an existing chemical are used for a new, untested chemical when an argument can be made that the two molecules are sufficiently similar. In the structural alert approach, a computer program is used to identify the common chemistry in the training data to be built into a model. If a new chemical contains a known active chemical feature, it is predicted as active by the algorithm (see Figure 1.6).
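The structural-alert idea can be sketched as follows. This is a deliberately minimal illustration: the alerts and molecules are invented, and real implementations match SMARTS patterns with a cheminformatics toolkit such as RDKit rather than doing raw string matching on SMILES.

```python
# Hypothetical alert fragments, written here as SMILES substrings
alerts = ["N(=O)=O", "C(=O)Cl"]   # e.g. nitro and acid-chloride motifs

def matching_alerts(smiles):
    """Return the alerts found in the molecule; an empty list means 'inactive'."""
    return [alert for alert in alerts if alert in smiles]

print(matching_alerts("c1ccccc1N(=O)=O"))   # nitrobenzene: the nitro alert fires
print(matching_alerts("CCO"))               # ethanol: no alert, predicted inactive
```

Because the prediction is simply "this fragment was found", the toxicologist can see exactly which part of the molecule triggered the call, which is what makes the approach so transparent and so well suited to read-across.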
In a second modelling approach, quantum chemistry calculations were used, as discussed earlier in the computational science section. Data were obtained for the Ames mutagenicity assay, and these were modelled by considering the activation energy for the reaction between the chemical toxicants and methylamine.21 Methylamine acts as a surrogate for DNA in the calculation, as the Ames assay measures mutagenicity, and one way a chemical can be mutagenic is via a direct chemical reaction with DNA. Several experimentally tested chemicals were considered, and their activation energies were calculated computationally. The results showed a good correlation with the experimental results – with higher activation energies being calculated for Ames negative compounds and lower ones for Ames positive compounds (see Figure 1.7).
This is as expected: the Ames negative compounds have high activation energies and hence cannot react with DNA to cause mutations under normal biological conditions. The research allowed us to establish activation energy thresholds for chemicals that were likely to be Ames positive, negative, and unknown (in the middle, uncertain area) that can be used in the future to classify new chemicals.
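A threshold scheme of this kind is straightforward to express in code. The cut-off values below are placeholders for illustration, not the thresholds from the published study.

```python
# Hypothetical activation-energy thresholds, in kJ/mol
LOW, HIGH = 80.0, 120.0

def classify_ames(activation_energy):
    """Map a calculated activation energy to an Ames call."""
    if activation_energy < LOW:
        return "likely Ames positive"    # low barrier: reacts readily with DNA
    if activation_energy > HIGH:
        return "likely Ames negative"    # barrier too high under biological conditions
    return "uncertain"                   # middle band: no confident call

print(classify_ames(60.0))
print(classify_ames(150.0))
print(classify_ames(100.0))
```

Keeping an explicit "uncertain" band in the middle is an honest design choice: it stops the model from forcing a confident call on chemicals whose calculated barriers sit near the decision boundary.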
ML neural networks have also been used. This work involved the TensorFlow library in Python 3 to classify molecules in a similar fashion to structural alerts (see Figure 1.8).
Open-source receptor-binding data are used to train neural networks to classify molecules as binders or nonbinders. These networks are extremely powerful predictors, showing a high level of accuracy and outperforming the structural alerts. The networks can also provide an estimate of how well a new molecule matches what they know to be a binder or a nonbinder, which can be treated as a measure of confidence in the prediction.
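The published models used TensorFlow; as a library-free sketch of the same idea, here is a tiny feed-forward network in plain numpy whose sigmoid output serves both as a binder/nonbinder prediction and as a confidence estimate. The weights are random stand-ins for trained values, and the layer sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random weights standing in for a trained model
W1 = rng.normal(size=(16, 8))   # 16 molecular descriptors -> 8 hidden units
W2 = rng.normal(size=(8, 1))    # hidden units -> single output

def predict(features):
    hidden = np.tanh(features @ W1)
    p = sigmoid(hidden @ W2)[0]          # probability the molecule binds
    label = "binder" if p >= 0.5 else "nonbinder"
    confidence = max(p, 1.0 - p)         # distance from the 50/50 boundary
    return label, confidence

label, confidence = predict(rng.normal(size=16))
print(label, round(float(confidence), 3))
```

Reading the output probability as a confidence score is what lets the network say not just "binder" but "binder, and this molecule looks a lot like the binders I was trained on".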
A disadvantage of using ML algorithms in toxicology is the difficulty in understanding why a specific prediction has been made. Unlike structural alerts, a trained neural network does not point out the parts of the new molecule that make it active. This is a challenge we have been looking to overcome using similarity calculations and examining which molecules the network “thinks” about in the same way, to identify chemical analogues during the prediction process.
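One common way to find such analogues is a fingerprint similarity calculation; the sketch below uses the Tanimoto coefficient, |A ∩ B| / |A ∪ B|, on invented feature sets (real fingerprints are bit vectors computed by a cheminformatics toolkit).

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints, represented as sets of bits."""
    return len(fp_a & fp_b) / len(fp_a | fp_b)

query     = {1, 4, 7, 9, 12}        # hypothetical fingerprint of the new molecule
analogue  = {1, 4, 7, 9, 15}        # shares most structural features
unrelated = {2, 3, 20}              # shares none

print(tanimoto(query, analogue))    # 4 shared of 6 total features
print(tanimoto(query, unrelated))
```

Reporting the most similar training molecules alongside a prediction gives the toxicologist something a bare neural-network score cannot: concrete analogues to reason about, much as read-across does for structural alerts.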
A second, exciting ML example dates from the beginning of 2019.22,23 DeepMind, a leading AI research lab, and Blizzard Entertainment, a computer game development company, released a video demonstration on YouTube of AlphaStar, an AI, battling against professional human gamers in the game StarCraft II. DeepMind already had experience in building AIs to play games against humans in both Chess and Go, but StarCraft II is an entirely different challenge and shows how powerful AIs have become, and how much scope there is for them in the future.
Why is StarCraft II a more complex game than Chess or Go? Well, StarCraft II is a real-time strategy (RTS) game, meaning that players do not take turns, but are instead expected to strategically outwit their opponents and execute their moves in real time. In addition, StarCraft II is more complex than Go or Chess, with considerably more potential moves available at any given time and more moves to be made per game. Finally, the players have imperfect information, as they cannot directly observe the whole game area at all times, so it is very challenging to analyse what other players might be doing. This is obviously very different from Go and Chess where the players can always observe the whole board. As a result of these factors, success in an RTS game can be seen as a step towards a more general AI, which is able to solve a variety of problems like a human.
It is worth noting for context that in this demonstration, StarCraft II was slightly simplified to assist the training and operation of the AlphaStar algorithm. Only a single game map was considered (whereas human players are expected to play across a variety of different maps) and only a single faction, Protoss, was used (whereas RTS games tend to have a number of different factions to play as, each with its own strengths, weaknesses, and suitable strategies). This made the game somewhat easier for AlphaStar than the full StarCraft II experience that human professionals must master. That said, AlphaStar was also handicapped in the demonstration. A computer might be expected to have an advantage over a human player by making faster moves during the game: unlike a human player, AlphaStar does not have to click or press buttons to execute its commands. DeepMind therefore limited AlphaStar to an actions-per-minute rate lower than that of a typical StarCraft II professional player.
AlphaStar itself is a neural network ML algorithm. The algorithm observes the game space, with its imperfect information, in a manner analogous to a human player and then decides what moves to make. To act on what it perceives, AlphaStar uses a neural network called a long short-term memory (LSTM)24 network to make decisions (see Figure 1.9).
LSTMs are ideal for tasks in which decision-making changes with time, as they allow AlphaStar to remember its previous decisions and thought processes and let them influence its current move. The LSTM then decides to execute an action at a specific location, e.g. to build units or attack the opponent. Finally, AlphaStar considers how the overall game is going, in the form of a win prediction, to feed into its strategy: if AlphaStar thinks it is winning, for example, it is more likely to attack and try to finish off the opponent. This is one of the most difficult judgements for a human player to make, and it is very impressive that AlphaStar is able to consider it.
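To show what "remembering between steps" means mechanically, here is a minimal numpy sketch of a single LSTM cell step (AlphaStar's real network is vastly larger and trained, whereas these weights are random stand-ins). The cell state c carries memory from step to step, and the input, forget, and output gates control what is stored, kept, and emitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hidden = 4, 3

# One weight matrix per gate plus the candidate update, random stand-ins
W = {g: rng.normal(scale=0.1, size=(n_in + n_hidden, n_hidden))
     for g in ("input", "forget", "output", "candidate")}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c):
    z = np.concatenate([x, h])               # current input + previous output
    i = sigmoid(z @ W["input"])              # what new information to store
    f = sigmoid(z @ W["forget"])             # what old memory to keep
    o = sigmoid(z @ W["output"])             # what to reveal this step
    c_new = f * c + i * np.tanh(z @ W["candidate"])
    h_new = o * np.tanh(c_new)
    return h_new, c_new

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for step in range(5):                        # memory accumulates across steps
    h, c = lstm_step(rng.normal(size=n_in), h, c)
print(h)
```

Because each step's output depends on the carried-over cell state, earlier observations shape later decisions, which is exactly the property a real-time game with imperfect information demands.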
So, before the demonstration, AlphaStar needed to be trained to play StarCraft II. Its training began with an initial setup phase in which human-versus-human games were used to extract the basic moves of StarCraft for AlphaStar to imitate. Once the algorithm had these basic moves, it became the basis for the AlphaStar League, an internal competition between different algorithms to establish the strongest ones. In each iteration of the League, the strongest algorithms were kept and slightly altered, and the weakest discarded (the sort of approach used in the AI technique known as genetic algorithms). This allowed the algorithms to discover new strategies and attempt a variety of different tactics. Over many iterations, the best algorithms win out, and these were the ones that played against the professional players in the demonstration. Through the seven days of training AlphaStar was put through in the AlphaStar League, it was estimated to have played the equivalent of 200 years of StarCraft II for a human professional!
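The League's keep-the-strongest-and-mutate loop can be sketched as a toy evolutionary algorithm. Everything here is invented for illustration: real League agents play full games of StarCraft II against one another, whereas this stand-in "fitness" just measures how close an agent's parameters are to an arbitrary optimum.

```python
import random

random.seed(0)

def fitness(agent):
    """Stand-in for win rate: closeness of the agent's parameters to 0.5."""
    return -sum((p - 0.5) ** 2 for p in agent)

def mutate(agent):
    """Slightly alter a surviving agent, as the League does between iterations."""
    return [p + random.gauss(0, 0.05) for p in agent]

# Start from a population of random agents
population = [[random.random() for _ in range(4)] for _ in range(20)]

for generation in range(30):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]                               # keep the strongest half
    population = survivors + [mutate(a) for a in survivors]   # discard the rest

best = max(population, key=fitness)
print(fitness(best))   # should approach the optimum of 0
```

Because survivors are carried forward unchanged alongside their mutated copies, the best score can only improve over the generations, mirroring how successive League iterations produced ever-stronger agents.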
Now, I don't want to spoil the demonstration for those of you who have not seen it, but I can say that AlphaStar performed admirably against human professional players, and the pros and commentators who took part were impressed with the AI. During its training in the League, AlphaStar was also able to invent new strategies that would be considered unconventional for a human player, and it sometimes made in-game, real-time decisions that baffled experienced commentators but turned out to be excellent choices. I cannot recommend the demonstration highly enough.