Chapter 4: Deep Learning and Chemical Data
Published: 04 Nov 2020
Special Collection: 2020 ebook collection
Series: Drug Discovery Series
C. Batchelor, P. Corbett, A. Day, J. White, and J. Boyle, in Artificial Intelligence in Drug Discovery, ed. N. Brown, The Royal Society of Chemistry, 2020, ch. 4, pp. 45-62.
Deep learning, that is, machine learning that uses multilayer neural networks, has been responsible in recent years for significant advances in performance on standard tasks in image processing, machine translation and speech recognition. This is at least in part due to breakthroughs in algorithms for training neural networks and the widespread availability of GPUs. In this chapter we outline some tasks that use chemical data and give practical examples of how deep learning has been applied to them. The first task is to infer facts about chemical structures from the corresponding NMR spectra. The second and third are based on chemical text: identifying chemical names and identifying chemical–protein interactions. For these we describe entries to the BioCreative series of text-mining competitions, both by some of the present authors and by others outside the group. We discuss different approaches to deep learning: a more traditional approach based on careful feature selection and engineering, and approaches in which a very simple underlying representation is chosen and the feature set is learnt by the system. We also compare the performance of deep learning algorithms to that of human annotators.