- 1 Introduction
- 2 Strategies in inverse molecular design
- 2.1 Genetic algorithms
- 2.2 Monte Carlo methods
- 2.3 Linear combination of atomic potentials
- 2.4 LCAP Hamiltonian in extended Hückel tight-binding framework
- 3 Recent applications
- 3.1 Applying TB-LCAP approaches for materials discovery
- 3.2 Inverse molecular design for nonlinear optical materials
- 3.3 Inverse molecular design for dye-sensitized solar cells
- 3.4 Experimental verification
- 4 Conclusions
Chapter 1: Inverse molecular design for materials discovery
Published: 30 Oct 2013
D. Xiao, I. Warnke, J. Bedford, and V. S. Batista, in Chemical Modelling: Applications and Theory, ed. M. Springborg and J. Joswig, The Royal Society of Chemistry, 2013, ch. 1, pp. 1-31.
Inverse molecular design has emerged as an attractive computational approach to take on the challenges of materials discovery. We present a conceptual formalism for inverse molecular design as implemented to perform optimizations that search along molecular property hypersurfaces constructed directly from the Hamiltonian of molecular systems. We first outline the basic principles and procedures of the discrete optimization algorithms used in inverse design, namely genetic algorithms (GA) and Monte Carlo (MC) methods. Then, we bring the reader's attention to a new inverse molecular design strategy based on the scheme of linear combination of atomic potentials (LCAP), developed by Beratan and Yang in 2006. Finally, we review the progress made in applying LCAP in the tight-binding framework to successfully design novel nonlinear optical materials and dye-sensitized solar cells. Due to the low computational cost of tight-binding electronic structure calculations, we envision TB-LCAP as a promising approach to inverse molecular design for applications in materials discovery such as catalyst design and solar fuels.
1 Introduction
Discovering materials with optimum properties is a long-standing dream1–8 of both experimental and theoretical researchers. Historically, scientists used a “trial and error” approach to find new materials that exhibit desired properties. Owing to developments in modern theoretical and computational chemistry (e.g., density functional theory9), predicting molecular properties with accurate and efficient quantum chemistry methods has become increasingly practical. As a consequence, inverse molecular design10–12 has emerged as an attractive computational approach to take on the challenges of materials discovery.
Inverse molecular design is a general term describing strategies in molecular design that contrast with direct design methods. In direct design, a new molecule is proposed first, and then the molecular property is computed or measured to assess its potential use. In contrast, inverse molecular design aims at searching for optimum points on hypersurfaces defining property-structure relationships, and then mapping out the molecular structures at the optimum points.13 Hence, the idea of inverse molecular design can significantly enhance the efficiency and success rate of molecular design and reduce the costs of materials discovery.
Inverse molecular design has been implemented as an optimization method in theory,10–12 assisting the search for optimum chemical structures using global optimization algorithms:

f_inv : O_T → {λ1, λ2, …, λn} = arg min over {λ1, λ2, …, λn} of |O[H(λ1, λ2, …, λn)] − O_T|.

Here, f_inv is a notation for the operation of inverse molecular design. O denotes a molecular property (an observable), which is a functional of the Hamiltonian operator H(λ1, λ2, …, λn). The {λ1, λ2, …, λn} are the set of user-defined variables for varying the Hamiltonian. For example, these variables could be the indices defining a molecule as a composition of molecular fragments, a set of nuclear coordinates, or even atomic numbers.
O_T denotes a given target value of a molecular property, e.g. a maximum of the molecular property. The minimization operation ‘min’ may be performed through a variety of different optimization algorithms that minimize the quantity |O[H(λ1, λ2, …, λn)] − O_T|. Thus, f_inv aims at finding a particular set {λ1, λ2, …, λn} (and thereby a molecular structure) that best matches the target molecular property. This is a general formulation for the idea of inverse molecular design. When applied to specific systems, the formulation may be transformed for the purpose of optimizing particular target molecular properties.
In this work, we focus on reviewing inverse molecular design based on hypersurfaces of molecular properties vs. molecular structures that are constructed through direct calculations of molecular properties from variable Hamiltonians (representing different chemical structures). Alternatively, analogous hypersurfaces relating property and structure have been constructed from statistical models of molecular properties with respect to sets of chosen molecular descriptors (for molecular structures or properties). Such hypersurfaces are used extensively for inverse design based on quantitative structure-activity relationships (QSAR). We refer interested readers to the literature14–16 on inverse QSAR, which is not the focus of this review.
In molecular structure space, the Hamiltonian variables are associated with the atom types and their spatial arrangement.11,17 Different stochastic and deterministic optimization algorithms10–12,18 have been adapted to work in inverse molecular design methods. The choice of an optimization method depends on how the particular Hamiltonian, linking structure and property, is varied during a search (i.e. depends on the set of Hamiltonian parameters/variables that are varied).
We begin by reviewing optimization algorithms that are based on discrete molecular objects, such as genetic algorithms and Monte Carlo methods. Then, we describe an emerging approach named linear combination of atomic potentials (LCAP), developed by Beratan and Yang11 for inverse molecular design. This approach allows us to search for optimum molecules using continuous or discrete optimization algorithms. In particular, we review recent progress made in applying LCAP in the tight-binding (TB) framework, which provides an efficient way to search for molecules. Finally, we review applications of TB-LCAP for optimizing nonlinear optical materials and dye-sensitized solar cells; in particular, novel materials have been proposed by TB-LCAP and verified by experiments. Due to the low computational cost of tight-binding electronic structure calculations, we envision that TB-LCAP will be a promising inverse molecular design method for taking on challenges in materials discovery such as catalyst design and solar fuel applications.
2 Strategies in inverse molecular design
Genetic algorithms and Monte Carlo methods are commonly used as optimizers for inverse molecular design in discrete molecular structure space of chemically representable candidates.
2.1 Genetic algorithms
Genetic algorithms19 are methods tailored to address complicated multi-dimensional optimization problems.20 They belong to a broader class of evolutionary algorithms20,21 which originated in the 60s and 70s when scientists started to explore the possibility of using basic principles of evolution to develop adaptive and highly efficient and general optimization schemes. Nowadays, GAs find extensive use in a large variety of scientific fields. Notably, over the past 30 years, they have graduated to become a major tool for computational disciplines in physics, chemistry and materials science where they are used for atomistic and electronic structure level optimizations of molecular geometries, energies and properties.22
In inverse molecular design, GAs may be applied as optimization algorithms to locate vectors X = {λ1, λ2, …, λn} such that |O(X) − O_T| becomes minimal. In electronic structure calculations there is generally no simple relationship between X and the property of interest O(X). The calculation of these properties is often complicated and comes at high computational cost.23 Efficient optimization algorithms such as GAs are therefore necessary for locating optima on the hypersurface O(X).
Zunger et al.4 adapted GAs for the purpose of inverse molecular design, and discovered structural motifs with optimal bandgaps in quaternary (In, Ga)(As, Sb) semiconductors. Hutchison et al.24 used GAs for optimizing organic polymers for photovoltaics. To shed light on how GAs are used for inverse molecular design, we review selected basic principles and current developments.
GAs are population based methods.25 In GA context, a population consists of a set of trial solutions {Xa|a=1,…, population size}. Basic selection and recombination principles inspired by Darwinian evolution26 drive the adaptation of the population towards a target property, i.e. the optimum of an objective function. Genetic operators26 are realizations of the selection and recombination rules. They determine which trial-solutions (individuals) are selected from the population and specify how they are recombined to form new ones. The concept of fitness refers to the difference between the objective function value of a trial solution and the global optimum.
Figure 1 illustrates the basic steps common to all GA optimizations. (1) The first step in a GA optimization consists in generating an initial population which, in practice, will often be made of randomized trial solutions X. (2) Subsequently, GAs proceed to evaluate the individuals’ fitness. (3) Based on the fitness a selection operator chooses a number of individuals (parents) for recombination and formation of new individuals (mating). (4) During the mating process, crossover operators27 recombine characteristics of parent-solutions to form new trial solutions while mutation operators28 account for the introduction of small random changes with a certain likelihood. (5) The resulting new trial solutions are then pitched against their parents and form a new generation by replacing the weakest individuals of the previous one. Selection and mating are repeated until convergence is achieved.
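To make steps (1)–(5) concrete, the Python sketch below outlines a generic GA loop of this kind. It is only an illustration under stated assumptions, not the implementation used in the works cited here: the representation of individuals and the functions random_individual, fitness (smaller values indicate fitter individuals, consistent with the difference-from-optimum definition of fitness above), crossover and mutate are user-supplied placeholders.

```python
import random

def genetic_algorithm(random_individual, fitness, crossover, mutate,
                      population_size=20, n_generations=100, mutation_rate=0.1):
    """Generic GA loop following steps (1)-(5); `fitness` is minimized."""
    # (1) generate an initial population of randomized trial solutions X
    population = [random_individual() for _ in range(population_size)]
    for _ in range(n_generations):
        # (2) evaluate fitness and rank the individuals (smaller = fitter)
        ranked = sorted(population, key=fitness)
        # (3) select the fitter half of the population as parents
        parents = ranked[: population_size // 2]
        # (4) mate: recombine random parent pairs, mutate with some likelihood
        children = []
        while len(children) < population_size - len(parents):
            p1, p2 = random.sample(parents, 2)
            child = crossover(p1, p2)
            if random.random() < mutation_rate:
                child = mutate(child)
            children.append(child)
        # (5) children replace the weakest individuals of the previous generation
        population = parents + children
    return min(population, key=fitness)
```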
Fitness function. The criteria applied in the selection process are expressed in terms of a fitness function ffit(X).29,30 To minimize a given objective function, ffit can often be set equal to the objective function value |O(X) − O_T| itself. However, appropriate scaling of the objective function may influence the roughness of the hypersurface and play an important role in the efficiency of the optimization. Approximations to the objective function can be used to estimate the fitness of individuals. Such strategies can increase computational efficiency when the objective function evaluation comes at high computational cost.31 Especially when electronic structure methods are needed to calculate energies or to locally optimize individuals, the evaluation of a single individual's fitness may take hours or even days. In such cases, an efficient and sufficiently accurate approximation can often be used to identify promising candidates, which are in turn treated at a higher theoretical level.
Selection operators32 realize methods for choosing individuals from the existing population to generate new individuals (children) in the recombination process. Population size, the number of individuals chosen for recombination and the number of new individuals to be generated from a given generation are parameters that may be chosen to suit the problem.33 In a straightforward selection, a certain amount of the fittest individuals would be chosen for recombination. In practice however, a variety of selection strategies exist which involve the drawing of random numbers.32 Such probabilistic schemes introduce a likelihood for weak individuals to be drawn for recombination while ensuring that the fittest individuals have the highest probability of being selected. Most commonly applied are fitness proportionate selection schemes, where the likelihood of selection is proportional to the individual's fitness. In the alternative tournament selection method, the fittest individuals within randomly chosen sub-populations are drawn. Here, the number of sub-populations and the number of individuals to be drawn from each population are parameters that adjust the selection pressure, i.e. the likelihood of weak individuals to be drawn for recombination. Other selection schemes exist and are outlined in the literature.32
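As a hedged illustration of the two selection schemes just described, the snippet below sketches tournament and fitness-proportionate selection. It assumes the fitness convention used above (a deviation to be minimized), so the roulette weights are taken as inverse fitness; the function names are illustrative.

```python
import random

def tournament_select(population, fitness, n_parents, tournament_size=3):
    """Tournament selection: the fittest member of each random sub-population
    is drawn; larger tournaments increase the selection pressure."""
    return [min(random.sample(population, tournament_size), key=fitness)
            for _ in range(n_parents)]

def roulette_select(population, fitness, n_parents, eps=1e-12):
    """Fitness-proportionate selection; with a fitness that is minimized,
    the selection weight is taken inversely proportional to the fitness."""
    weights = [1.0 / (fitness(ind) + eps) for ind in population]
    return random.choices(population, weights=weights, k=n_parents)
```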
Crossover operators27 realize the set of rules according to which the selected parents are recombined to form new individuals that exhibit combined or completely new characteristics. As is the case with all other GA operators, there is no unique way of defining such a procedure and several variants are commonly used. The simplest version is a single point crossover operator. It chooses a random crossing point between two parent vectors X and then interchanges the crossed sections. The number of crossing points can be increased to allow for more flexibility in the recombination process. A schematic of the procedure is presented in Fig. 2. The number of parents entering the recombination process, as well as the number of children resulting from the crossover is sometimes increased. “Cut-and-Splice” techniques34 differ slightly from the described crossover method; parents can have different crossover points which allows for evolutionary variation of the dimensionality of the optimization problem, or size of the system. The uniform crossover method35 uses a fixed mixing ratio (e.g. 0.5) and allows each bit of information to be interchanged between the parents individually and independently. This strategy is equivalent to choosing a random number of crossing points while maintaining a given ratio of information that is exchanged between the parents.
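A minimal sketch of the single-point and uniform crossover operators described above, operating on list-like individuals (names are illustrative):

```python
import random

def single_point_crossover(parent1, parent2):
    """Cut both parents at one random position and swap the tails (cf. Fig. 2)."""
    point = random.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

def uniform_crossover(parent1, parent2, mixing_ratio=0.5):
    """Swap each position independently with a fixed mixing ratio (e.g. 0.5)."""
    return [a if random.random() < mixing_ratio else b
            for a, b in zip(parent1, parent2)]
```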
Depending on how individuals are specifically encoded in the GA, the exchange of random regions during the crossover procedure may not always produce meaningful outcomes. Whenever trial solutions need to satisfy conditions such as minimum or maximum distances between nuclei, the ordering of variables may become important. A number of strategies exist to overcome this challenge. Some schemes involve repairing a random trial solution, while others create solutions that satisfy the existing conditions by design. For example, Ahlrichs and co-workers36 describe a crossing method designed for optimizing geometries of clusters. The structures are cut into fragments and recombined such that (i) only contiguous groups of atoms are interchanged and (ii) the minimum interatomic distance is controlled. Before pitching the created individuals against the rest of the current population, their structures are locally relaxed to the closest local minimum.
Mutation operators. Most GAs incorporate the biological concept of mutation to preserve a degree of diversity during evolution of the population. Mutation operators introduce random changes to trial solutions as they are generated in the crossover process. They are designed to help the GA avoid local trapping, which may occur whenever the population has lost diversity due to evolution towards a non-global optimum. Mutation creates random features that enable sampling of new regions in the space of possible variables. The motivation for mutation operators is similar to the reasoning for introducing randomized selection operators that avoid exclusive drawing of only the fittest individuals. Different versions of mutation operators exist28 and are appropriate depending on the nature of the optimization and on whether the algorithm uses vectors of real numbers or concatenated binary strings to represent individuals. For real-coded37,38 GAs, a common procedure is to add a Gaussian-distributed random value to an arbitrary variable in the vector. If the population is encoded in the form of binary vectors,21 bit operations such as swapping, inverting, and scrambling the bits of variables can be used to introduce mutations. The likelihood of mutation and the number of mutations allowed per individual influence sampling and convergence of the optimization. In general, an excessive mutation probability will slow convergence, as the genetic information stored in the population is lost at an elevated rate and the method becomes equivalent to a random search algorithm.
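The two mutation variants mentioned above might look as follows; this is a schematic sketch, with a Gaussian mutation for real-coded individuals and a bit-flip mutation for binary-encoded ones:

```python
import random

def gaussian_mutation(individual, sigma=0.1):
    """Real-coded mutation: add a Gaussian random value to one random variable."""
    mutant = list(individual)
    i = random.randrange(len(mutant))
    mutant[i] += random.gauss(0.0, sigma)
    return mutant

def bit_flip_mutation(bitstring, n_flips=1):
    """Binary-coded mutation: invert a few randomly chosen bits."""
    bits = list(bitstring)
    for i in random.sample(range(len(bits)), n_flips):
        bits[i] = '1' if bits[i] == '0' else '0'
    return ''.join(bits)
```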
Convergence. Ideally, the algorithm would stop producing new generations when the global optimum is found. However, in most GA applications there is no clear indicator that the global optimum has been located, i.e., no single criterion exists that is sufficient for global convergence. In practice, different criteria are used to decide when to terminate an optimization.39 The time available on a computer system might limit the number of generations that can be produced. For some applications it might be sufficient to locate solutions that exhibit a certain target fitness, regardless of whether they correspond to the global optimum. Convergence is often assumed when the population's fittest individuals stop evolving and remain constant for a number of generations. Sometimes, manual inspection of the individuals can give a good indication of convergence. For example, chiral structures may be among the low-energy geometries of metal clusters; if this is the case, then the GA has to locate both iso-energetic enantiomers. The lack of a sufficient convergence criterion is a problem for all global optimization methods.
More aspects of GAs
Seeding. In certain instances, the performance of GA optimizations may be improved greatly by seeding.40 If common features or patterns in solutions are known prior to the GA optimization, they may be incorporated into the initial population. For instance, when searching for minimum-energy metal cluster structures of a given size N, the structural motifs that appear in the most stable clusters of size N − 1 may be used to generate initial structures of size N that may already be close to the optimum.
Isolated-island models41,42 realize the idea of sub-dividing the total population into smaller sub-populations which may then evolve independently and under different conditions. Such approaches offer a high flexibility and may be advantageous whenever the GA parameters needed for an efficient convergence are not known. Sub-populations can be of different size, and their evolution may be governed by different selection, crossover, and mutation operators. The islands’ sub-populations may be further subdivided and we can allow for an occasional exchange of individuals between different sub-populations. While one population may be small and evolve through a set of GA operators that favor elitism and only small mutation rates, another might be large and allow for more variety. Populations may influence each other through exchange. Many variations are possible.43
Parallel GAs. GAs are particularly well suited for parallelization44 , where the computational load is distributed over a number of individual processors. Generally, the time-limiting step in GA optimizations is the fitness evaluation for each new individual31 , typically an electronic structure calculation of the electronic energy of a molecule or a derived property. In a naive parallel scheme, the computation of each individual's fitness can be delegated to a different processor. Such a parallel scheme is especially beneficial in case of large populations and when many processors are available.
In conclusion, GAs combine a set of characteristics that make them very attractive for molecular design.22,45 Mimicking basic principles of evolution, they constitute a paradigm change in global optimization schemes.46 As population based methods, they do not compute a single function value at a time, but a set of function values corresponding to a population of trial solutions which are initially distributed randomly. As new generations are formed by genetic operators, the newly formed individuals adapt and close in on minima. Human knowledge of the problem is required only for creating the set of rules for evolution, i.e., the genetic operators. However, seeding the initial population may greatly reduce the number of objective function evaluations needed for convergence. As meta-heuristic optimization schemes, GAs are conceptually simple, easy to implement and adapt to a specific problem. Meanwhile our fundamental understanding of their convergence properties is still lacking.47 A large variety of different versions of genetic algorithms exist. Different variants may be suited for different optimization problems. Once the specific GA model is chosen, parameters such as population size, the number of children and mutation rates may be adjusted. The success of this class of adaptive optimization algorithms is documented by the vast amount of applications of GAs to problems throughout science. While we introduce basic concepts and discuss several relevant aspects, issues, and advantages of GAs, we refer to recent reviews22,45 for specific applications of GAs in the field of molecular and materials design.
2.2 Monte Carlo methods
Another popular strategy to circumvent exhaustive enumeration in molecular design problems is the use of Monte Carlo (MC) methods.
Zunger et al. first adapted the Monte Carlo method for the purpose of inverse molecular design in a property space computed directly from a quantum mechanical Hamiltonian. They discovered the optimal atomic configuration Al0.25Ga0.75As for tuning optical bandgaps.10 To outline the use of Monte Carlo methods in inverse molecular design, we review the basic principles and current developments of the algorithm in the following.
Monte Carlo optimization operates in the same way as Monte Carlo simulation.48 As with the previously discussed GAs, we provide a short overview of MC methods, pointing out key aspects. The Monte Carlo method begins with a random initial point in molecular configuration space (λ1, λ2, …, λN), where the configuration space is understood to include both the three-dimensional positions of the atoms in the molecule and the identities of the atoms. From (λ1, λ2, …, λN) a new point in configuration space, (λ1,trial, λ2,trial, …, λN,trial), is selected by an elementary Monte Carlo move. The current point (λ1, λ2, …, λN) is then set equal to the trial point with probability

P_acc = min[1, exp(−ΔO/T)],

where

ΔO = O(λ1,trial, λ2,trial, …, λN,trial) − O(λ1, λ2, …, λN),

and O is the function that maps a point in the multi-dimensional configuration space {λ1, λ2, …, λN} to a point in the one-dimensional property space. Note that this acceptance rule corresponds to a minimization of the desired property function O. T is a fictitious temperature parameter, kept constant when performing plain Monte Carlo optimizations. Alternatively, it can be started at a high value and slowly decreased to trap the system in the optimal {λ1, λ2, …, λN}. This is known as simulated annealing.49
In essence, these equations state: if the property value of the trial point in configuration space is less than the current property value, then the trial solution is accepted. However, if the property value of the trial point is greater than the current property value, then the trial solution is accepted only if a random number between 0 and 1 is less than exp(−ΔO/T).
At first glance it may seem counterproductive to accept less optimal configurations with a non-zero probability. However, by accepting such configurations the algorithm is capable of moving out of local minima and ultimately finding the global minimum.49 The idea is that through repeating MC and SA procedures we move closer to the true global optimum.
Key issues in implementation. The algorithm described above contains the minimal ingredients required to perform Monte Carlo optimization. In practice, several other issues must be addressed to get meaningful results in an acceptable amount of time. MC optimizations permit great flexibility in how the MC trial configurations are generated; there are many different ways to make an elementary MC move and thus generate (λ1,trial, λ2,trial, …, λN,trial). An important result from MC theory is that the choice of how the Monte Carlo moves are made does not affect the ultimate global optimum reached. Rather, it affects the rate, in MC steps, at which the simulation converges to the global optimum.50 It is therefore important to choose the Monte Carlo move method wisely. Specific details concerning the generation of (λ1,trial, λ2,trial, …, λN,trial) can be found in the existing literature.51
The various methods of generating trial configurations are similar in that they try to produce moves that have a high probability of acceptance and produce trial points that are “far” from each other in configuration space. Meeting these two requirements translates into an ability to quickly explore the configuration space.
In addition, for SA we must also address the question of how quickly we allow the temperature T to change during the optimization. We know from SA theory that logarithmic-rate cooling guarantees convergence to the global minimum, but this is prohibitively expensive in many applications. Another common cooling function is exponential cooling. While this cooling function is computationally advantageous, it often leads to final configurations that are trapped in local minima.50
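The acceptance rule and cooling schedule discussed above can be combined into a compact simulated-annealing sketch. This is a generic illustration, not the specific implementation of the cited works; the property function `prop` and the move generator `propose_move` are user-supplied placeholders.

```python
import math
import random

def simulated_annealing(x0, prop, propose_move, n_steps=10000, t0=1.0, cooling=0.999):
    """Metropolis MC minimization of `prop` with exponential cooling
    T_n = T_0 * cooling**n (logarithmic cooling is safer but much slower)."""
    x, o = x0, prop(x0)
    best_x, best_o = x, o
    temperature = t0
    for _ in range(n_steps):
        x_trial = propose_move(x)                 # elementary MC move
        o_trial = prop(x_trial)
        delta = o_trial - o
        # Metropolis rule: always accept downhill moves; accept uphill
        # moves with probability exp(-delta / T)
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            x, o = x_trial, o_trial
            if o < best_o:
                best_x, best_o = x, o
        temperature *= cooling                    # exponential cooling schedule
    return best_x, best_o
```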
Recent algorithm advances. There are many aspects that need to be addressed in order for the Monte Carlo method to be successfully applied to the inverse design of molecules. Even with these methods, when the property surface is rugged, the Monte Carlo method may become trapped in local minima. In addition continuous optimization schemes such as the continuous LCAP search outlined below have been shown to be less efficient in exploring the entire property surface when it is rugged.52
To overcome these issues and apply the LCAP inverse design method to new problems, such as inhibitor design, Hu, Beratan, and Yang devised a way to incorporate local gradient information in the Monte Carlo method.52 This method is known as the gradient-directed Monte Carlo (GDMC) method. GDMC uses property gradients to jump between discrete molecules. When the search algorithm is trapped in a local optimum, random MC moves help it overcome local barriers.
GDMC is advantageous for two reasons. Firstly, it is computationally efficient, as it saves computational time by not searching the “intermediate” states of continuous property surfaces; such states correspond to non-physical structures. Secondly, as mentioned, this method is able to move out of local minima when the property surface is very rugged.
The GDMC method works by constructing a continuous virtual property surface using the LCAP procedure. The algorithm is then fed the initial molecule, and the property of interest is computed along with its gradients. Proposed structures or atom types are generated using the property gradients produced with the LCAP procedure. Finally, these new structures or atom types are accepted or rejected according to the Metropolis rule.
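The following is a deliberately simplified, single-site caricature of one GDMC step, intended only to illustrate the idea of combining LCAP gradients with the Metropolis rule; the published GDMC algorithm differs in its details, and `prop`, `prop_gradient` and `round_to_molecule` are hypothetical user-supplied functions (minimization assumed).

```python
import math
import random

def gdmc_step(b, prop, prop_gradient, round_to_molecule, temperature):
    """One schematic GDMC step: LCAP gradients suggest a jump to a neighbouring
    discrete molecule, which is accepted or rejected by the Metropolis rule."""
    grad = prop_gradient(b)                  # d(property)/d(b) on the LCAP surface
    site = max(range(len(grad)), key=lambda i: abs(grad[i]))
    b_trial = list(b)
    # push the most responsive coefficient downhill, then re-round so the
    # trial point is again a chemically representable molecule
    b_trial[site] = 0.0 if grad[site] > 0 else 1.0
    b_trial = round_to_molecule(b_trial)
    delta = prop(b_trial) - prop(b)
    if delta <= 0 or random.random() < math.exp(-delta / temperature):
        return b_trial                       # accept; uphill moves escape local traps
    return list(b)                           # reject and keep the current molecule
```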
In conclusion, although MC methods are conceptually simple, some drawbacks exist that often limit their applicability; in particular, it is often prohibitively expensive to perform true logarithmic cooling. Recent and very promising advances include the gradient-directed MC method, which combines the favorable properties of LCAP approaches with the advantages of MC methods.
2.3 Linear combination of atomic potentials
Instead of searching for molecules “one by one” in the discrete space of molecular compositions, Beratan and Yang introduced a continuous optimization strategy for inverse molecular design based on the scheme of linear combination of atomic potentials.11 The LCAP approach provides a continuous interpolation between discrete molecular structures; this characteristic distinguishes the approach from an earlier strategy.13 As a result, a continuous molecular property hypersurface with respect to variations of the chemical structure is constructed. Such a continuous property surface allows property gradients to be defined and evaluated, leading to efficient structure optimization.
LCAP scheme in DFT. In density functional theory, the electronic structure problem is cast as

(T + v_ext + v_H + v_xc) φ = E φ,

where T is the kinetic energy operator, v_ext is the external potential operator, v_H is the Hartree energy operator, v_xc is the exchange-correlation energy operator, E is the energy, and φ is the electronic wavefunction. Given the option of a few atom types (or functional groups) at one or more atomic sites, the external potential can be written as a linear combination of atomic potentials11

v_ext(r) = Σ_R Σ_A b_A(R) v_A^R(r).

Here, v_A^R(r) is the external potential of atom or group type A at position R. Specifically, if A represents an atom type, v_A^R(r) is given by

v_A^R(r) = −Z_A/|r − R|,

where Z_A is the atomic number of atom A. The optimization coefficients b_A(R) define the weighting of a particular atom or group type, and are varied during the optimization. For real molecules, one value of b_A(R) must be equal to one and all others must be zero at a given R; that is, no more than one chemical unit may exist at any position R. The constraints on b_A(R) during the optimization are Σ_A b_A(R) = 1 and 0 ≤ b_A(R) ≤ 1. Each b_A(R) can vary continuously between 0 and 1, leading to a continuous transformation of atom types.
The potential v_ext(r) and the number of electrons completely specify the input for the ground-state electronic structure and its properties. The external potential establishes a space on which optimization is conducted to design molecules. When the optimum is reached in a continuous optimization, the b_A(R) values are rounded to the nearest chemically representable molecule, i.e., to zero or one. Earlier optimizations of electronic non-linear polarizabilities support the viability of this rounding approach.11
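As a rough numerical illustration of the LCAP expansion and of the rounding step described above, the sketch below evaluates a coefficient-weighted sum of bare Coulomb potentials on a grid of points and rounds the coefficients to the nearest real molecule. The data layout (dictionaries of coefficients per site) is purely illustrative.

```python
import numpy as np

def lcap_external_potential(r_grid, sites, b):
    """v_ext(r) = sum over sites R and atom types A of b_A(R) * (-Z_A/|r - R|),
    with sum_A b_A(R) = 1 at every site.  `sites` is a list of
    (position, {atom_type: Z}) pairs and `b` a matching list of
    {atom_type: coefficient} dictionaries."""
    v = np.zeros(len(r_grid))
    for (pos, charges), coeffs in zip(sites, b):
        assert abs(sum(coeffs.values()) - 1.0) < 1e-8      # constraint sum_A b = 1
        dist = np.linalg.norm(r_grid - np.asarray(pos), axis=1)
        for atom_type, z in charges.items():
            v += coeffs.get(atom_type, 0.0) * (-z / dist)  # diverges at the nuclei
    return v

def round_to_molecule(b):
    """Round continuous coefficients to the nearest real molecule: the largest
    coefficient at each site becomes 1, all others become 0."""
    return [{a: (1.0 if a == max(coeffs, key=coeffs.get) else 0.0) for a in coeffs}
            for coeffs in b]
```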
Following the LCAP scheme in DFT, inverse molecular design methods have also been developed in the frameworks of tight-binding18,53 and semi-empirical54 electronic structure theories.
Optimization on LCAP hypersurfaces (of molecular properties vs. structures) can then be carried out using continuous or discrete optimization algorithms. For example, one can use the gradient information to search continuously along LCAP hypersurfaces, as reported initially.11 This strategy can be effective when the molecular property surface is not particularly rugged. Alternatively, one can make discrete jumps between real molecules on the property surface, following the LCAP gradient analysis, and introduce stochastic jumps to exit local minima, as in finite-temperature Monte Carlo sampling; this is known as the gradient-directed Monte Carlo (GDMC) approach.54,55 Discrete optimization algorithms have also been developed with the LCAP scheme for optimizing hyperpolarizability.56,57 The discrete best-first-search scheme was adapted to the LCAP scheme to optimize the photoacidity of 2-naphthol derivatives.58
Variational particle number approach for inverse molecular design. Similar in spirit to LCAP, another new inverse molecular design strategy was developed by von Lilienfeld et al. on the basis of variational particle numbers.12 This approach uses atomic numbers as variables and gradients of molecular properties with respect to atomic numbers to guide the search for molecules. The authors applied this approach to design ligands for protein binding. Later, the approach was extended to search for molecules in chemical compound space using atomic positions, atom types, and the number of electrons as variables.17
LCAP Hamiltonian in the tight-binding framework. The LCAP principle has been implemented in a Hückel tight-binding framework by Xiao et al.,18 denoted TB-LCAP, and later expanded into the extended Hückel framework.53 In the following, we present the implementations for TB-LCAP approaches in the Hückel and extended Hückel frameworks, respectively.
In the Hückel tight-binding framework,59 the Hamiltonian is represented in matrix form to describe the electronic structure. Interactions are accounted for only between the pz orbitals, the pz basis functions are assumed to be orthonormal, and electronic interactions occur only between nearest-neighbor sites. Thus, the elements of the Hückel Hamiltonian matrix are

H_ii = α_i,  H_ij = β_ij if sites i and j are nearest neighbors, and H_ij = 0 otherwise (i, j = 1, …, n),

where i and j are indices of atomic sites and n equals the total number of atomic sites in a molecule.
When applying LCAP to the Hückel Hamiltonian matrix, the coefficients do not operate on external potentials but directly on the elements of the Hückel Hamiltonian matrix. Assuming there are n sites in a molecular framework for inverse design, one intends to search for the optimal atom (or group) type at some (or all) of these sites for the target property. For each variable site i, there are N_i^type atom or group choices. The coefficients in Eq. (5) may be expanded to include the coefficients b_i^A for each site. The site energy H_ii (i.e., the diagonal term) is determined by the atoms available for a site, so the LCAP value of this matrix element is

H_ii = Σ_A b_i^A α_i^A,

where α_i^A is the site energy of atom type A at site i, b_i^A is the LCAP weighting coefficient for atom type A at site i, and N_i^type is the total number of atom types possible at site i. For chemically fixed sites, the site energy equals the site energy of the atom type at that site: b_i^A = 1 and H_ii = α_i^A. The interaction between sites i and j (i.e., the off-diagonal term) is

H_ij = Σ_A Σ_A′ b_i^A b_j^A′ β^AA′,

where β^AA′ is the interaction strength between atom type A (at site i) and A′ (at site j). This approach differs from an earlier approach by Kuhn and Beratan,13 who restrict the Hamiltonian matrix elements to interpolate between values arising from several atomic possibilities. Since molecular electronic properties are directly computed from the LCAP Hamiltonian matrix, optimizing molecular properties can be achieved by evolving the coefficients b_i^A.
The choice of candidate atom types for this TB-LCAP approach depends on the availability of the Hückel parameters α and β. In the literature,60 Hückel parameters for 13 atom types are available, covering the C, N, P, B, O, F, Si, S, and Cl elements. These parameters were obtained from Pariser-Parr-Pople (PPP) calculations.61 Thus, the Hückel TB-LCAP method can be applied to molecular systems with electronic structures dominated by pz-orbital interactions among the C, N, P, B, O, F, Si, S, or Cl elements. As an example, the Hückel parameters for C, N, and P are shown in Table 1.
Table 1 Hückel parameters for C, N, and P: site energies α and interaction strengths β.

| | α | β(C) | β(N) | β(P) |
|---|---|---|---|---|
| C | 0.00 | 1.00 | – | – |
| N | −0.51 | 1.02 | 1.09 | – |
| P | −0.19 | 0.77 | 0.78 | 0.63 |
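Using the site energies and couplings of Table 1, a Hückel TB-LCAP Hamiltonian of the form given above can be assembled as in the following sketch. This is a minimal illustration assuming the Table 1 values are the α and β parameters expressed in units of the standard carbon α and β; the data structures and the two-site usage example are illustrative.

```python
import numpy as np

# Table 1 parameters: site energies alpha and couplings beta (assumed to be
# given in units of the standard carbon alpha and beta)
ALPHA = {"C": 0.00, "N": -0.51, "P": -0.19}
BETA = {("C", "C"): 1.00, ("N", "C"): 1.02, ("N", "N"): 1.09,
        ("P", "C"): 0.77, ("P", "N"): 0.78, ("P", "P"): 0.63}

def huckel_lcap_hamiltonian(n_sites, bonds, b, alpha=ALPHA, beta=BETA):
    """H_ii = sum_A b_i^A alpha^A and, for bonded neighbours (i, j),
    H_ij = sum_A sum_A' b_i^A b_j^A' beta^(AA'); `b` is a list of
    {atom_type: coefficient} dictionaries, one per site."""
    h = np.zeros((n_sites, n_sites))
    for i, coeffs in enumerate(b):
        h[i, i] = sum(c * alpha[a] for a, c in coeffs.items())
    for i, j in bonds:                              # nearest-neighbour sites only
        hij = 0.0
        for a, ci in b[i].items():
            for ap, cj in b[j].items():
                key = (a, ap) if (a, ap) in beta else (ap, a)
                hij += ci * cj * beta[key]
        h[i, j] = h[j, i] = hij
    return h

# Example: a two-site fragment, site 0 fixed to C, site 1 mixing 70% N / 30% P
b = [{"C": 1.0}, {"N": 0.7, "P": 0.3}]
print(huckel_lcap_hamiltonian(2, [(0, 1)], b))
```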
2.4 LCAP Hamiltonian in extended Hückel tight-binding framework
The extended Hückel tight-binding (EHTB) approach has been extensively applied for calculations of electronic structures of a wide range of molecular and solid-state systems62,63 as well as for studies of sensitized TiO2 surfaces, including the analysis of interfacial electron transfer.64–68 In EHTB theory, the time-independent Schrödinger equation is represented in matrix form,

HQ = SQE,

where H is the extended Hückel Hamiltonian in the basis of Slater-type atomic orbitals (STOs), Q is the matrix of eigenvectors, E is the diagonal matrix of eigenstate energies, and S is the overlap matrix of the STOs. Note that the Hamiltonian matrix elements include interactions between any two sites (not just nearest-neighbor sites). In addition, electronic interactions between any types of orbitals (not just pz orbitals) are considered.
When the LCAP scheme is applied to the EH Hamiltonian, the LCAP coefficients operate on the Hamiltonian matrix elements. Assuming N_i^type possible atom types at atomic site i, the participation weight of each atom type A at site i is given by the coefficient b_i^A, allowed to vary continuously between 0 and 1. For each atomic site, the sum of the b_i^A is 1, i.e., the constraints are Σ_A b_i^A = 1 and 0 ≤ b_i^A ≤ 1. The diagonal matrix elements of the EHTB-LCAP Hamiltonian are defined as follows:

H_{iα,iα} = Σ_A b_i^A H_{iα,iα}^A,

where |φ_{iα}^A⟩ represents a STO of type α of atom type A and H_{iα,iα}^A is the EHTB diagonal Hamiltonian matrix element for atom type A at site i. Note that H_{iα,iα} = H_{iα,iα}^A for the specific case of b_i^A = 1, since all other participation coefficients at the i-site are equal to zero. More generally, H_{iα,iα} is an arithmetic average of EH matrix elements associated with all possible atom types at site i, weighted by their corresponding participation coefficients. The off-diagonal matrix elements are defined as follows:

H_{iα,jα′} = Σ_A Σ_A′ b_i^A b_j^A′ H_{iα,jα′}^AA′,

with |φ_{jα′}^A′⟩ representing the atomic orbital of atom type A′ at site j, and H_{iα,jα′}^AA′ the original off-diagonal Hamiltonian element for the case of atom type A at site i and atom type A′ at site j:

H_{iα,jα′}^AA′ = (K′/2) S_{iα,jα′}^AA′ (H_{iα,iα}^A + H_{jα′,jα′}^A′),

where K′ = K + Δ² + Δ⁴(1 − K) is defined according to the original Wolfsberg-Helmholz formula,69 with Δ = (H_{iα,iα}^A − H_{jα′,jα′}^A′)/(H_{iα,iα}^A + H_{jα′,jα′}^A′) and K = 1.75. The off-diagonal overlap matrix elements for atom type A at site i and atom type A′ at site j are defined as

S_{iα,jα′} = Σ_A Σ_A′ b_i^A b_j^A′ S_{iα,jα′}^AA′.
The participation coefficients b_i^A are initialized randomly. These coefficients are subsequently optimized with respect to the property of interest (e.g., solar light absorption) by using the elements H_{iα,iα}, H_{iα,jα′} and S_{iα,jα′} as defined by the EHTB-LCAP Hamiltonian, with STO parameters reported by Hoffmann et al.69 without any further corrections. Extended Hückel parameters are available for most of the elements in the periodic table, enabling EHTB-LCAP to search molecular structures beyond π-conjugated systems. In addition, the basis sets for extended Hückel calculations include a variety of valence orbital types. For example, the STO basis set includes 3d, 4s and 4p atomic orbitals for Ti, 2s and 2p atomic orbitals for the O, C, and N elements, and the 1s orbital for H.
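A minimal sketch of how the alchemically averaged EHTB-LCAP matrix elements defined above could be evaluated is given below. The pure-type diagonal parameters and STO overlaps are assumed to be supplied externally (e.g., from standard extended Hückel parameter tables), and the variable and function names are illustrative.

```python
def wolfsberg_helmholz(h_ii, h_jj, s_ij, k=1.75):
    """Pure-type off-diagonal EH element, H = (K'/2) S (h_ii + h_jj), with the
    weighted Wolfsberg-Helmholz correction K' = K + D^2 + D^4 (1 - K),
    D = (h_ii - h_jj) / (h_ii + h_jj)."""
    d = (h_ii - h_jj) / (h_ii + h_jj)
    k_prime = k + d**2 + d**4 * (1.0 - k)
    return 0.5 * k_prime * s_ij * (h_ii + h_jj)

def ehtb_lcap_offdiag(b_i, b_j, h_diag_i, h_diag_j, overlap):
    """Alchemical average H = sum_A sum_A' b_i^A b_j^A' H^(AA') over the
    candidate atom types at sites i and j; `h_diag_*[A]` are the diagonal EH
    parameters of the relevant STOs and `overlap[(A, Ap)]` the STO overlaps."""
    h = 0.0
    for a, ca in b_i.items():
        for ap, cp in b_j.items():
            h += ca * cp * wolfsberg_helmholz(h_diag_i[a], h_diag_j[ap],
                                              overlap[(a, ap)])
    return h
```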
3 Recent applications
3.1 Applying TB-LCAP approaches for materials discovery
Once the TB-LCAP Hamiltonian is formulated, molecular property functions are computed from it. Following the gradients of molecular properties with respect to the TB-LCAP coefficients, one can reach the optimum points of molecular properties using deterministic search algorithms. In the following, we show how the TB-LCAP approaches have been applied to optimize nonlinear optical materials and dye sensitizers for dye-sensitized solar cells, respectively. For each application, the key elements for implementing the TB-LCAP scheme are illustrated in the following order: (i) target molecular properties, (ii) gradients of molecular properties, (iii) molecular framework, and (iv) optimization results. The TB-LCAP hypersurface studies are summarized. The optimization efficiency of TB-LCAP is discussed for the design of nonlinear optical materials at the Hückel level, and the experimental verification of optimized lead chromophores is shown for the design of dye sensitizers on TiO2 at the extended Hückel level.
3.2 Inverse molecular design for nonlinear optical materials
When photons interact with nonlinear optical (NLO) materials, their frequencies, phases and other properties can be altered. Because of their ability to manipulate photonic signals, NLO materials are in high demand for technologies such as optical communication, optical computing, and dynamic image processing.70–77 For molecular design, materials with large NLO responses, such as the first hyperpolarizability,72,77 are desired to improve the efficiency of manipulating photonic signals, e.g. changing photon frequencies.
Target molecular properties. To find optimum nonlinear optical materials, Xiao et al.18 maximized the electronic hyperpolarizability along the LCAP hypersurface using the Hückel TB-LCAP approach. For comparison, they also maximized the polarizability (a linear optical property) with the same approach. These two target molecular properties are computed from the perturbation of the electronic energy of a molecule. The response of the electronic energy of a molecule to an external electric field F is

E(F) = E(0) − Σ_i μ_i F_i − (1/2) Σ_ij α_ij F_i F_j − (1/6) Σ_ijk β_ijk F_i F_j F_k − …,

where the indices i, j and k represent the Cartesian directions. The molecular polarizability α is the second derivative of E with respect to F, α_ij = −∂²E/∂F_i∂F_j, and the first hyperpolarizability β is the third derivative of E with respect to F, β_ijk = −∂³E/∂F_i∂F_j∂F_k. The target function for the polarizability is defined as the average of the diagonal elements of α:

O_α = (α_xx + α_yy + α_zz)/3.

Here, the polarizability components (α_xx, α_yy and α_zz) are computed using the finite-field method:

α_ii = −[E(F_i) − 2E(0) + E(−F_i)]/F_i².

The molecular property function of the molecular hyperpolarizability is78

O_β = (β_x² + β_y² + β_z²)^(1/2), where β_a = (1/3) Σ_i (β_aii + β_iai + β_iia)

and the subscript a represents x, y, or z. The hyperpolarizability components are computed using the finite-field method:

β_aaa = −[E(2F_a) − 2E(F_a) + 2E(−F_a) − E(−2F_a)]/(2F_a³)

and

β_aii = −[E(F_a, F_i) − 2E(F_a) + E(F_a, −F_i) − E(−F_a, F_i) + 2E(−F_a) − E(−F_a, −F_i)]/(2F_a F_i²).

The electronic energies E(0) and E(F_i) (with field) are calculated from the trace of the Hamiltonian and the density matrices. The Hamiltonian in the presence of the field is

H_kl(F) = H_kl(0) − Σ_i F_i μ_kl^i,

where k and l are indices for the p-orbital basis functions. In the Hückel tight-binding approach, the transition dipole integral is approximated as μ_kl^i ≈ δ_kl x_k^i, where x_k^i is the ith-direction Cartesian coordinate of the kth atom. In the finite-field method, the numerical value of the field F is taken to be 0.01 V/Å. Once the optimization is complete, the stability of the property values is verified by comparing the results to sum-over-state (SOS) calculations.79
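The finite-field expressions above translate directly into a few lines of code; this is a generic sketch assuming an energy function E(F) is available (e.g., from the field-dependent TB-LCAP Hamiltonian):

```python
def alpha_diag(energy, f=0.01):
    """alpha_ii = -d2E/dF_i^2 from the three-point central difference;
    `energy(fi)` returns E for a field of strength fi along direction i."""
    return -(energy(f) - 2.0 * energy(0.0) + energy(-f)) / f**2

def beta_diag(energy, f=0.01):
    """beta_iii = -d3E/dF_i^3 from the five-point central difference
    (the zero-field energy cancels out of this stencil)."""
    return -(energy(2 * f) - 2.0 * energy(f) + 2.0 * energy(-f) - energy(-2 * f)) \
        / (2.0 * f**3)
```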
Gradients of target properties. The target properties O_α and O_β were optimized using a continuous optimization algorithm, i.e., the quasi-Newton method.80 In the quasi-Newton method, the gradients are calculated numerically using finite differences,

∂f/∂b_i^A ≈ [f(b_i^A + Δb) − f(b_i^A)]/Δb,

where the function f represents any target property function and Δb represents the variation of b_i^A. Once the optimization is complete, each b_i^A is rounded to 0 or 1 to determine the corresponding chemically representable, “real” molecule.
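The numerical gradient used by the quasi-Newton search can be sketched as a simple forward difference over the LCAP coefficients (a generic illustration; `f` is any target property function of the coefficient vector):

```python
def lcap_gradient(f, b, delta=1e-3):
    """Forward finite-difference gradient df/db_k = [f(b + delta e_k) - f(b)]/delta."""
    f0 = f(b)
    grad = []
    for k in range(len(b)):
        b_step = list(b)
        b_step[k] += delta
        grad.append((f(b_step) - f0) / delta)
    return grad
```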
Molecular frameworks. To apply the Hückel TB-LCAP optimization method, Xiao et al.18 designed ten sp2 molecular frameworks, as shown in Fig. 3. Framework 1 is a linear conjugated lattice. Frameworks 2–10 are lattices with ring structures, listed in order of ascending number of chemical structures that can be built from each framework. The variable sites in these frameworks are allowed to take atom types (i.e., C, N, or P) or atom pairs (i.e., CC, NN, CN, NC, CP, PC, NP or PN). The number of possible chemical structures in these frameworks ranges from 10² to 10¹⁶. With the chosen atom types and functional groups, all possible optimal structures for each molecular framework represent synthetically plausible molecules based on known heterocyclic chemistry.81
Optimum structures. Figs. 4 and 5 show the optimized structures with maximum O_α and O_β, respectively. These structures are global optimum points, as verified by exhaustive enumeration, and they are obtained after 100 random initializations. Comparing them reveals interesting findings for optimum nonlinear optical materials. Firstly, hyperpolarizability and polarizability increase with the size of the molecules. For example, the maximal O_α (or O_β) value for framework 1 (the linear conjugated chain) is larger than those of frameworks 2–9 (smaller aromatic frameworks) and smaller than that of framework 10 (a large multi-ring aromatic framework); frameworks 1 and 10 have the largest linear dimensions. This finding is consistent with the literature.82–84 Secondly, the O_β-optimized structures are of lower symmetry than the O_α-optimized structures, verifying that nonlinear optical materials favor asymmetric chromophores such as the donor-acceptor “push-pull” framework.70,72,74,83,85,86 Based on these findings, we conclude that TB-LCAP inverse design can indeed lead to optimal structures or lead chromophores for designing nonlinear optical materials.
Smoothness of LCAP surfaces. Polarizability and first hyperpolarizability surfaces are shown in Figs. 6 and 7 as functions of the mixing parameters for two sites in molecular framework 8. The property values change relatively smoothly with variation of the group chemistry at the two sites. In addition, at the end of the continuous optimizations, most of the b coefficients converge to 0 or 1, and rounding the remaining coefficients to 0 or 1 does not dramatically change the property values. This suggests that the Hückel TB-LCAP hypersurfaces for O_α and O_β are relatively smooth. Only a few different (local) optimal structures arise from the 100 randomly seeded searches, indicating the presence of only a small number of local optima in the hypersurface.
Optimization efficiency. To evaluate the efficiency of the TB-LCAP search, Xiao et al.18 compared the optimization time with the time required for exhaustive enumeration. For example, the exhaustive enumeration time for O_β is 0.8 seconds for framework 2 (with 512 possible structures) and 9 hours for framework 9 (2.4×10⁶ possible structures). The efficiency metric η is defined as the ratio of the exhaustive enumeration time to the average time for one random-seed optimization. Setting the continuous convergence precision to 10⁻² and using the average time based on 100 random seed molecules, they constructed a log-log plot of η vs. the number of possible chemical structures in the library, N (see Fig. 8). They find that η increases as N increases. The overall search efficiency is η/M, where M is the number of random initial seeds.
The TB-LCAP optimization generally shows a high efficiency, on the order of 10³, when the chemical space size reaches approximately 10⁶. This suggests that the TB-LCAP search can be very efficient for chemical spaces containing millions of structures, even when using 100 random seeds. Optimizations of different properties show different efficiencies; Fig. 8 shows that optimizing Egap (the energy gap) is more efficient than optimizing O_α or O_β.
The tight-binding LCAP implementation described here provides an approximate framework that can be used to discover optimal structures in extremely large molecular libraries. This continuous search of chemical space is effective and efficient. The number of molecular property computations required to find the optimal structure is a small fraction of the overall library size, suggesting the potential utility for this approach for designing structures in large molecular spaces. LCAP optimization may become more costly as geometry optimization and solvent effects are introduced. Results from optimizations using larger libraries suggest that large linear polarizations favor symmetric structures, and that large first hyper-polarizabilities favor less symmetric structures, as expected.
3.3 Inverse molecular design for dye-sensitized solar cells
The design of molecular chromophores is critically important for the development of dye-sensitized solar cells (DSSCs)87,88 for photo-electricity generation and photocatalysis.64,65,89,90 Since the discovery of DSSCs by O'Regan and Grätzel,87 hundreds of dyes have been developed and tested,91–94 and cells with photoconversion efficiencies as high as 11% have been reported.87,88,95 However, for almost 20 years the outstanding challenge has remained the design of DSSCs with even higher photoconversion efficiency. Xiao et al. introduced an inverse design methodology to guide the synthesis and optimization of lead molecular chromophores with suitable photoabsorption properties.
The target molecular property is the photoabsorption probability, given by

O = Σ_pq f_pq P(λ_pq),

defined as the sum of products of the oscillator strengths f_pq of electronic state transitions q←p, with transition wavelengths λ_pq, times the probability P(λ_pq) of having a solar photon with such a wavelength, as modeled by the spectral radiance of a black body at T = 5523 K (see Fig. 9). The oscillator strength f_pq corresponds to the transition between electronic eigenstates |ψp⟩ and |ψq⟩:96

f_pq = (8π² m_e c ν_pq)/(3 h e²) |μ_pq|²,

where ν_pq is the transition frequency in cm⁻¹, c is the speed of light in vacuum, m_e is the electron mass, h is the Planck constant, e is the electron charge, and μ_pq is the transition dipole moment. For pure states (i.e., states with participation coefficients equal to either 1 or 0), the transition dipoles are obtained in terms of the EHTB eigenstates |ψp⟩ and |ψq⟩ in the basis set of atomic STOs |φ_{i,α}⟩, where α is the type of atomic orbital, i labels the atomic site, and Q_{iα,p} are the expansion coefficients obtained by solving the eigenvalue problem introduced in Eq. (10), giving:

μ_pq = e Σ_{iα} Σ_{jα′} Q_{iα,p} Q_{jα′,q} ⟨φ_{i,α}|r|φ_{j,α′}⟩.
For alchemical states (i.e., states with 0 < b_i^A < 1), the generalized atomic orbitals are

|φ̄_{i,α}⟩ = Σ_A b_i^A |φ_{i,α}^A⟩,

where the matrix elements ⟨φ̄_{i,α}|r|φ̄_{j,α′}⟩ are calculated by using the STO atomic basis set. Substituting Eq. (27) into Eq. (24) and then Eq. (24) into Eq. (22), the total ‘photo-absorption’ spectrum of the alchemical structure with participation coefficients b_i^A is obtained.
To obtain structures that maximize overall light harvesting, the sum over electronic state transitions introduced in Eq. (6) must include all allowed transitions in the UV-visible-IR range where there is significant solar radiance. For photoconversion, however, it is important to maximize photoabsorption due to electronic transitions that yield interfacial electron transfer (IET) (e.g., excitations to the adsorbate LUMO mixed with electronic states in the conduction band, and cross transitions directly into electronic states in the conduction band). For example, the phenyl-acac HOMO-LUMO transition has been shown to yield ultrafast IET into the TiO2 surface. Hence, the inverse design of chromophores based on phenyl-acac as the reference structure is focused on the optimization of photoabsorption by maximization of

f_HL P(λ_HL).

Here, f_HL and P(λ_HL) are computed by using Eqs. (24) and (23), respectively, for p = HOMO and q = LUMO.
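A hedged sketch of this light-harvesting objective is given below: oscillator strengths are weighted by an unnormalized black-body spectral radiance at 5523 K, which is sufficient for ranking candidate structures; the `transitions` interface is illustrative.

```python
import math

def planck_weight(lam_nm, t=5523.0):
    """Relative solar-photon weight P(lambda) from the black-body spectral
    radiance (2hc^2/lambda^5) / (exp(hc/(lambda k T)) - 1), unnormalized."""
    h, c, kb = 6.62607015e-34, 2.99792458e8, 1.380649e-23
    lam = lam_nm * 1e-9
    return (2.0 * h * c**2 / lam**5) / (math.exp(h * c / (lam * kb * t)) - 1.0)

def photoabsorption_target(transitions):
    """O = sum_pq f_pq * P(lambda_pq); `transitions` is a list of
    (oscillator_strength, wavelength_nm) pairs.  Restricting the list to the
    HOMO-LUMO transition gives the f_HL * P(lambda_HL) objective."""
    return sum(f_pq * planck_weight(lam) for f_pq, lam in transitions)
```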
Gradients of molecular properties. The total photoabsorption cross section is maximized with respect to the participation coefficients b_i^A by using standard optimization techniques (e.g., the quasi-Newton BFGS algorithm).80 The gradients of the photoabsorption probability are computed using a finite-increment expression.
Molecular framework. The phenyl-acac linker is particularly attractive for sensitization of metal-oxide surfaces because it binds strongly to nanoporous TiO2 thin-films, even in aqueous solutions and under oxidative conditions.65 However, its photoabsorption spectrum peaks in the UV region. Therefore, it is important to search for structures with a common binding motif to TiO2 but with enhanced photoabsorption of solar light leading to interfacial electron injection.
The TB-LCAP inverse design method is illustrated as applied to the reference molecular framework defined by the phenyl-acac anion (see Fig. 10), a chromophore that absorbs only in the UV region (e.g., at 284 nm in methanol solution). To locate structures with maximal photoabsorption in the visible range (due to HOMO-LUMO transitions), the atom types in the six-membered ring are varied continuously as linear combinations of C, N, O, S and H atoms. The phenyl-acac structure in Fig. 10 is used as the reference framework in the EHTB-LCAP optimization of f_HL P(λ_HL). The group sites marked by ovals are linear combinations of C–H, N–X, O–X and S–X, where the symbol X denotes a “dummy atom” site. The boxed sites allow for mixing of C–H, C=O and C=S functional groups. These variational degrees of freedom result in a total of 144 possible molecules for this simple molecular framework.
Optimum structures. While the nuclear positions are defined by the reference framework and remain fixed during the optimization, TB-LCAP optimization of the objective function with respect to the participation coefficients identifies local optima on the alchemical hypersurface. The participation coefficients corresponding to the alchemical optimum are then rounded off to yield a chemically representable candidate, in this case 3-acac-pyran-2-one. The geometry of the candidate structure is then optimized in a DFT/B3LYP gas-phase calculation, yielding the optimum structure. Subsequently, the TDDFT/B3LYP photoabsorption spectrum of 3-acac-pyran-2-one is compared to the TDDFT/B3LYP photoabsorption spectrum of the phenyl-acac structure that served as the starting point for the optimization. This comparison validates the predictive power of the TB-LCAP inverse design approach.
Smoothness of hypersurfaces. In this case, optimizing f_HL P(λ_HL) means locating maxima on a 14-dimensional hypersurface. The smoothness of this surface is analyzed in Fig. 11, which presents a two-dimensional contour plot of an optimization pathway from the phenyl-acac starting point to 3-acac-pyran-2-one. Sites 1 and 2 vary between C=S and C=O (i.e., b(C=O) + b(C=S) = 1) and sites 3 and 4 vary between O–X and N–X (i.e., b(O–X) + b(N–X) = 1). The data presented in Fig. 11 suggest that f_HL P(λ_HL) varies smoothly with variation of b(N–X) and b(C=O) and that the use of gradients for locating optima is justified. Fig. 11 also indicates small gradients of f_HL P(λ_HL) in the region with b(C=O) > 0.78, where gradient-based local optimization may be challenged; randomized starting points (sets of participation coefficients) that fall into this region are therefore simply rejected. Elsewhere, the surface shows a global trend to increase as b(N–X) decreases from 1 to 0, leading to successful searches. Except for a narrow area defined by b(C=O) = 0.78–0.85 and b(N–X) = 0.62–1.0, where LCAP search paths might become trapped, any randomly initialized TB-LCAP path leads to 3-acac-pyran-2-one (i.e., the point with b(C=O) = 1 and b(O–X) = 1). For example, a gradient search started at the point (b(C=O) = 0.95, b(O–X) = 0.80) (see Fig. 11) moves through the points (0.86, 0.35), (0.86, 0.34) and (0.90, 0.24) to approach (0.84, 0.01), which is very close to the local alchemical optimum at (0.86, 0). Rounding the participation coefficients to 1 and 0 then yields the chemically representable 3-acac-pyran-2-one.
Predicted electron injection property. The time-dependent probability for the electron to remain localized on 3-acac-pyran-2-one at time t after photoexcitation is obtained via quantum electronic dynamics simulations. The results presented in Fig. 12 show that, after excitation to the LUMO of the adsorbed 3-acac-pyran-2-one, the electron is injected into the TiO2 surface in about 10 fs. This ultrafast interfacial electron transfer can be used to induce charge separation via photoexcitation of the covalently attached adsorbate.
3.4 Experimental verification
3-acac-pyran-2-one contains a pyran-2-one ring that is also present in commercially available coumarin-derived dyes.91,93,97 The pyran-2-one ring is fused to a phenyl ring and anchoring groups (–COOH) at positions 5 and 6. C343, NKX-2311 and similar coumarin-COOH derived dyes can be obtained by adding donor/acceptor groups or via extension of the π-conjugated system. This class of dyes is known to exhibit strong absorption bands in the visible range as well as fast interfacial electron injection rates (fs timescale).91,93 The available literature97 on coumarin-derived dyes suggests that the LCAP-identified 3-acac-pyran-2-one could be used as a lead chromophore in practice. Theory predicts that 3-acac-pyran-2-one will exhibit an absorption band in the UV-Vis range and that it will engage in interfacial electron transfer to semiconductor surfaces. To experimentally validate these predictions, 4-acac-coumarin has been synthesized and its behavior analyzed, both in solution and adsorbed to TiO2.
Figure 13 compares the experimental and calculated absorption of the 4-acac-coumarin anion solvated in methanol. One equivalent of NH4OH is added in the experiment to deprotonate 4-acac-coumarin. In semiquantitative agreement with experiment (Fig. 13-b), the theoretical spectrum of the 4-acac-coumarin anion in Fig. 13-a shows an absorption band at 406 nm.
Figure 14 compares the calculated and experimental absorption of the 4-acac-coumarin anion model structure (Fig. 14-a). In the experiment, the 4-acac-coumarin anion is adsorbed onto a thin film of TiO2 (Fig. 14-b). Calculation and experiment agree on the band around 405–410 nm. Gaussian line broadening (FWHM = 60 nm) is used to obtain the theoretical spectrum from the excitation energies and oscillator strengths.
The result of the electron dynamics simulation for the 4-acac-coumarin anion adsorbed onto the TiO2-anatase (101) surface is shown in Fig. 15. The electron density initially localized on the 4-acac-coumarin anion (LUMO) decays rapidly, in about 15 fs, as the electron is injected into the TiO2 conduction band (Fig. 15-c).
4 Conclusions
Inverse molecular design has emerged as an attractive computational approach to take on the challenges in materials discovery. Inverse molecular design aims at searching for optimum points on the hypersurfaces defining property-structure relationships, and mapping out the molecular structures at these points. We present a conceptual formalism for the idea of inverse molecular design when it is implemented as an optimization problem that searches along the molecular property hypersurfaces constructed directly from the Hamiltonian of molecular systems. We review a few state-of-the-art strategies in inverse molecular design and present the basic principles and procedures of the two most commonly used optimization algorithms: genetic algorithms (GA) and Monte Carlo methods.
Historically, discrete optimization algorithms (such as GAs and MC) have been used for inverse molecular design. Here, we bring attention to a newer inverse molecular design strategy based on the linear combination of atomic potentials, developed by Beratan and Yang in 2006.11 The LCAP approach provides a continuous interpolation between discrete molecular structures, enabling us to search for optimum molecules efficiently using deterministic, continuous optimization algorithms. In addition, discrete optimization algorithms have been combined with the LCAP scheme for inverse molecular design. We present the basic principle of LCAP in the framework of density functional theory, as originally proposed.
In particular, we review the development of LCAP in the framework of tight-binding theory (i.e., the TB-LCAP approaches) and outline the progress in applying TB-LCAP approaches for discovering nonlinear optical materials and dye-sensitized solar cells. The TB-LCAP method was first developed in the Hückel tight-binding framework18 and later extended to the extended Hückel tight-binding framework.53 The Hückel TB-LCAP method was used to successfully find globally optimal structures for nonlinear optical materials in several extremely large chemical libraries (with 10² to 10¹⁶ possible molecules). The EHTB-LCAP method successfully predicted a novel lead chromophore linker, 3-acac-pyran-2-one, which could be used as a lead adsorbate anchor in practical applications to dye-sensitized solar cells. Experimental synthesis and spectroscopic characterization confirm that a 3-acac-pyran-2-one derivative yields improved sensitization to solar light and provides robust attachment to TiO2 even under aqueous conditions.
Even though there is ample room for improvement in TB-LCAP approaches for inverse molecular design, such as relaxing the frozen molecular framework during the optimization and improving the smoothness of the TB-LCAP structure-property hypersurfaces, the TB-LCAP methods are already established as promising tools for discovering materials that are realizable in experiments. Due to the low computational cost of tight-binding electronic structure calculations, we envision that the TB-LCAP approaches will be among the most promising inverse molecular design methods for taking on further challenges in materials discovery, such as catalyst design for solar fuel applications.