**By R. Belmans**

### Differential evolution

**Differential evolution (DE)** (Stom & Price 107,108) is a rather recent approach for the treatment of real-valued multiobjective optimisation problems. As is typical for stochastic search algorithms, differential evolution does not require any prior knowledge of the variable space, nor of the derivatives of the objective functions towards the design variables. The algorithm is very simple, requiring only two control parameters, and is inherently parallel. Differential evolution is a self-adaptive evolution scheme clearly deviating from the scheme outlined in the previous section.

**Consider a //-dimensional vector of design variables jc:**

In the initial step, a population of sizeof randomly chosen designs is constructed such that the initial population covers the entire parameter space uniformly. Practically, this could be achieved by defining an initial step length of design variationsbeing applied randomly to a given start design

withand the randomly chosenIn the process of the optimisation, DE generates new parameter vectors by adding the weighted difference between a defined number of randomly selected members of the previous population to another member. In its basic strategy, this is the difference of two vectors added to a third:

forwith k the generation index,randomly chosen and mutually different andThe difference term is obviously similar to the step lengthin the previously describedscheme, as well as a finds its equivalence as the adjustment (or learning) factor. The difference is that this step length is not taken from a fitness-selected set of individuals, but rather from a randomly selected individual of the previous population. No explicit step length information has to be maintained. Also, the basis vector x’*’ does not need to be fitness selected. To increase diversity in the population, crossover is introduced, leading to a new parameter vector of the form:

The bracketsdenote the modulo function with modulus N. The index n is a randomly chosen integer from the intervalL defines the number of parameters that are to be exchanged and is taken from the interval _L is chosen in such a way that the probability with the crossover probabilityIn pseudo-code, the determination of L can be expressed as:

**This recombination scheme is different** from the one defined by (7.13) and (7.14). The new parameter vectoris checked for violation of any constraints. If it violates one of the constraints, this parameter vector is rejected and the construction of a new vector is repeated. The selection process now has similarity to a tournament selection process. If this resulting design vector yields a better value of the objective function than its predecessorthe new design replaces the old one in the population. If not, the old vector is retained. This selection scheme is the most distinctive difference from the aSA-evolution strategy introduced in the previous section. Here, fitness is tested against the direct predecessor, whereas theconstructs a fitness selected set of parents from the whole population. Several deviations of this algorithm can be defmed, depending on the choice of the vector to be perturbed, the number and choice of parameter vectors considered for the computation of the difference vector and the crossover method (Stom & Price IC8). A good choice for non-critical technical problems is a strategy that increases the greediness of the algorithm by using the best parameter vector from the previous population:

or applying mutation to the direct predecessor:

The construction of the new population memberinvolves in any case the crossover scheme as outlined in (7.18). The stop criterion for thescheme is the variation of the average step length 8 of the selected parents. Such an average step length has to be constructed explicitly for differential evolution, to serve as a stopping criterion. The L2-norm of the difference vector between each population member and its predecessor is taken.

**The experiments conducted** in the scope of this topic have indicated that the (Df^best/l)-strategy defined in (7.19) is to be favoured for most technical problems. The two remaining strategy parameters can be chosen asThe influence of the population size has been found to be less critical than inschemes. A minimum of should be chosen always ifHowever, problems with up to 25 parameters have been successfully solved with population sizes between 30 and 40. Stom & Price 108 advises one to chooseHere, the authors have found that such a large population is not necessary for the problems classified in section 7.1 . Stom & Price 108 use the algorithm to solve artificial optimisation test problems.

**DE has been applied to the** optimisation test problem (7.5), (7.6) as well. The results are presented in Fig. 7.13. The strategy usingis successful in only 68% of the test runs, while tests withsucceeded in 94% of the cases within 300 trials. Differential evolution is well suited for parallel implementation.

**Both evolution strategies can be used** in cascade coupled with the Hooke and Jeeves algorithm. The optimisation is started using an evolution strategy, until a defined accuracy bound or step size variation is reached. Then the Hooke and Jeeves algorithm is invoked for the fine-tuning of the parameter set, assuming that the evolution strategy had approached the global optimum, allowing a local optimiser to take over. This reduces the overall number of function calls, but might not be permitted in the case of a strongly constrained optimisation problem. .

**Fig, 7.13. Convergence history of a successful test using the (DE/best/l)-strategy with**

## Indirect methods

**While in direct** search methods the optimisation algorithm samples the objective function directly, in indirect methods the optimisation algorithm is applied to an approximation of the real N-dimensional feasible surface. A trial on a fitted surface is computationally comparatively cheap if it replaces, e.g. a finite element analysis.

### Response surface methodology and design of experiments

**The response surface methodology (RSM)** (Box & Draper 7) is such an indirect method. The classical RSM is based on first or second order polynomials. The general form of a second order response surface is:

withthe unknown coefficients that can be found using the method of least squares and N the number of variables (in RSM also called factors). Theare called the interaction coefficients, whereas the are referred to as the coefficients corresponding to the main effects. The immediate advantage of the computation of the coefficients is the information they provide about the significance of individual interactions and main effects. They provide a tool for reducing the dimensionality of the problem by excluding non-significant design parameters from the design optimisation task. This is usually performed using first-order response surfaces. To fit any second-order surface, the number of trials to be computed must be at least equal to the number of unknown coefficients.

**There exists a theory to define the sample points to consider.** This can be found under the name design of experiments (Box & Draper 17, Montgomery 81, Clarke & Kempson 25). Experimental designs should ensure an efficient sampling of the region of interest and provide information on the accuracy of the response in the region. To find an accurate second-order fitting surface, three level designs are required (three values per parameter range). This leads to 3w-samples to be taken for one surface. The research on statistical methods advises a number of samples that are suitable for response surface fitting and require considerably fewer function evaluations than the full 3-level factorials (Box & Draper l7, Clarke & Kempson 25). Furthermore, they give equal precision for the responses at equal distance from the centre of the factor space (rotatability property). Such recommended designs are:

**• Central Composite Design (CCD) (Box & Draper ’7) **

**• Small Composite Design (SCD, special case of CCD) (DraperiA) **

**• Box-Behnken Design (BBD) (Box & Draper l7).**

**Taking the axis experiments a distance ar from the centre of the region ensures rotatability of the design. But ar is determined for CCD as:**

withthe number of runs for theleading for the N= 2 to If the parameters are normalised and bounded in the interval [1,1], this would be outside the feasible region of the optimisation task. In this case, one takes the outlying axis runs to fit within the feasible space (Fig. 7.14).

**Fig. 7.14. Central composite design on normalised parameter space of dimension N-2 consisting of four factorial experiments or runs****, four axis runs**** ****and one centre run**

A CCD is a five-level design, as it requires the coding of the parameter levelsIt should be noticed that a CCD for N=2 (with one centre run) appears to be simply a rotation of the full factorial design (Fig. 7.15a). However, going to higher dimensions, the advantage ofexperiments becomes obvious.

**In the case of the objective function** defined by eqns.(7.5) and (7.6), Fig. 7.15 and Fig. 7.16 represent the second order response surface obtained by fitting CCDand fulldesigns. The CCD is

constructed by taking thefactorial designs (also called runs), 2N axial or star runs andcentre runs (Fig. 7.14).

**The Box-Behnken designs are formed by combining** two-level factorials with incomplete block designs (Box Draper l7). In the two-dimensional case, this would reduce to a two-level factorial with one added centre point, requiring only 5 runs. However, 6 coefficients have to be determined, leaving the least squares problem underdefined. Box-Behnken designs are very effective from 3 factors upwards.

**Fig. 7.15. Second order response surface using a central composite design (CCD). Part a) shows a contour plot of the response surface and indicates the positions at which samples are taken, while b) visualises the agreement between response surface and original curve.**

**As the response surface is constructed by a second order polynomial, an optimisation algorithm can be derived by applying a four-step scheme:**

**step 1:** Construct a second order response surface using designed experiments.

**step 2:** Use a gradient based algorithm to find the minimum of the response surface.

**step 3:** Contract the active design space by a definedfactor around the new optimum.

**step 4:** Stop if stopping criterion fulfilled or go back to step I._I

**Fig. 7.16. Second order response surface based on a full 3-leveI factorial. Part a) shows a contour plot of the response surface and indicates the positions at which samples are taken, while b) visualises the agreement between response surface and original curve.**

**Examination of Fig. 7.15 and Fig. 7.16** reveals the problem of the response surface methodology: the actual fit of the response surface can be veiy poor. In our case, the minima of both response surfaces are closer to the global one than the initial centre point. However, this cannot be guaranteed, if multiple minima reside in different sized regions of the feasible domain. Brandiski 19,20 has examined the usefulness of different approaches for multiobjective optimisation in the design of small permanent magnet machines using FE-analysis, such as:

**• Select one main objective** and reformulate the other objectives as constraints.

**• Apply an Euclidean distance function** to minimise the distance of the global optimum to all single objective optima (this is comparable to the idea of minimising the sum of the single normalised objectives).

**• Apply a desirability function to each single** objective (normalising the response to values of the interval [0,1]) and apply a geometric mean to achieve a single scalar value.

**Brandiski concludes that both last-mentioned** methods are useful, with advantages of the approach using desirability functions. The concept of desirability functions shows similarities to applying the fuzzy set theory to formulate a multiobjective function. These approaches do not solve the inherent weakness of the response surface methodology regarding multiminima functions.

**As a conclusion**, the response surface methodology using first or second order polynomials has its advantages if the objective function surface features a low curvature, probably having a single optimum only.

**Its success in finding the global optimum** depends on the choice of the limits of the search region and its contraction scheme. It is interesting, however, to examine the main effects of the variables and their interactions, which can be used to reduce the dimension of the optimisation problem by excluding less significant design variables or to detect unwanted dependencies.