DEPP - Differential Evolution Parallel Program

Optimization is a mathematical problem often found in science and engineering. Currently, however, there is no general method to solve it. Solutions are generally sought by two iterative approaches: (a) quasi-Newton methods (Griva, Nash, & Sofer, 2009) and (b) heuristic methods (Coley, 1999; Feoktistov, 2006). Each has advantages depending on the problem to be optimized. Quasi-Newton methods generally converge faster than heuristic methods provided the function to be optimized (the objective function) is smooth. Heuristic methods, on the other hand, are better suited to noisy objective functions, can handle failures in the calculation of the objective function, and are less likely to become trapped in local optima than quasi-Newton methods.


Software description
Software Architecture
The DEPP source code was written following the Fortran 2008 language standard (Brainerd, 2015); it uses Message Passing Interface (MPI) directives (Message Passing Interface Forum, 2009) for parallel processing and takes advantage of the Object-Oriented Paradigm.
Thanks to Object-Oriented Programming, the source code was designed to simplify the implementation of new methods. The algorithm works on abstract classes separated, basically, into (i) population initializers, (ii) search strategies, (iii) fitness calculation and (iv) stop conditions. The concrete classes are generated using the Factory Design Pattern (Freeman et al., 2004). All MPI commands are encapsulated in dedicated classes, following the logic of the Adapter Design Pattern. In this way, users interested in implementing new methods need not be concerned with MPI details.
DEPP treats the fitness function as a black box that must be provided by an external program (EP). This approach broadens the range of application of DEPP, since in many cases the fitness function cannot be written as a DEPP subroutine.
Communication between DEPP and the external program is performed through disk files. DEPP calls the EP by command line, passing it the name of a file as an argument. This file contains the name of another file (where the external program must save the fitness), the dimension of the trial solution vector X and the components X1, X2, …, of X, e.g.,

    <path/filename> = arqfit: name of the file to save fitness
    <value>         = nu: number of unknowns (dim. of the trial vector X)
    <value of X1>
    ...
    <value of Xnu>
The external program, in turn, must calculate the fitness and save it, together with an exit status code (explained below), in the file indicated by DEPP. This file reads as

    <value> = Fitness
    <value> = Exit status code

The basic algorithm of DEPP is illustrated in Figure 1. DEPP reads the input data from a configuration file, whose name is passed to it as a command-line argument. The stop condition is then analyzed. In the first iteration, the stop condition is generally not satisfied (this may not hold when starting from a backup). When the stop condition is satisfied, the iterative procedure is finished and the output data is saved in the depp_output directory. Otherwise, a trial population is generated. DEPP sends the information to a set of threads (T1, ..., TN) using MPI. Each of these threads then runs a copy of the external program (EP) for a given individual of the population. After that, DEPP receives the calculated objective function values and compares their fitness with the fitness of the current population. Only the best individuals are kept. A backup is optionally generated and the iterative cycle is restarted.

During the calculation of the objective function, failures may occur. DEPP handles failures through an exit status code returned by the external program. If the exit code is 0, the calculations were performed correctly; if 1 is returned, a failure occurred and the trial individual is discarded; if 2 is returned, a failure occurred, the trial individual is discarded and a new one is generated. All failures are registered for posterior analysis by the user.

The folder structure of DEPP is shown in Figure 2. The directory src contains the DEPP source code. After compilation, the executable depp.x is moved to the bin directory. The test folder contains Bash scripts (Ramey & Fox, 2016).
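As a concrete illustration of this protocol, the sketch below shows a minimal external program in Python that parses the input file written by DEPP and writes back the fitness file in the format above. The objective function (the sphere) is a hypothetical stand-in; a real EP would typically run a simulation instead.

```python
import sys

def parse_input(path):
    """Parse the DEPP input file: each line reads '<value> = <label>'."""
    with open(path) as f:
        values = [line.split("=")[0].strip() for line in f if line.strip()]
    arqfit = values[0]                        # file where the fitness must be saved
    nu = int(values[1])                       # number of unknowns
    x = [float(v) for v in values[2:2 + nu]]  # components of the trial vector X
    return arqfit, x

def fitness(x):
    # Hypothetical objective: the sphere function, maximized at the origin.
    return -sum(xi * xi for xi in x)

def run(input_path):
    arqfit, x = parse_input(input_path)
    try:
        value, code = fitness(x), 0   # 0: calculation performed correctly
    except Exception:
        value, code = 0.0, 1          # 1: failure, trial individual is discarded
    with open(arqfit, "w") as out:
        out.write(f"{value} = Fitness\n")
        out.write(f"{code} = Exit status code\n")

if __name__ == "__main__":
    run(sys.argv[1])                  # DEPP passes the input file name as argument
```

Since DEPP only exchanges plain files and an exit code, the EP may be written in any language.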

Supporting theory
The following sections describe the optimization theory behind the current implementation of DEPP. It must be emphasized that DEPP was designed based on the Object-Oriented Paradigm, in such a way as to simplify the implementation of new and better methods.

Differential Evolution
The Differential Evolution algorithm is based on the principles of biological evolution. Basically, it works as follows. Given a population (set of discrete points in the domain of search), new individuals (trial points) are generated by "mutation" and "crossing over". Here, "mutation" means a rule for creating new trial individuals while "crossing over" means a rule of exchanging information between a trial individual and an individual of the population. The fitness (objective function) of the new individuals is compared to the fitness of the parent individuals and the worse individuals are eliminated. The procedure is repeated for several generations (iterations) until a criterion of convergence is satisfied within a given tolerance.
The current version of DEPP focuses on a particular kind of maximization problem, often found in practical applications, i.e.,

    maximize f(x), subject to L ≤ x ≤ U,

where f is a real-valued function, also known as the objective function, depending on a D-dimensional vector x of real variables, and L and U are the lower and upper bounds of the domain of search. The first step toward solving this optimization problem is the population initialization. For the first generation (g = 1), a population (set of points) is generated as

    x_ij = L_j + R (U_j − L_j),  1 ≤ i ≤ N_p,  1 ≤ j ≤ D,

where R is a random number uniformly distributed in [0, 1), x_ij is the j-th component of the i-th individual x_i and N_p is the population size.
After the population initialization, the fitness of each individual x_i is calculated.
For the next generations, trial individuals are calculated based on the population of the previous generation. Trial individuals are created by a process of mutation and crossing over, as follows.
For each individual x_i of the previous population, three other distinct individuals (x_i1, x_i2 and x_i3) of the same population are randomly selected and a mutant individual v is created as

    v = x_i1 + F (x_i2 − x_i3),

where F is the differentiation parameter. Then, the mutant individual and x_i cross over, producing the trial individual u. The components u_j (1 ≤ j ≤ D) of u are given by

    u_j = v_j,   if R ≤ C_r or j = j*,
    u_j = x_ij,  otherwise.

In the previous equation, C_r is the crossover parameter, R is a random number drawn uniformly from [0, 1) for each component and j* is an integer randomly chosen between 1 and D, which ensures that at least one component of u is taken from v. If the trial individual violates the constraints, i.e. if it is not within L ≤ u ≤ U, then the process of generating a trial individual is repeated.
The fitness of the trial individual is compared to the fitness of x_i, and the better of the two is kept in the next generation. New generations are created until a stopping condition is satisfied. DEPP implements a composition of the following stopping conditions: (i) best-fitness stagnation, (ii) maximum number of generations and (iii) a population convergence measure. In the first one, the generation cycle is interrupted if no improvement of the fitness function is achieved within N_A generations. The second condition establishes the maximum number of generations allowed. Finally, in the third condition, the cycle is interrupted if the P-measure P_m (Feoktistov, 2006) is lower than or equal to a given tolerance. The P-measure is evaluated as

    P_m = max_{1 ≤ i ≤ N_p} |x'_i − ⟨x'⟩|,

where |·| is the L2 norm, ⟨x'⟩ is the mean of the normalized population and the components of x'_i are given by

    x'_ij = (x_ij − L_j) / (U_j − L_j).

The approach presented in this section has three parameters: the population size N_p, the differentiation parameter F and the crossover parameter C_r. Feoktistov (2006) gives reference values/ranges for these parameters based on optimization studies performed on many test functions.
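The generation cycle described above (initialization, rand/1 mutation, binomial crossover, greedy selection and the P-measure stop condition) can be sketched as follows. This is a minimal Python illustration, not DEPP's Fortran implementation; in particular, out-of-range trial individuals are clipped here for brevity, whereas DEPP regenerates them.

```python
import random

def differential_evolution(f, L, U, Np=20, F=0.85, Cr=0.5,
                           max_gen=1000, p_tol=5e-4):
    """Maximize f on the box [L, U] with the DE scheme described above."""
    D = len(L)
    # Population initialization: x_ij = L_j + R (U_j - L_j)
    pop = [[L[j] + random.random() * (U[j] - L[j]) for j in range(D)]
           for _ in range(Np)]
    fit = [f(x) for x in pop]

    for _ in range(max_gen):
        for i in range(Np):
            # Mutation: three distinct individuals, all different from x_i
            i1, i2, i3 = random.sample([k for k in range(Np) if k != i], 3)
            v = [pop[i1][j] + F * (pop[i2][j] - pop[i3][j]) for j in range(D)]
            # Binomial crossover; component j* is always taken from v
            jstar = random.randrange(D)
            u = [v[j] if (random.random() <= Cr or j == jstar) else pop[i][j]
                 for j in range(D)]
            # Constraint handling (simplified: clip instead of resampling)
            u = [min(max(u[j], L[j]), U[j]) for j in range(D)]
            # Greedy selection: keep the better of u and x_i
            fu = f(u)
            if fu >= fit[i]:
                pop[i], fit[i] = u, fu
        # P-measure stop condition on the normalized population
        xn = [[(x[j] - L[j]) / (U[j] - L[j]) for j in range(D)] for x in pop]
        mean = [sum(x[j] for x in xn) / Np for j in range(D)]
        pm = max(sum((x[j] - mean[j]) ** 2 for j in range(D)) ** 0.5
                 for x in xn)
        if pm <= p_tol:
            break
    best = max(range(Np), key=lambda i: fit[i])
    return pop[best], fit[best]
```

For example, maximizing f(x) = −|x|² on [−5, 5]² drives the population toward the origin within a few dozen generations.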

Differential Evolution and Response Surface Methodology
In order to accelerate the convergence of DE, Vincenzi & Savoia (2015) proposed a new mutation operation based on the Response Surface Methodology (RSM) (Khuri & Cornell, 1996; Myers et al., 2009). The basic idea consists of fitting a simple response surface to a selected set of individuals of the population. The mutant vector is then obtained by maximizing the response surface, which is done through matrix inversion. For a given benchmark, the authors showed that the number of generations necessary to reach a given precision was substantially reduced (by more than 50%).
Vincenzi and Savoia considered two response surfaces. The first one is a quadratic polynomial P(x), given by

    P(x) = β_0 + Σ_{i=1..D} β_i x_i + Σ_{i=1..D} Σ_{j=i..D} β_ij x_i x_j,

where the β are coefficients to be fitted. The quadratic polynomial has N_fm = (D + 1)(D + 2)/2 coefficients and so requires at least the same number of points to be fitted. Note that the number of points grows significantly as the number of unknowns D increases. That is why Vincenzi and Savoia proposed a second model.
The second model is an "Incomplete Quadratic Model", in which the mixed terms β_ij (i ≠ j) of the previous equation are neglected. In this case, the minimum number of fitting points is reduced to N_fm = 2D + 1. On the other hand, the accuracy of the model is also reduced.
The DE-RSM hybridization implemented in DEPP was inspired by the work of Vincenzi and Savoia, but with some modifications, which are explained below.
DEPP stores a history list, i.e. a list of all individuals calculated since the first generation and their corresponding fitness. This list is employed to generate mutant vectors by RSM hybridization and may be used by other methods to be implemented in the future.
According to the DE algorithm, for each individual x_i of the population, a trial individual u is created, resulting from a mutation and a crossing over. When the RSM hybridization is active, the mutant vector may be created by the DE procedure explained in the previous section or by hybridization with RSM. The following conditions must be satisfied simultaneously for applying RSM:
• The number of individuals in the history list must be at least twice the number of individuals N_f required to fit the response surface. Notice that N_fm ≤ N_f;
• A random number R (within [0, 1)) must be less than the probability of hybridization f_h.
The probability of hybridization f_h may be prescribed by the user or may be calculated dynamically during the optimization. The dynamic calculation of f_h considers the probability of success f_s when applying RSM. More precisely,

    f_h = f_hmin + (f_hmax − f_hmin) f_s,

where f_hmin and f_hmax are the minimum and maximum allowed values of f_h, respectively.
The probability of success of RSM, f_s, is the ratio of the number of times that the hybridization contributed to maximizing the objective function to the last N_p times RSM hybridization was applied. In the first generations, when the number of samples is less than N_p, f_s is not available. In this case, f_h is set to f_h0, an initial value of the fraction of hybridization.
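This bookkeeping can be sketched as follows, assuming the linear interpolation rule f_h = f_hmin + (f_hmax − f_hmin) f_s; the class and method names are hypothetical, not part of DEPP.

```python
from collections import deque

class HybridizationController:
    """Dynamic hybridization probability f_h, driven by the success
    rate f_s of the last Np applications of RSM (assumed linear rule)."""

    def __init__(self, Np, fh0=0.35, fhmin=0.1, fhmax=0.9):
        self.Np, self.fh0 = Np, fh0
        self.fhmin, self.fhmax = fhmin, fhmax
        # 1 = RSM improved the objective function, 0 = it did not
        self.outcomes = deque(maxlen=Np)

    def record(self, success):
        self.outcomes.append(1 if success else 0)

    def fh(self):
        if len(self.outcomes) < self.Np:
            return self.fh0                   # f_s not yet available
        fs = sum(self.outcomes) / self.Np     # probability of success f_s
        return self.fhmin + (self.fhmax - self.fhmin) * fs
```

With two successes out of the last four applications (and Np = 4), f_h = 0.1 + 0.8 · 0.5 = 0.5.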
Suppose now that a mutant vector, associated with individual x_i, is to be calculated by DE-RSM hybridization. In order to fit a response surface, it is necessary to select a set of individuals. This selection is made in two steps. First, the history list is ordered from the best individual to the worst one and the i-th best individual x̂ is taken (the target individual). Second, the history list is reordered from the closest individual to the target individual to the furthest one. From this list, N_f − 1 individuals are selected according to the following rules, starting from the closest individual:
• The distance between x̂ and the candidate individual x_k must satisfy

    |x̂ − x_k| ≥ η_tol |U − L|,

where η_tol is a prescribed tolerance. In other words, there is a minimum distance between the target individual and its neighbors.
• The candidate individual x_k is selected with a probability of 50%.
Once the fitting individuals are selected, the coefficients of the response surface are determined using the weighted least squares method (Khuri & Cornell, 1996).
Two weighting functions are implemented in DEPP: uniform and exponential (Vincenzi & Savoia, 2015). The exponential weighting function reads as

    w_k = exp(f(x_k) − f_best),

where x_k is any of the individuals selected for fitting and f_best is the best fitness among the individuals selected for fitting.
Finally, the mutant vector is obtained by maximizing the response surface.
If the application of RSM fails or the trial individual is out of range, then a pure DE individual is created.
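The fitting-and-maximization step can be sketched for the incomplete quadratic model, whose stationary point is available in closed form (x_i = −β_i / (2 β_ii)). This is an illustrative weighted least-squares fit using NumPy, not DEPP's implementation; returning None when the fitted surface is not concave mirrors the fallback to a pure DE individual.

```python
import numpy as np

def fit_incomplete_quadratic(X, f, w=None):
    """Weighted least-squares fit of P(x) = b0 + sum_i b_i x_i + sum_i c_ii x_i^2.

    X: (n, D) fitting individuals, f: (n,) fitness values,
    w: optional (n,) weights (uniform if None)."""
    X = np.asarray(X, dtype=float)
    n, D = X.shape
    A = np.hstack([np.ones((n, 1)), X, X ** 2])   # design matrix, 2D + 1 columns
    W = np.ones(n) if w is None else np.asarray(w, dtype=float)
    Aw = A * W[:, None]
    # Weighted normal equations: (A^T W A) beta = A^T W f
    beta, *_ = np.linalg.lstsq(Aw.T @ A, Aw.T @ np.asarray(f, float), rcond=None)
    return beta

def maximize_surface(beta, D):
    """Stationary point of the fitted surface: x_i = -b_i / (2 c_ii)."""
    b, c = beta[1:1 + D], beta[1 + D:]
    if np.any(c >= 0):
        return None   # not concave: fall back to a pure DE mutant vector
    return -b / (2.0 * c)
```

Fitting exact data from a concave incomplete quadratic recovers its maximizer up to round-off.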

Examples
Four test functions, whose maximizers are known analytically, were chosen to illustrate the application of DEPP (since DEPP maximizes, the usual minimization forms are negated):
• Step function (Zhang & Sanderson, 2009): f(x) = −Σ_{i=1..D} ⌊x_i + 0.5⌋²
• Rosenbrock function (Feoktistov, 2006): f(x) = −Σ_{i=1..D−1} [100 (x_{i+1} − x_i²)² + (x_i − 1)²]
• Noisy Quartic function (Zhang & Sanderson, 2009): f(x) = −(Σ_{i=1..D} i x_i⁴ + R), where R is a uniform random number belonging to [0, 1)
• Schwefel 2.26 function (Zhang & Sanderson, 2009): f(x) = Σ_{i=1..D} x_i sin(√|x_i|)

These functions were chosen because they present particular features that make optimization difficult. The Step function (Fig. 3), a set of flat surfaces, poses a hard problem to optimizers relying on function derivatives. This is even worse for the Noisy Quartic function (Fig. 4), which has a noisy plateau near its maximum. Rosenbrock's function (Fig. 5), although smooth, has a sharp narrow ridge around a parabola, which makes it a challenge to find an appropriate search direction. Finally, Schwefel 2.26 (Fig. 6) has several local optima, which may lead to premature convergence.
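These benchmarks, written for maximization as used by DEPP (standard forms from the DE literature, negated where necessary), can be expressed as follows; they are consistent with the maximizers listed in Table 1.

```python
import math, random

def step(x):
    # Flat plateaus; maximum 0 on the cell containing the listed maximizer 0.5
    return -sum(math.floor(xi + 0.5) ** 2 for xi in x)

def rosenbrock(x):
    # Smooth, with a sharp narrow ridge; maximum 0 at x_i = 1
    return -sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
                for i in range(len(x) - 1))

def noisy_quartic(x):
    # Additive uniform noise R in [0, 1); noisy plateau near the maximum at 0
    return -(sum((i + 1) * xi ** 4 for i, xi in enumerate(x)) + random.random())

def schwefel_226(x):
    # Many local optima; maximum at x_i = 420.968597844358
    return sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)
```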
The maximizers x*_i and the domains of optimization of the test functions are presented in Table 1.
    Function         L_j      U_j      x*_j
    Step             -100     100      0.5
    Rosenbrock       -2       2        1
    Noisy Quartic    -1.28    1.28     0
    Schwefel 2.26    -500     500      420.968597844358

Additionally, this example aims (i) to show the effect of the DE-Response Surface (DE-RSM) coupling on reducing the number of iterations required to achieve convergence and (ii) to show the effect of parallelization on reducing the optimization time. To accomplish the first task, the four test functions were optimized for D ∈ {2, 4, 8} dimensions using DE and DE-RSM. For the second task, the number of dimensions was restricted to D = 2, but a delay of 1 s was added to each call of the external program that calculates the objective function, in order to simulate a computationally expensive program. Since DE is a stochastic method, in both tasks the final results are the average of 50 optimizations.
The parameters applied in the optimization are as follows. The population size is N_p = 20 for D = 2 and N_p = 40 for D = 4 and D = 8. For pure DE, F = 0.85 and C_r = 0.5. Hybridization uses the quadratic polynomial as the response surface. The fraction of hybridization is calculated dynamically using f_h0 = 0.35, f_hmin = 0.1 and f_hmax = 0.9. The crossover parameter of RSM is C_r = 1. The number of individuals selected for fitting the response surface is twice the minimum required, i.e. N_f = 2 N_fm. The tolerance for selecting individuals for fitting is η_tol = 0.0001. Uniform weighting is applied for fitting. Concerning the stopping conditions, the maximum number of generations allowed is 5000, the maximum number of generations allowed without improving the fitness function is 40 for D = 2 and 80 for D ∈ {4, 8}, and the tolerance of the P-measure is P_tol = 5 × 10⁻⁴ (stop when P_m ≤ P_tol).
For the test functions considered, numerical results show that the DE-RSM coupling reduces the number of generations necessary to find the global maximizer. Table 2 compares the mean number of generations G necessary to reach the stopping condition and the probability of success P_s in achieving the global maximum. The numbers in parentheses are the standard deviations σ. The poor performance of DE in finding the global maximum of Rosenbrock's function is due to the stopping condition: the number of generations without improving the fitness function was exceeded and the optimization was interrupted. An optimization run was considered a success if its maximizer x* agreed with the analytical global maximizer x*_a to within a prescribed tolerance F_tol. Finally, Tab. 3 presents the mean computational time per generation τ for different numbers of threads N_t. As can be seen, τ_Nt is approximately proportional to τ_1/N_t, i.e., the speed-up is nearly linear.