groupe-dfo-bbo:acteurs:students:master:romain-vanden-bulcke:english [Groupe de recherche en DFO-BBO]

Romain

Context

The general field of this project is derivative free or black box optimization. Derivative free optimization is, as the name suggests, the field of applied mathematics where we are interested in minimizing a function whithout any derivative information, either because it is not available or because such information does not exists. Black box optimization is similar but the objective to minimize is only accessible through a simulation or a code of some sort. The details of this code is not known, hence the name black box. In general, the black boxes are non-smooth, may fail and are expensive to evaluate. The evaluations of the black box form a bottleneck and we must put a limit on the total number of black box evaluations.

A naive idea to solve this type of problem would be to approximate the derivatives and apply some sort of gradient methods, but the approximation of derivatives would cost a lot of evaluations and such methods would not be very efficient. Therefore, other methods have been conceived to tackle this type of problems, with a limited budget of evaluations. Among those methods, we are mostly interested in the Mesh Adaptive Direct Search (MADS) class of algorithm.

This class of algorithm iteratively evaluates points in the search space while keeping track of the best solution found, the incumbent solution. All the points are evaluated on a structure called the mesh, which controls the distance between the new points to evaluate. More details on this algorithme are available in this paper. The basic algorithm is composed of two steps : a global search step and a local poll step. The search step is quite broad, we can evaluate a finite number of points on the mesh. The poll step is stricter and consists of the evaluations of a certain number of points around the incumbent solution.

In general, black box optimization methods are able to tackle problems with a few dozen variables. If there are many variables, the search space becomes very large and a method needs a lot of evaluations, and time, to explore it.

In this project, we are trying to tackle this kind of problems. The idea behind to project is to identify which variables or combination of variables are the most influencial. This would help us to cencentrate our search effort in a smaller space et get closer to interesting solutions. Since the search step of the algorithm leaves us with some room to invent new search strategy, we propose a search step which is based on principal components analysis (PCA) to reduce the dimension of the problem.

PCA-MADS algorithm

Since the method we proposed is from the MADS class of algorithm and is based on principal components analysis, we named it PAC-MADS. The poll step is quite similar to a standard MADS while the search step is more intersting.

Search step

To allow us to identify the most ifluential combinations of variables, we need to apply some sort of sensitivity analysis. Since the total budget of evaluations is limited, we would like to use the points which have already been evaluated during previous iterations, which are stored in a cache. To do so, we need to select s certain set of points from the cache and apply PCA. This analysis allows us to identify some combinations of variables with some intersting links with the objective. We can build a new sub-problem with the first p«n combinations. This sub-problem is therefore of dimension p instead of the original n. Then this problem is minimized using a new instance of MADS.

We are able to evaluate a point of dimension p by reconstructing a point in dimension n using the principal components. The point is then projected onto the mesh in n dimension.

Tests et results

For the first tests, we apply our algorithm on a simple problem, minimizing the Rosenbrock function. We compare two similar implementations of a MADS algorithm without any search step and a PCA-MADS algorithm. The performance of these two methods are compared using performance and data profiles.

On problem of small size, the standard MADS seems to have better performances than PCA-MADS. On larger problems, it is the other way around.

Now we note that there are sevaral parameters which might have an influence on the performances of PCA-MADS. To be able to study and recommend interesting values for these parameters, we need to compare them on a more diverse set of problems. That is why we use a platform called coco and the suite of functions bbob (largescale) for the next numerical tests.

Work progress

Here is a list of the next step of the project :

Comparison and influence of parameters (initial points evaluation,number of poll points, construction and evaluation of the sub problem)
Validation of the parameters
Comparison with other known methods
Comparison with other known methods on real life black box