===== Romain ===== Contact : ==== Directors and supervisors ==== [[https://www.gerad.ca/Charles.Audet/|Charles Audet]] \\ [[https://www.gerad.ca/Sebastien.Le.Digabel/|Sébastien Le Digabel]]\\ [[groupe-dfo-bbo:acteurs:students:postdoc:miguel-martinez:main-page|Miguel Martinez]] ==== Project : dimension reduction in MADS ==== [[ groupe-dfo-bbo:acteurs:students:master:romain-vanden-bulcke:avancement|Work progress]] [[ groupe-dfo-bbo:acteurs:students:master:romain-vanden-bulcke:main-page|Page en français]] === Context === The general field of this project is derivative free or black box optimization. Derivative free optimization is, as the name suggests, the field of applied mathematics where we are interested in minimizing a function whithout any derivative information, either because it is not available or because such information does not exists. Black box optimization is similar but the objective to minimize is only accessible through a simulation or a code of some sort. The details of this code is not known, hence the name black box. In general, the black boxes are non-smooth, may fail and are expensive to evaluate. The evaluations of the black box form a bottleneck and we must put a limit on the total number of black box evaluations. {{groupe-dfo-bbo:acteurs:students:master:romain-vanden-bulcke:simple_black_box.png?500 |A simple black box illustration}} A naive idea to solve this type of problem would be to approximate the derivatives and apply some sort of gradient methods, but the approximation of derivatives would cost a lot of evaluations and such methods would not be very efficient. Therefore, other methods have been conceived to tackle this type of problems, with a limited budget of evaluations. Among those methods, we are mostly interested in the Mesh Adaptive Direct Search (MADS) class of algorithm. This class of algorithm iteratively evaluates points in the search space while keeping track of the best solution found, the incumbent solution. All the points are evaluated on a structure called the mesh, which controls the distance between the new points to evaluate. More details on this algorithme are available in [[https://epubs.siam.org/doi/abs/10.1137/040603371|this paper]]. The basic algorithm is composed of two steps : a global search step and a local poll step. The search step is quite broad, we can evaluate a finite number of points on the mesh. The poll step is stricter and consists of the evaluations of a certain number of points around the incumbent solution. {{ groupe-dfo-bbo:acteurs:students:master:romain-vanden-bulcke:mads_poll_step.png?1000 |A simple example of successive MADS poll step}} In general, black box optimization methods are able to tackle problems with a few dozen variables. If there are many variables, the search space becomes very large and a method needs a lot of evaluations, and time, to explore it. In this project, we are trying to tackle this kind of problems. The idea behind to project is to identify which variables or combination of variables are the most influencial. This would help us to cencentrate our search effort in a smaller space et get closer to interesting solutions. Since the search step of the algorithm leaves us with some room to invent new search strategy, we propose a search step which is based on principal components analysis (PCA) to reduce the dimension of the problem. === PCA-MADS algorithm === Since the method we proposed is from the MADS class of algorithm and is based on principal components analysis, we named it PAC-MADS. The poll step is quite similar to a standard MADS while the search step is more intersting. == Search step == {{ groupe-dfo-bbo:acteurs:students:master:romain-vanden-bulcke:pcamads_example.png?800 |Illustration of PCA-MADS search step}} To allow us to identify the most ifluential combinations of variables, we need to apply some sort of sensitivity analysis. Since the total budget of evaluations is limited, we would like to use the points which have already been evaluated during previous iterations, which are stored in a cache. To do so, we need to select s certain set of points from the cache and apply PCA. This analysis allows us to identify some combinations of variables with some intersting links with the objective. We can build a new sub-problem with the first //p<