RESUMO
We present methods to answer two basic questions that arise when benchmarking optimization algorithms. The first one is: which algorithm is the "best" one? and the second one is: which algorithm should I use for my real-world problem? Both are connected and neither is easy to answer. We present a theoretical framework for designing and analyzing the raw data of such benchmark experiments. This represents a first step in answering the aforementioned questions. The 2009 and 2010 BBOB benchmark results are analyzed by means of this framework and we derive insight regarding the answers to the two questions. Furthermore, we discuss how to properly aggregate rankings from algorithm evaluations on individual problems into a consensus, its theoretical background and which common pitfalls should be avoided. Finally, we address the grouping of test problems into sets with similar optimizer rankings and investigate whether these are reflected by already proposed test problem characteristics, finding that this is not always the case.
Assuntos
Algoritmos , Benchmarking/métodos , Modelos Teóricos , Simulação por ComputadorRESUMO
Meta-modeling has become a crucial tool in solving expensive optimization problems. Much of the work in the past has focused on finding a good regression method to model the fitness function. Examples include classical linear regression, splines, neural networks, Kriging and support vector regression. This paper specifically draws attention to the fact that assessing model accuracy is a crucial aspect in the meta-modeling framework. Resampling strategies such as cross-validation, subsampling, bootstrapping, and nested resampling are prominent methods for model validation and are systematically discussed with respect to possible pitfalls, shortcomings, and specific features. A survey of meta-modeling techniques within evolutionary optimization is provided. In addition, practical examples illustrating some of the pitfalls associated with model selection and performance assessment are presented. Finally, recommendations are given for choosing a model validation technique for a particular setting.