Guide / Tutorial
This tutorial is based on the single objective function usage of BRKGA. However, almost all information also applies to the multi-objective mode with minimal changes, as explained in Section Using BRKGA-MP on multi-objective mode.
If you desire to understand the deeps of the multi-parenting and the implicity path relink, read [ATGR21].
Installation
BRKGA-MP-IPR is a header-only framework, which means that you only need to download the header files and tell your compiler to include the path to where the files were downloaded.
Quick example (unix): assume we are in an empty folder. So, we copy/clone BRKGA-IPR-MP first:
$ git clone --depth 1 https://github.com/ceandrade/brkga_mp_ipr_cpp
Cloning into 'brkga_mp_ipr_cpp'...
remote: Enumerating objects: 283, done.
remote: Counting objects: 100% (283/283), done.
remote: Compressing objects: 100% (222/222), done.
remote: Total 283 (delta 86), reused 180 (delta 55), pack-reused 0
Receiving objects: 100% (283/283), 6.77 MiB | 5.88 MiB/s, done.
Resolving deltas: 100% (86/86), done.
Let’s write a test.cpp
with the following code:
1#include "brkga_mp_ipr.hpp"
2#include <iostream>
3
4int main() {
5 std::cout << "Testing sense: " << BRKGA::Sense::MINIMIZE;
6 return 0;
7}
Then, let’s compile with GCC and see it works:
$ g++ --version
g++ (MacPorts gcc12 12.3.0_0+stdlib_flag) 12.3.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ -std=c++20 -Ibrkga_mp_ipr_cpp/brkga_mp_ipr test.cpp -o test
$ ./test
Testing sense: MINIMIZE
With Clang (note that we have to use the flag
-fopenmp
, otherwise we get compilation errors):
$ clang++ --version
clang version 16.0.6
Target: x86_64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-16/bin
$ clang++ -std=c++20 -fopenmp -Ibrkga_mp_ipr_cpp/brkga_mp_ipr test.cpp -o test
$ ./test
Testing sense: MINIMIZE
Note the Git clones the whole repository that contains the library code, documents, and examples. All the examples were built using GNU/Make, and GCC toolchain or LLVM/Clang toolchain. However, the code is standard C++, and we can quickly adapt it to other toolchains such as Microsoft, or Intel toolchains.
TL;DR - Single objective
The best way to keep it short is to look in the on examples on the git repo.
Let’s start solving the traditional single-objective Traveling Salesman
Problem (TSP).
First, we must ensure that
BRKGA::fitness_t
has the right single-object type. Let’s
take a look at the trimmed version of
fitness_type.hpp.
This is a trimmed copy:
1namespace BRKGA {
2
3using fitness_t = double;
4
5//...
6} // end namespace BRKGA_MP_IPR
Here,
BRKGA::fitness_t
defines the type of the objective function value. In the vast
majority of the cases, double
suffices. Let’s take a look into the main call
main_minimal.cpp.
This is a trimmed copy:
1int main(int argc, char* argv[]) {
2 if(argc < 4) {
3 cerr
4 << "Usage: " << argv[0]
5 << " <seed> <config-file> <maximum-running-time>"
6 << " <tsp-instance-file>"
7 << endl;
8 return 1;
9 }
10
11 try {
12 // Read command-line arguments and the instance
13 const unsigned seed = stoi(argv[1]);
14 const string config_file = argv[2];
15 const string instance_file = argv[4];
16 const unsigned num_threads = 4;
17
18 cout << "Reading data..." << endl;
19 auto instance = TSP_Instance(instance_file);
20
21 // Read algorithm parameters
22 cout << "Reading parameters..." << endl;
23
24 auto [brkga_params, control_params] =
25 BRKGA::readConfiguration(config_file);
26
27 // Overwrite the maximum time from the config file.
28 control_params.maximum_running_time = chrono::seconds {stoi(argv[3])};
29
30 // Build the BRKGA data structure
31 cout << "Building BRKGA data and initializing..." << endl;
32
33 TSP_Decoder decoder(instance);
34
35 BRKGA::BRKGA_MP_IPR<TSP_Decoder> algorithm(
36 decoder, BRKGA::Sense::MINIMIZE, seed,
37 instance.num_nodes, brkga_params, num_threads
38 );
39
40 // Find good solutions / evolve
41 cout << "Running for " << control_params.maximum_running_time << "..."
42 << endl;
43
44 const auto final_status = algorithm.run(control_params, &cout);
45
46 cout
47 << "\nAlgorithm status: " << final_status
48 << "\n\nBest cost: " << final_status.best_fitness
49 << endl;
50 }
51 catch(exception& e) {
52 cerr
53 << "\n" << string(40, '*') << "\n"
54 << "Exception Occurred: " << e.what()
55 << "\n" << string(40, '*')
56 << endl;
57 return 1;
58 }
59 return 0;
60}
You can identify the following basic steps:
Create a data structure to hold your input data. This object should be passed to the decoder object/functor (example tsp/tsp_instance.hpp);
Certify that
BRKGA::fitness_t
has the correct type;Implement a decoder object/functor. This function translates a chromosome (array of numbers in the interval [0, 1)) to a solution for your problem. The decoder must return the solution value or cost to be used as fitness by BRKGA (example decoders/tsp_decoder.hpp);
Load the instance and other relevant data;
Read the algorithm parameters using
BRKGA::readConfiguration()
; or createBRKGA::BrkgaParams
andBRKGA::ControlParams
objects by hand;Create an
BRKGA::BRKGA_MP_IPR
algorithm object;Call
BRKGA::BRKGA_MP_IPR::run()
to optimize;Check the resulting
BRKGA::AlgorithmStatus
for optimization information.
main_minimal.cpp provides a very minimal example to understand the necessary steps to use the BRKGA-MP-IPR framework. However, main_complete.cpp provides a full-featured code, handy for scientific use, such as experimentation and paper writing. This code allows fine-grained control of the optimization, shows several features of BRKGA-MP-IPR such as the resets, chromosome injection, and others. It also logs all optimization steps, creating outputs easy to be parsed. You should use this code for serious business and experimentation.
These are the basic steps, but I do recommend the reading of this guide.
TL;DR - Multi objective
Warning
Remember, BRKGA-MP-IPR multi-objective mode produces lexicographical dominated solutions but no non-dominated solutions (Pareto frontier). Please, see the details in the introduction.
To use BRKGA-MP-IPR in the multi-objective mode, we first must set
BRKGA::fitness_t
according to the number of objectives we want. In the
repo example,
we consider the TSP with two objectives: first, we must minimize the total tour
length, and second, the size of the largest edge in the tour. For that, we must
change the file
fitness_type.hpp
to reflect such a thing. In this example, we use the standard
std::tuple.
1namespace BRKGA {
2
3using fitness_t = std::tuple<double, double>;
4
5//...
6} // end namespace BRKGA_MP_IPR
In this case, the first component of the tuple holds the tour length, and the second contains the largest edge. On Section Using BRKGA-MP on multi-objective mode, we talk with more details about multi-objective problems. Just keep in mind, although you could use any type for your fitness_t, you should prefer to use std::tuple.
The remaining code is almost identical to the single-objective. The only differences are in computing the largest edge, and printing such information on the main call. All the steps described briefly in the previous section are also used here.
Getting started
BRKGA-MP-IPR is pretty simple, and you must provide one required decoder object to translate chromosomes to solutions. In general, such decoder uses the problem information to map a vector of real numbers in the interval [0, 1) to a (valid) solution. In some cases, even though a valid solution cannot be found, library users apply penalization factors and push the BRKGA to find valid solutions.
Before you go further, please take a look at the examples
folder in the
git repo.
Until version 2.0, we maintained a folder with code to solve
combinatorial auction problems [ATRM15].
But due to maintanance burden,
we drop the that code from the repo. However, the classical
Traveling Salesman Problem (TSP) was kept.
In the TSP, we have a set of cities and the distances between them (full
weighted undirect graph). One must find a minimum-cost tour among all cities,
such that each city is visited only once (i.e., find a Hamiltonian cycle of
minimum cost). The folder has the following structure:
src
subdir: contains all the code;instances
subdir: folder containing some TSP instances for testing;
The src
subdir contains all the code to solve TSP both for single and
multi-objective. This is its structure:
tsp
subdir: contains code to load and build data for TSP;decoders
subdir: contains the TSP decoder;heuristics
subdir: contains a simple heuristic that computes a greedy tour;main_minimal.cpp
file: minimal code useful to understand and test the framework. You should start here! Please take a look on this file before continue this tutorial;main_complete.cpp
file: full-featured code, handy for scientific use, such as experimentation and paper writing. This code allows fine-grained control of the optimization, shows several features of BRKGA-MP-IPR such as the path-relinking calls, resets, chromosome injection, and others. It also logs all optimization steps, creating outputs easy to be parsed. You should use this code for serious business and experimentation; Note that this version was much simplified from version 2.0. Still, it offers a lot of control for experimentation through parameter tuning. However, it is still possible to decompose it by inspecting the code on methodBRKGA::BRKGA_MP_IPR::run()
.config.conf
file: example of parameter settings;Makefile
file: the makefile used to build the executables;third_part
subdir: contains the docopt dependence formain_complete.cpp
. This is not strictly necessary for BRKGA-MP-IPR, but it adds a nice command line interface. If you have problems with that, we can changemain_complete.cpp
using traditional argument handling, or another library you prefer.
The first step is to build the code. Here, we are using GNU/Make and GCC toolchain. You may change for the toolchain of your choice.
You may need to edit this file according to your compiler version and
settings. The first thing to note in the makefile is the parameter OPT
that, when set OPT=opt
, it turns on aggressive optimization flags (for
G++). If the flag is not set, aggressive debug options are set. For serious
implementations, and debugging, we do recommend to use such setup. However,
if your compiler does not like such settings, please, go ahead and change
accordingly. By typing just make
, you build both minimal and complete
versions. We also can just type make main_minimal
or make
main_complete
to build one or other version independently.
Typing make clean
will clean up the compilation.
When you call the executables main_minimal
or main_complete
without
arguments, they show the usage. For example, assuming you are using a
terminal:
$./main_minimal
Usage: ./main_minimal <seed> <config-file> <maximum-running-time> <tsp-instance-file>
$ ./main_complete
Arguments did not match expected patterns
Usage:
main_complete
--config <config_file>
--seed <seed>
--stop_rule <stop_rule>
--stop_arg <stop_arg>
--maxtime <max_time>
--instance <instance_file>
[--threads <num_threads>]
[--no_evolution]
main_complete (-h | --help)
Options:
--config <arg> Text file with the BRKGA-MP-IPR parameters.
--seed <arg> Seed for the random number generator.
--stop_rule <arg> Stop rule where:
- (G)enerations: number of evolutionary generations.
- (I)terations: maximum number of generations without
improvement in the solutions.
--stop_arg <arg> Argument value for the stopping rule.
--maxtime <arg> Maximum time in seconds.
--instance <arg> Instance file.
--threads <arg> Number of threads to be used during parallel decoding.
It must in the range [1, 64] [default: 1].
--no_evolution If supplied, no evolutionary operators are applied.
So, the algorithm becomes a simple multi-start algorithm.
-h --help Produce help message.
So, this is a possible output whe calling main_minimal
:
$ ./main_minimal 27000001 config.conf 100 ../instances/brazil58.dat
Reading data...
Reading parameters...
Building BRKGA data and initializing...
Running for 30s...
Custom stopping criteria not supplied by the user. Using max. time = 30s and max. stall_offset = 1000
Using 4 threads for decoding
Exchanged 1 solutions from each population. Iteration 432. Current time: 7.88959s
Exchanged 1 solutions from each population. Iteration 556. Current time: 10.1435s
Exchanged 1 solutions from each population. Iteration 735. Current time: 13.1668s
Path relink at 835 iteration. Block size: 45. Type: DIRECT. Distance: KENDALLTAU. Current time: 13.1668s
- No improvement found. Current time: 14.7002s
Exchanged 1 solutions from each population. Iteration 835. Current time: 14.7006s
Exchanged 1 solutions from each population. Iteration 935. Current time: 16.3598s
Shaking with intensity 0.70553. Type SWAP. Iteration 935. Current time: 16.3598s
Path relink at 1035 iteration. Block size: 45. Type: DIRECT. Distance: KENDALLTAU. Current time: 16.3598s
- Improvement on the elite set but not in the best individual. Current time: 18.0657s
Exchanged 1 solutions from each population. Iteration 1035. Current time: 18.0661s
Exchanged 1 solutions from each population. Iteration 1135. Current time: 20.1444s
Reset population after 500 iterations without improvement. Iteration 1135. Current time: 20.1551s
Path relink at 1235 iteration. Block size: 45. Type: DIRECT. Distance: KENDALLTAU. Current time: 20.1551s
- No improvement found. Current time: 22.0121s
Exchanged 1 solutions from each population. Iteration 1235. Current time: 22.0125s
Shaking with intensity 0.635298. Type SWAP. Iteration 1235. Current time: 22.0125s
Exchanged 1 solutions from each population. Iteration 1335. Current time: 23.5627s
Path relink at 1435 iteration. Block size: 45. Type: DIRECT. Distance: KENDALLTAU. Current time: 23.5627s
- No improvement found. Current time: 25.0922s
Exchanged 1 solutions from each population. Iteration 1435. Current time: 25.0926s
Exchanged 1 solutions from each population. Iteration 1535. Current time: 26.6116s
Shaking with intensity 0.447978. Type SWAP. Iteration 1535. Current time: 26.6116s
Algorithm status:
best_fitness: 27895
current_iteration: 1635
last_update_iteration: 635
current_time: 28.1722s
last_update_time: 11.6496s
largest_iteration_offset: 159
stalled_iterations: 1000
path_relink_time: 0.076028s
num_path_relink_calls: 4
num_homogenities: 0
num_best_improvements: 0
num_elite_improvements: 1
num_exchanges: 11
num_shakes: 3
num_resets: 1
Best cost: 27895
For main_complete
, the output is more verbose, since we want to capture
as much information as possible to do some statistical analysis. The output
should be something close to this:
$./main_complete --config config.conf --seed 2700001 --stop_rule I \
--stop_arg 1000 --maxtime 30 --threads 4 --instance ../../instances/brazil58.dat
[Tue Sep 26 22:13:30 2023] Experiment started
> Instance: '../../instances/brazil58.dat'
> Loading config file: 'config.conf'
> Algorithm parameters:
population_size 2000
elite_percentage 0.30
mutants_percentage 0.15
num_elite_parents 2
total_parents 3
bias_type LOGINVERSE
num_independent_populations 3
pr_number_pairs 0
pr_minimum_distance 0.15
pr_type DIRECT
pr_selection BESTSOLUTION
pr_distance_function_type KENDALLTAU
alpha_block_size 1.00
pr_percentage 1.00
num_exchange_individuals 1
shaking_type SWAP
shaking_intensity_lower_bound 0.25
shaking_intensity_upper_bound 0.75
> Control parameters:
maximum_running_time 30s
exchange_interval 100
shake_interval 300
ipr_interval 200
reset_interval 500
stall_offset 1000
> Seed: 2700001
> Stop rule: Improvement
> Stop argument: 1000
> Number of threads for decoding: 4
[Tue Sep 26 22:13:30 2023] Reading TSP data
Number of nodes: 58
[Tue Sep 26 22:13:30 2023] Generating initial tour
Initial cost: 30774.00
[Tue Sep 26 22:13:30 2023] Building BRKGA
New population size: 580
Chromosome size: 58
[Tue Sep 26 22:13:30 2023] Injecting initial solution
[Tue Sep 26 22:13:30 2023] Optimizing...
* Iteration | Cost | CurrentTime
Custom stopping criteria not supplied by the user. Using max. time = 30s and max. stall_offset = 1000
Using 4 threads for decoding
* 1 | 30774.00 | 0.00s
* 46 | 30365.00 | 0.12s
* 47 | 29956.00 | 0.12s
* 53 | 29618.00 | 0.14s
* 54 | 29343.00 | 0.14s
* 71 | 29332.00 | 0.19s
* 115 | 29304.00 | 0.31s
* 145 | 29215.00 | 0.39s
* 156 | 29206.00 | 0.42s
* 167 | 29172.00 | 0.44s
Exchanged 1 solutions from each population. Iteration 267. Current time: 0.71s
* 364 | 29060.00 | 0.95s
* 370 | 28910.00 | 0.96s
Exchanged 1 solutions from each population. Iteration 470. Current time: 1.22s
* 474 | 28865.00 | 1.23s
* 477 | 28859.00 | 1.24s
* 554 | 28773.00 | 1.43s
* 596 | 28763.00 | 1.54s
* 647 | 28699.00 | 1.67s
* 675 | 28671.00 | 1.74s
* 746 | 28585.00 | 1.91s
* 760 | 28575.00 | 1.95s
Exchanged 1 solutions from each population. Iteration 860. Current time: 2.21s
* 932 | 28301.00 | 2.48s
Exchanged 1 solutions from each population. Iteration 1032. Current time: 2.81s
* 1080 | 28075.00 | 2.98s
* 1113 | 27945.00 | 3.12s
* 1165 | 27709.00 | 3.30s
* 1181 | 27571.00 | 3.35s
* 1184 | 27352.00 | 3.36s
* 1209 | 27294.00 | 3.44s
* 1220 | 27289.00 | 3.48s
Exchanged 1 solutions from each population. Iteration 1320. Current time: 3.79s
Path relink at 1420 iteration. Block size: 25. Type: DIRECT. Distance: KENDALLTAU. Current time: 3.79s
- No improvement found. Current time: 4.09s
Exchanged 1 solutions from each population. Iteration 1420. Current time: 4.09s
Exchanged 1 solutions from each population. Iteration 1520. Current time: 4.42s
Shaking with intensity 0.65. Type SWAP. Iteration 1520. Current time: 4.42s
Path relink at 1620 iteration. Block size: 25. Type: DIRECT. Distance: KENDALLTAU. Current time: 4.42s
- No improvement found. Current time: 4.73s
Exchanged 1 solutions from each population. Iteration 1620. Current time: 4.73s
Exchanged 1 solutions from each population. Iteration 1720. Current time: 5.04s
Reset population after 500 iterations without improvement. Iteration 1720. Current time: 5.04s
Path relink at 1820 iteration. Block size: 25. Type: DIRECT. Distance: KENDALLTAU. Current time: 5.04s
- No improvement found. Current time: 5.36s
Exchanged 1 solutions from each population. Iteration 1820. Current time: 5.36s
Shaking with intensity 0.73. Type SWAP. Iteration 1820. Current time: 5.36s
Exchanged 1 solutions from each population. Iteration 1920. Current time: 5.70s
Path relink at 2020 iteration. Block size: 25. Type: DIRECT. Distance: KENDALLTAU. Current time: 5.70s
- No improvement found. Current time: 5.98s
Exchanged 1 solutions from each population. Iteration 2020. Current time: 5.98s
Exchanged 1 solutions from each population. Iteration 2120. Current time: 6.26s
Shaking with intensity 0.26. Type SWAP. Iteration 2120. Current time: 6.26s
[Tue Sep 26 22:13:37 2023] End of optimization
> Final status:
best_fitness: 27289.00
current_iteration: 2220
last_update_iteration: 1220
current_time: 6.54s
last_update_time: 3.48s
largest_iteration_offset: 197
stalled_iterations: 1000
path_relink_time: 0.01s
num_path_relink_calls: 4
num_homogenities: 0
num_best_improvements: 0
num_elite_improvements: 0
num_exchanges: 13
num_shakes: 3
num_resets: 1
% Best tour cost: 27289
% Best tour: 21 7 0 29 12 39 24 8 31 19 52 49 3 17 43 23 57 4 26 42 11 56 22 54 53 1 47 40 34 9 51 50 46 48 2 20 35 16 25 18 5 27 13 36 14 33 45 55 44 32 28 38 10 15 41 30 6 37
Instance,Seed,Cost,NumNodes,TotalIterations,LastUpdateIteration,TotalTime,LastUpdateTime,LargestIterationOffset,StalledIterations,PRTime,PRCalls,PRNumHomogenities,PRNumPrImprovBest,PRNumImprovElite,NumExchanges,NumShakes,NumResets
brazil58,2700001,27289,58,2220,1220,6.54,3.48,197,1000,0.01,4,0,0,0,13,3,1
Note that your can extract only the last line (e.g., using tail -n1
)
from the log, and add it to a table in a CSV file. In this way, you can load
such table in your favorite statistic tools.
I hope by now you got your system set up and running. Let’s see the essential details on how to use the BRKGA-MP-IPR.
First things first
The decoder function
The core of the BRKGA algorithm is the definition of a decoder function/object. The decoder maps the chromosomes (vectors of real numbers in the interval [0, 1)) to solutions of the problem. In some sense, a decoder is similar to a kernel function from Support Vector Machines: both functions are used to project solutions/distances in different spaces.
Here, we have a small difference between the C++/Python and the Julia implementations. In the Julia version, you must define a data container inherit from AbstractInstance, and a decoder function. The reason you must do that is because structs in Julia have no methods (but constructors), and the decoder function must take both chromosome and input data in the call. In C++/Python, we can encapsulate the input data into the decoder object, resulting in a much more clear API.
The basic form of a decoder should be:
1class Decoder {
2public:
3 BRKGA::fitness_t decode(BRKGA::Chromosome& chromosome, bool rewrite);
4};
The decoder must contain a public decode() method that receives a
BRKGA::Chromosome
reference and an boolean
, and returns a
BRKGA::fitness_t
.
But before going further, let’s talk about the chromosome.
The chromosome or vector of doubles
Note that all long the BRKGA discussion, the chromosome is represented as a
vector of real numbers in the interval [0, 1). Indeed, we could use
straightforward std::vector<double>
. However, sometimes is interesting to
keep more information inside the chromosome for further analysis, such as,
other solution metrics that not the main fitness value. For example, in a
scheduling problem, we may choose to keep both makespan and total completion
time metrics. Therefore, we chose to make the chromosome a “generic” data
structure in our design.
File chomosome.hpp shows the basic represetation of a chromosome:
using Chromosome = std::vector<double>;
If this enough for you, you go already and use such a definition. We do recommend to import and use the definition from chomosome.hpp, instead to redefine in your own code, since it is the same definition the main BRKGA-MP-IPR algorithm uses.
However, if you need more information to be tracked during the optimization,
you can redefine the chromosome. First, your definition must complain with
the std::vector
interface. The easiest way to do that is to inherit
from the std::vector
. For instance, assume we want to keep track of the
makespan and the total completion time for a scheduling problem. We can do
the following:
1class Chromosome: public std::vector<double> {
2public:
3 Chromosome() :
4 std::vector<double>(), makespan(0.0), total_completion_time(0.0)
5 {}
6
7 Chromosome(unsigned _size, double _value = 0.0)
8 std::vector<double>(_size, value),
9 makespan(0.0), total_completion_time(0.0)
10 {}
11
12 Chromosome(const Chromosome& chr) = default;
13
14public:
15 double makespan;
16 double total_completion_time;
17};
In general, most people do not recommend to inherit publicly from
std::vector
because it has no virtual destructor. However, we may do
that as long as:
We remember that every operation provided by
std::vector
must be a semantically valid operation on an object of the derived class;We avoid creating derived class objects with dynamic storage duration;
We DO AVOID polymorphism:
1std::vector<double>* pt = new Chromosome(); // Bad idea. Don't do that!
2delete pt; // Delete does not call the Chromosome destructor.
Back to the decoder
Again, the decoder is the heart of a BRKGA. An easy way to keep the API clean is to define a decoder that has a reference for the input data. This is a TSP decoder defined on file decoders/tsp_decoder.hpp:
1#include "tsp/tsp_instance.hpp"
2#include "brkga_mp_ipr/fitness_type.hpp"
3#include "brkga_mp_ipr/chromosome.hpp"
4
5class TSP_Decoder {
6public:
7 TSP_Decoder(const TSP_Instance& instance);
8 BRKGA::fitness_t decode(BRKGA::Chromosome& chromosome, bool rewrite);
9
10public:
11 const TSP_Instance& instance;
12};
Note that TSP_Decoder
get a const reference to TSP_Instance
, that holds
the input data. Therefore, TSP_Decoder
has direct access to the data for
optimization. This approach also benefits cache efficiency, mainly when
multiple threads are used for decoding, i.e., several threads can use the same
read-only data already in the cache, which speeds up the optimization.
The decode method also has a rewrite
argument that indicates if the decoder
should rewrite the chromosome, in case of local search / local improvements be
performed during the decoder process. This flag is critical if you intend to
use the Implicit Path Relink (details on
BRKGA::BRKGA_MP_IPR::pathRelink()
Even though you do not rewrite the chromosome in your decoder, you must provide
such signature for API compatibility.
The decoder must return a
BRKGA::fitness_t
that is used as the fitness to rank the chromosomes. In general, fitness is
the cost/value of the solution, but you may want to use it to penalize
solutions that violate the problem constraints, for example.
In our TSP example, we have a very simple decoder that generates a permutation
of nodes, and compute the cost of the cycle from that permutation
(note that we don’t use the flag rewrite
in this example):
1BRKGA::fitness_t TSP_Decoder::decode(Chromosome& chromosome, bool /* not-used */) {
2 vector<pair<double, unsigned>> permutation(instance.num_nodes);
3 for(unsigned i = 0; i < instance.num_nodes; ++i)
4 permutation[i] = make_pair(chromosome[i], i);
5
6 sort(permutation.begin(), permutation.end());
7
8 BRKGA::fitness_t cost = instance.distance(permutation.front().second,
9 permutation.back().second);
10
11 for(unsigned i = 0; i < instance.num_nodes - 1; ++i)
12 cost += instance.distance(permutation[i].second,
13 permutation[i + 1].second);
14 return cost;
15}
With the instance data and the decoder ready, we can build the BRKGA data structures and perform the optimization.
Warning
When using multiple threads, you must guarantee that the decoder is thread-safe. You may want to create all read-write data structures on each call or create a separate storage space for each thread. Section Multi-thread decoding brings some tips.
Warning
The decoder must be a function, i.e., given a chromosome, it must output the same solution/fitness in any call. In other words, the decoder must be a deterministic (or, at most, pseudo-random) procedure.
Indeed, this is an essential aspect of the decoder: it must produce the exact solution for the same chromosome. If the decoder cannot do it, we will see a substantial degradation in the BRKGA performance regarding convergence. BRKGA cannot learn well with non-deterministic decoders. Moreover, non-deterministic decoders do not allow reproducibility, impairing their utility for production and academic environments.
However, there are several situations where we must toss a coin to break a tie. In this case, we must guarantee that such a coin always results in the same sequence of values for a given chromosome. In other words, we must ensure that our decoder is pseudo-random or pseudo-non-deterministic. We could create a Random Number Generator (RNG) inside each decoding call with a fixed seed. But this strategy may not explorer the solution space as needed since the seed is the same for all decoding.
We can use several strategies to mitigate such situations, but the most used is to create an (n+1)-sized chromosome such that one allele (in general, the first or the last) is used as a seed to the RNG. In this way, the chromosome also carries the information for breaking ties, and therefore, we can reproduce the solution. This is an example:
1typedef std::mt19937::result_type seed_t;
2
3// This just reinterprets the bits as they are. This is the safest way to
4// guarantee reproducibility since we only use the bits. However, since we
5// are converting the range [0.0, 1.0] from a double, we may have a skewed
6// list of seeds. We are missing the integer part and negative numbers bits.
7// Still, for most applications, this should be good enough.
8auto seed1 = *(reinterpret_cast<seed_t*>(&chromosome[n]))
9
10// This version may grab all the seed's domain. However, we may face
11// numerical issues with precision here. In some cases, the same double may
12// generate two different seeds (depending on the platform), and we will
13// lose reproducibility. We only recommend using this if you really need a
14// very diverse set of seeds to generate millions of random numbers in the
15// decoder.
16auto seed2 = seed_t(numeric_limits<seed_t>::max() * chromosome[n]);
17
18// Just instantiate a local random number generator. Tip: this can hit your
19// performance. Better allocate the RNG before. If you use multiple threads,
20// please read the Section :ref:`Multi-thread decoding <doxid-guide_guide_tips_multi_thread_decoding>`.
21std::mt19937 my_local_rng(seed1);
Building BRKGA-MP-IPR algorithm object
BRKGA::BRKGA_MP_IPR
is the main object that
implements all BRKGA-MP-IPR algorithms such as evolution, path relink, and
other auxiliary procedures. Note that BRKGA::BRKGA_MP_IPR
is a template
parametrized by the decoder type. This strategy avoids runtime polymorphism,
drastically improving the performance of the code.
The first step is to call the algorithm constructor that has the following signature:
1BRKGA_MP_IPR(
2 Decoder& decoder_reference,
3 const Sense sense,
4 const unsigned seed,
5 const unsigned chromosome_size,
6 const BrkgaParams& params,
7 const unsigned max_threads = 1,
8 const bool evolutionary_mechanism_on = true
9);
The first argument is the decoder object that must implement the decode()
method as discussed before. You also must indicate whether you are minimizing
or maximizing through parameter BRKGA::Sense
.
A good seed also must be provided for the (pseudo) random number generator (according to [MWKA07]). BRKGA-MP-IPR uses the Mersenne Twister engine from the standard C++ library.
The chromosome_size
also must be given. It indicates the length of each
chromosome in the population. In general, this size depends on the instance and
how the decoder works. The constructor also takes a BRKGA::BrkgaParams
object that holds several
parameters. We will take about that later.
max_threads
defines how many threads the algorithm should use for decoding
and some other operations. As said before, you must guarantee that the
decoder is thread-safe when using two or more threads. See
Multi-thread decoding
for more information.
Another common argument is evolutionary_mechanism_on
which is enabled by
default. When disabled, no evolution is performed. The algorithm only decodes
the chromosomes and ranks them. On each generation, all population is replaced
excluding the best chromosome. This flag helps on implementations of simple
multi-start algorithms.
All BRKGA and Path Relink hyper-parameters are stored in a
BRKGA::BrkgaParams
object.
Such objects can be read and write from plain text files or can be created by
hand by the user. There is also a companion
BRKGA::ControlParams
object that stores extra control parameters that can be used outside the
BRKGA-MP-IPR to control several aspects of the optimization. For instance,
interval to apply path relink, reset the population, perform population
migration, among others. This is how a configuration file looks like (see
config.conf
for a commented version):
1# BRKGA and IPR parameters
2population_size 2000
3elite_percentage 0.30
4mutants_percentage 0.15
5num_elite_parents 2
6total_parents 3
7bias_type LOGINVERSE
8num_independent_populations 3
9pr_number_pairs 0
10pr_minimum_distance 0.15
11pr_type DIRECT
12pr_selection BESTSOLUTION
13pr_distance_function_type KENDALLTAU
14alpha_block_size 1.0
15pr_percentage 1.0
16num_exchange_individuals 1
17shaking_type SWAP
18shaking_intensity_lower_bound 0.25
19shaking_intensity_upper_bound 0.75
20
21# Control parameters
22maximum_running_time 60
23exchange_interval 100
24ipr_interval 200
25shake_interval 300
26reset_interval 500
27stall_offset 100
To read this file, you can use the function
BRKGA::readConfiguration()
which returns a
std::pair<BrkgaParams, ControlParams>
.
When reading such
file, the function ignores all blank lines, and lines starting with #
. As
commented before,
BRKGA::BrkgaParams
contains all hyper-parameters regarding BRKGA and IPR methods and
BRKGA::ControlParams
contains extra control parameters, and they are not mandatory to
the BRKGA-MP-IPR itself.
Let’s take a look in the example from main_minimal.cpp:
1const unsigned seed = stoi(argv[1]);
2const string config_file = argv[2];
3const string instance_file = argv[4];
4const unsigned num_threads = 4;
5
6auto instance = TSP_Instance(instance_file);
7
8auto [brkga_params, control_params] =
9 BRKGA::readConfiguration(config_file);
10
11control_params.maximum_running_time = chrono::seconds {stoi(argv[3])};
12
13TSP_Decoder decoder(instance);
14
15BRKGA::BRKGA_MP_IPR<TSP_Decoder> algorithm(
16 decoder, BRKGA::Sense::MINIMIZE, seed,
17 instance.num_nodes, brkga_params, num_threads
18);
This code gets some arguments from the command line and loads a TSP instance.
After that, it reads the BRKGA parameters from the configuration file.
Here, instead of using the maximum time given in the config file, we overwrite
it with the maximum time passed by the user through the command line. We then
build the decoder object, and the BRKGA algorithm. Since we are looking for
cycles of minimum cost, we ask for the algorithm MINIMIZE
. The starting
seed is also given. Since TSP_Decode
considers each chromosome key as a
node/city, the length of the chromosome must be the number of nodes, i.e.,
instance.num_nodes
. Finally, we also pass the BRKGA parameters.
Now, we have a
BRKGA::BRKGA_MP_IPR
algorithm/object which will be used to call all other functions during the
optimization. Note that we can build several
BRKGA::BRKGA_MP_IPR
objects using different parameters, decoders, or instance data. These
structures can be evolved in parallel and mixed-and-matched at your will. Each
one holds a self-contained BRKGA state including populations, fitness
information, and a state of the random number generator.
It’s optimization time
Until version 2.0, the user was responsible for creating the main optimization loop. While this strategy gives fine control over the algorithm’s flow, he/she must call the BRKGA-MP-IPR features, such as IPR, shaking, population reset, and others, individually. That generates cumbersome code, which usually takes a lot of time for the developer to make it right.
In version 3.0, we abstract all these details, creating a single method
BRKGA_MP_IPR::run()
containing the complete optimization loop, which may use all the features
provided by this library. In this way, we provide a comprehensive and
easy-to-use single-entry point, like this:
const auto final_status = algorithm.run(control_params, &cout);
run()
takes a BRKGA::ControlParams
object which contains several control parameters in how the main loop behaves.
It is through these control parameters that the user can control the maximum
optimization time and when features like IPR, shaking, etc, are called.
run()
also takes an output stream to log some information along the
optimization.
Once done,
run()
returns a
BRKGA::AlgorithmStatus
object that brings all the details about the
optimization itself, such as the number of iterations, running time, number of
calls for each method, and others. Most importantly,
AlgorithmStatus
also brings the fitness and the chromosome, representing the best solution
found during the optimization (note that it is not the best chromosome in the
current population because it may be fully reset and has lost the best
solution).
The main loop should be like this:
1while(!must_stop) {
2 evolve(); // One generation.
3 if(best solution improvement) {
4 Save best solution;
5 Call observer callbacks;
6 }
7
8 if(!must_stop && ipr_interval > 0 && stalled_iterations > 0 &&
9 stalled_iterations % ipr_interval == 0) {
10 pathRelink();
11 if(best solution improvement) {
12 Save best solution;
13 Call observer callbacks;
14 }
15 }
16
17 if(!must_stop && exchange_interval > 0 && stalled_iterations > 0 &&
18 stalled_iterations % exchange_interval == 0) {
19 exchangeElite();
20 }
21
22 if(!must_stop && shake_interval > 0 && stalled_iterations > 0 &&
23 stalled_iterations % shake_interval == 0) {
24 shake();
25 }
26
27 if(!must_stop && reset_interval > 0 && stalled_iterations > 0 &&
28 stalled_iterations % reset_interval == 0) {
29 reset();
30 }
31}
Therefore, note that the order that
pathRelink()
,
exchangeElite()
,
shake()
,
and
reset()
are called, depends on the thresholds defined in
ControlParams
.
For path relinking, the block size is computed by
\(\lceil \alpha \times \sqrt{p} \rceil\) where
\(\alpha\) is
BrkgaParams::alpha_block_size
and \(p\) is
BrkgaParams::population_size
.
If the size is larger than the chromosome size, the size is set to half of the
chromosome size. For more details on that, refer to
pathRelink()
.
Note
The algorithm always test against maximum running time and for the maximum
stalled iterations/generations given by
ControlParams
indenpendently of the stopping criteria function supplied by the user. This
is especially important when activating the implicit path reliking which is
very timing consuming. If you are using IPR, we STRONGLY RECOMMEND TO
SET A MAXIMUM TIME since this is the core stopping criteria on IPR.
Warning
The decoding is done in parallel using threads, and the user must guarantee that the decoder is THREAD-SAFE. If such property cannot be held, we suggest using a single thread for optimization.
Options before optimization starts
While we can call
run()
right away, some options can enhance the pipeline before running the
optimization. We can:
Set solution observer callbacks that are called when the best solution is updated;
Set a custom stopping-criteria function other than solely time and stalled iterations;
Set a custom shake procedure instead of using the canonical BRKGA-MP-IPR shaking options;
Set custom bias function for chromosome ranking;
Provide warmstart solutions to the algorithm to improve general solution quality and convergence.
We will explore such options in the following sections.
Setting solution observers / callbacks
Usually, tracking the algorithm’s convergence is a good idea.
run()
provides a callback mechanism activated when the best solution found so far
during the optimization is improved. This is done by calling functions provided
by the user. For instance:
1algorithm.addNewSolutionObserver(
2 [](const AlgorithmStatus& status) {
3 std::cout
4 << "> Iter: " << status.current_iteration
5 << " | solution: " << status.best_fitness
6 << " | time: " << status.current_time
7 << std::endl;
8 return false; // Dont' stop the optimization.
9 }
10);
adds a callback function that prints the current iteration, the value of the current best solution, and the time it was found. In this example, we use a lambda function. Obviously, you can define a named function outside this scope and add it as a callback, too.
You have noted that we use the method
BRKGA_MP_IPR::addNewSolutionObserver()
to add the callback function which must have the following signature:
bool observer_callback_name(const AlgorithmStatus& status);
where
BRKGA::AlgorithmStatus
provides the current optimization status, such as current
time, number of iterations, best solution values, best chromosome, and many
other statistics. Indeed,
BRKGA::AlgorithmStatus
is the primary way to track the algorithm’s convergence. Then, this function
returns a boolean
that, if true
, aborts the optimization immediately.
This is useful when one wants only to obtain a solution with a particular value
or characteristic and stop to save time.
You can add as many observers as you want. They will be called in the order they are added.
One interesting usage of such callbacks is to perform (expensive) local search
from the best solution when this cannot be done during the decoder process.
Once the local search is done, we can inject the improved solution/chromosome
back into the population using method
BRKGA_MP_IPR::injectChromosome()
.
Please, see section
Injecting solutions / chromosome into the population
for more details.
Defining custom stopping criteria
By default, the algorithm always test for the maximum running time and for
the maximum stalled iterations/generations given by
ControlParams
.
However, in some situations, the user may want to evaluate additional criteria
to determine whether the optimization must stop or not. For example, in a
minimization problem, we may want to stop the value within a distance from a
lower bound or when we reach a given number of iterations, as shown below:
1fitness_t lower_bound = compute_lower_bound();
2unsigned max_iterations = 100;
3
4algorithm.setStoppingCriteria(
5 [&](const AlgorithmStatus& status) {
6 return
7 status.best_fitness <= lower_bound * 1.1; // 10% from the lower bound
8 ||
9 status.current_iteration == max_iterations;
10 }
11);
For that, we use the method
BRKGA_MP_IPR::setStoppingCriteria()
which takes a function with the signature
bool stopping_callback_name(const AlgorithmStatus& status);
Similar to observer callbacks, the function must take a reference to a
BRKGA::AlgorithmStatus
object and return true
when the optimization must stop or false
otherwise.
Warning
If you are using implicit path relinking (IPR), which is very timing consuming, we STRONGLY RECOMMEND TO SET A MAXIMUM TIME since this is the core stopping criteria on IPR.
If you really mean to have no maximum time and/or maximum stalled iterations set, we recommend to use the following code:
1// After reading your parameters, e.g.,
2// auto [brkga_params, control_params] = readConfiguration("config.conf");
3
4// You can set the time to max...
5control_params.maximum_running_time = std::chrono::seconds::max();
6
7// ... and/or the stalled iterations to max.
8control_params.stall_offset = numeric_limits<unsigned>::max();
Providing custom shake procedure
BRKGA-MP-IPR supplies two canonical ways to perturb the population called shaking procedures (more details here). Shaking was introduced by [ASP19], and the canonical shaking procedures are effective in most cases. However, there are situation that calls for a custom (maybe more effective) procedure (e.g., [HCdM18]). In such cases, one can use method
BRKGA_MP_IPR::setShakingMethod()
which sets a shaking function with the signature:
1void custom_shaking(
2 double lower_bound,
3 double upper_bound,
4 std::vector<std::shared_ptr<Population>>& populations,
5 std::vector<std::pair<unsigned, unsigned>>& shaken
6);
We have that:
Parameters
lower_bound
andupper_bound
is the shaking intensity bounds to be applied. Usually, the define a range where the intensity is sampled;Parameter
populations
are the current BRKGA populations;Parameter
shaken
is a list of<population index, chromosome index>
pairs indicating which chromosomes were shaken on which population, so that they got re-decoded.
Note
If shaken
is empty, all chromosomes of all populations are
re-decoded. This may be slow. Even if you intention is to do so,
it is faster to populate shaken
.
Warning
This procedure can be very intrusive since it must manipulate the population. So, the user must make sure that BRKGA invariants are kept, such as chromosome size and population size. Otherwise, the overall functionaly may be compromised.
In the example below, we implement the standard mutation for vanilla genetic algorithms to the elite population. Note that we kept the random number generator outside, to make sure we generate different sequences on each call:
1// A random number generator.
2std::mt19937 rng(2700001);
3rng.discard(rng.state_size);
4
5// Determines whether we change the allele or not.
6std::bernoulli_distribution must_change(0.50);
7
8 algorithm.setShakingMethod(
9 [&](double lower_bound, double upper_bound,
10 std::vector<std::shared_ptr<Population>>& populations,
11 std::vector<std::pair<unsigned, unsigned>>& shaken) {
12
13 // Determines the value of the allele.
14 std::uniform_real_distribution<> allele_value(lower_bound, upper_bound);
15
16 for(unsigned pop_idx = 0; pop_idx < populations.size(); ++pop_idx) {
17 auto& population = populations[0]->population;
18
19 for(unsigned chr_idx = 0; chr_idx < population.size(); ++chr_idx) {
20 auto& chromosome = population[chr_idx];
21
22 bool change = false;
23 for(unsigned i = 0; i < chromosome.size(); ++i) {
24 if(must_change(rng)) {
25 chromosome[i] = allele_value(rng);
26 change = true;
27 }
28 }
29
30 if(change)
31 shaken.push_back({pop_idx, chr_idx});
32 } // chr for
33 } // pop for
34 }; // lambda
35 ); // setShakingMethod
Setting custom bias function
The bias function controls how alleles are chosen from the (multi) parents
during mating. While BRKGA-MP-IPR framework already provides an extensive set
of functions through
BRKGA::BiasFunctionType
,
one may want to change that behavior using a
custom function (e.g., to
simulate the vanilla BRKGA).
This is done using method
BRKGA_MP_IPR::setBiasCustomFunction()
,
where the user supplies the desired positive non-increasing function with
the signature
double bias_function(const unsigned r);
Warning
The bias function must be a positive non-increasing function, i.e. \(f: \mathbb{N}^+ \to \mathbb{R}^+\) such that \(f(i) \ge 0\) and \(f(i) \ge f(i+1)\) for \(i \in [1, \ldots, total\_parents]\). This is requirement to produce the right probabilities.
Note that
setBiasCustomFunction()
tests the function and throw a
std::runtime_error
in case the funtion is not positive non-increasing.
For instance, the code below sets the inverse quadratic function as bias:
1algorithm.setBiasCustomFunction(
2 [](const unsigned x) {
3 return 1.0 / (x * x);
4 }
5);
Injecting warm-start solutions
One good strategy is to bootstrap the main optimization algorithm with good solutions from fast heuristics ([LAQ+16], [PA18], [ASP19]) or even from relaxations of integer linear programming models [ATRM15] or constraint programming models [APS22].
Since BRKGA-MP-IPR does not know the problem structure, you must encode the warm-start solution as chromosomes (vectors in the interval [0, 1)). In other words, you must do the inverse process that your decoder does. For instance, this is a piece of code from main_complete.cpp showing this process:
1auto initial_solution = greedy_tour(instance);
2//...
3
4std::mt19937 rng(seed);
5vector<double> keys(instance.num_nodes); // It should be == chromosome_size.
6for(auto& key : keys)
7 key = generate_canonical<double,
8 numeric_limits<double>::digits>(rng);
9
10sort(keys.begin(), keys.end());
11
12BRKGA::Chromosome initial_chromosome(instance.num_nodes);
13auto& initial_tour = initial_solution.second;
14for(size_t i = 0; i < keys.size(); i++)
15 initial_chromosome[initial_tour[i]] = keys[i];
16
17algorithm.setInitialPopulation(
18 vector<BRKGA::Chromosome>(1, initial_chromosome)
19);
Here, we create one incumbent solution using the greedy heuristic greedy_tour()
found here.
It gives us initial_solution
which is a std::pair<double, std::vector<unsigned>>
containing the cost of the tour and the tour itself which is a sequence of
nodes to be visited. In the next lines, we encode initial_solution
. First,
we create a vector of sorted random keys
. For that, we create a new random
number generator rng
, the vector keys
, and fill up keys
with random
numbers in the interval [0, 1), using C++ standard library function
generate_canonical<>()
Once we have the keys, we sort them as
TSP_Decoder::decode()
does. We then create the initial_chromosome
, and
fill it up with keys
according to the nodes’ order in initial_solution
.
Finally, we use
BRKGA_MP_IPR::setInitialPopulation()
to assign the incumbent to the initial population. Note that we enclose the
initial solution inside a vector of chromosomes, since
setInitialPopulation()
may take more than one starting solution. See its signature:
void setInitialPopulation(const std::vector<Chromosome>& chromosomes);
Indeed, you can have as much warm-start solutions as you like, limited to the size of the populations.
DIY: building an optimization loop for fine control
While version 3.0 greatly enhances how to utilize all BRKGA-MP-IPR features transparently, one may want to change the algorithm’s flow. For instance, we are developing a hyperheuristic using BRKGA, IPR, and VND so that each algorithm is called when some condition happens. For that, we kept all public interfaces from version 2.0 to version 3.0. The following sections describe each one of these features in detail.
Evolving the population
The core aspect of the BRKGA-MP-IPR is the genetic algorithm itself. The
evolution of each generation/iteration is performed by method
BRKGA_MP_IPR::evolve()
, which
takes the number of generations we must evolve. The call is pretty simple:
algorithm.evolve(num_generations);
evolve()
evolves all populations for num_generations
. If num_genertions
is
omitted, only one generation is evolved.
Note that
run()
calls
evolve()
for one generation evolution per iteration of the main loop. In a custom setting,
one may evolve several generations per main loop iteration, if it make sense
for that scenario.
Note
evolve()
does not check the stopping criteria and only stops when the given
iterations are done. Therefore, we must be careful when evolving multiple
generations at once. Please, check Section
Complex decoders and timing.
Warning
Again, the decoding of each chromosome is done in parallel if multi-thread is enabled. Therefore, we must guarantee that the decoder is THREAD-SAFE. If such property cannot be held, we suggest using a single thread.
Accessing solutions/chromosomes
BRKGA-MP-IPR offers several mechanisms to access a variety of data during the optimization. Most common, we want to access the best chromosome of the current population after some iterations. You can use the companion methods:
double BRKGA_MP_IPR::getBestFitness() const;
const Chromosome& BRKGA_MP_IPR::getBestChromosome() const;
BRKGA_MP_IPR::getBestFitness()
returns the value/fitness of the best chromosome across all current populations.
BRKGA_MP_IPR::getBestChromosome()
returns a reference of the best chromosome across all current populations.
You may want to extract an actual solution from such chromosome, i.e.,
to apply a decoding function that returns the actual solution instead only
its value.
You may have noticed that we insist on the term current population. This is because all getting methods use the current chromosomes, i.e., the population whose state can change after any procedure such as IPR, shaking, or reset. Usually, IPR preserves the best solution when manipulating the population. However, the shaking and reset procedures often perturb and/or deconstruct the population. Therefore, there is a big chance that we will lose the best solution.
Warning
All the get methods return information (fitness and chromosome) from the
current population, not the best solution found overall. Only method
run()
keeps the best solution overall.
You may also want to get a reference of specific chromosome and its fitness
for a given population using
BRKGA_MP_IPR::getChromosome()
and
BRKGA_MP_IPR::getFitness()
.
const Chromosome& getChromosome(unsigned population_index,
unsigned position) const;
fitness_t getFitness(unsigned population_index,
unsigned position) const;
For example, you can get the 3rd best chromosome (and it fitness) from the 2nd population using
third_best_chr = algorithm.getChromosome(1, 2); third_best_fitness = algorithm.getFitness(1, 2);
Note
Just remember that C++ is zero-indexed. So, the first population index is 0 (zero), the second population index is 1 (one), and so forth. The same happens for the chromosomes.
Injecting solutions / chromosome into the population
Now, suppose you get such chromosome or chromosomes and apply a quick local
search procedure on them. It may be useful to reinsert such new solutions in
the BRKGA population for the next evolutionary cycles. You can do that using
BRKGA_MP_IPR::injectChromosome()
.
void injectChromosome(
const Chromosome& chromosome,
unsigned population_index,
unsigned position
)
Note that the chromosome is put in a specific position of a given population. The new chromosome replaces the old one, and the decoder is triggered to compute the new fitness. Once done, the population is re-sorted according to the chromosomes’ fitness.
For example, assuming the algorithm
is your
BRKGA-MP-IPR object and brkga_params
is your BrkgaParams
object, the
following code injects the random chromosome keys
into the population #1 in
the last position (population_size - 1
), i.e., it will replace the worst
solution by a random one:
1std::mt19937 rng(seed);
2Chromosome keys(instance.num_nodes);
3for(auto& key : keys)
4 key = generate_canonical<double, numeric_limits<double>::digits>(rng);
5
6algorithm.injectChromosome(keys, 0, brkga_params.population_size - 1);
Implicit Path Relink
The Implicit Path Relinking (IPR) is a nice addition to the standard BRKGA framework, and it provides an excellent way to create hybrid heuristics and push the optimization further. The good thing about IPR is that you do not need to worry about the path relink implementation, which can be long and tedious if done by hand or customized per problem.
BRKGA-MP-IPR provides a friendly interface to use IPR directly from the BRKGA
population, and you only must provide a few functions and arguments to have a
Path Relink algorithm ready to go. These are the two main signatures:
BRKGA_MP_IPR::pathRelink(/*long args call*/)
and
BRKGA_MP_IPR::pathRelink(/*short args call*/)
,
and these are their signatures:
1// Long list of arguments.
2PathRelinking::PathRelinkingResult pathRelink(
3 PathRelinking::Type pr_type,
4 PathRelinking::Selection pr_selection,
5 std::shared_ptr<DistanceFunctionBase> dist,
6 unsigned number_pairs,
7 double minimum_distance,
8 std::size_t block_size = 1,
9 std::chrono::seconds max_time = std::chrono::seconds{0},
10 double percentage = 1.0
11)
12
13// Short list of arguments.
14PathRelinking::PathRelinkingResult pathRelink(
15 std::shared_ptr<DistanceFunctionBase> dist,
16 std::chrono::seconds max_time = std::chrono::seconds{0}
17)
The first argument defines the type of implicit path relink to be performed
BRKGA::PathRelinking::Type
.
The
DIRECT
path relink exchanges the keys of two chromosomes directly, and
it is usually more suitable to or threshold representations, i.e., where the
key values are used to some kind of discretization, such as “if x < 0.5, then
0, otherwise 1.”
The
PERMUTATION
path relink switches the order of a key
according to its position in the other chromosome. Usually, this kind of path
relink is more suitable to permutation representations, where the chromosome
induces an order or permutation. For example, chromosome [0.4, 0.7, 0.1]
may induce the increasing order (3, 1, 2)
. More details about threshold and
permutation representations in [ATGR21].
BRKGA::PathRelinking::Selection
defines how the algorithm picks the chromosomes for relinking.
BESTSOLUTION
selects, in the order, the best solution of each population.
RANDOMELITE
chooses uniformly random solutions from the elite sets.
The next argument is a pointer to a functor object used to compute the distance
between two chromosomes, and determine if changes in a given (block) of alleles
change the solution. This object must inherit from
BRKGA::DistanceFunctionBase
, which has the following
signature:
1class DistanceFunctionBase {
2public:
3 virtual double distance(
4 const Chromosome& v1,
5 const Chromosome& v2
6 ) = 0;
7
8 virtual bool affectSolution(
9 const Chromosome::value_type key1,
10 const Chromosome::value_type key2
11 ) = 0;
12
13 virtual bool affectSolution(
14 Chromosome::const_iterator v1_begin,
15 Chromosome::const_iterator v2_begin,
16 const std::size_t block_size
17 ) = 0;
18};
Note that BRKGA::DistanceFunctionBase
is an abstract interface,
and children classes must implement all methods.
If the value returned by method distance()
is greater than or equal to
minimum_distance
(on pathRelink()
arguments), the algorithm will perform
the path relink between the two chromosomes. Otherwise, it will look for another
pair of chromosomes. The algorithm will try number_pairs
chromosomes before
gives up. In the presence of multiple populations, the path relinking is
performed between elite chromosomes from different populations, in a circular
fashion. For example, suppose we have 3 populations. The framework performs 3
path relinkings: the first between individuals from populations 1 and 2, the
second between populations 2 and 3, and the third between populations 3 and 1.
In the case of just one population, both base and guiding individuals are
sampled from the elite set of that population.
Note that in traditional path relink algorithms, method distance()
depends
on the problem structure. On IPR, you can use a generic distance function, or
provide one that incorporates more knowledge about the problem. BRKGA-MP-IPR
provides a class/functor to compute the (modified)
Hamming distance
for threshold representations (BRKGA::HammingDistance
),
and a class/functor that computes the
Kendall Tau distance
distance for permutation representations (BRKGA::KendallTauDistance
). Again, details about
threshold and permutation representations in [ATGR21].
As a simple example, suppose you are using a threshold representation where each chromosome key can represent one of 3 different values (a ternary threshold representation). So, one possible way to compute the distance between two chromosomes can be:
1class TernaryHammingDistance: public DistanceFunctionBase {
2protected:
3 double value(const double key) const {
4 return key < 0.33 ? 0.0 : (key < 0.66 ? 1.0 : 2.0);
5 }
6
7public:
8 explicit TernaryHammingDistance() {}
9 virtual ~TernaryHammingDistance() {}
10
11 virtual double distance(const std::vector<double>& vector1,
12 const std::vector<double>& vector2) override {
13 double dist = 0.0;
14 for(std::size_t i = 0; i < vector1.size(); ++i)
15 dist += std::fabs(value(vector1[i]) - value(vector2[i]));
16 return dist;
17 }
18
19 virtual bool affectSolution(const double key1, const double key2) override {
20 return std::fabs(value(key1) - value(key2)) > 0.0;
21 }
22
23 virtual bool affectSolution(std::vector<double>::const_iterator v1_begin,
24 std::vector<double>::const_iterator v2_begin,
25 const std::size_t block_size) override {
26 for(std::size_t i = 0; i < block_size;
27 ++i, ++v1_begin, ++v2_begin) {
28 if(std::fabs(value(*v1_begin) - value(*v2_begin)) > 0.0)
29 return true;
30 }
31 return false;
32 }
33};
To avoid changes that do not lead to new solutions, we must verify if such key
exchanges affect the solution. For that, the distance functor object must
implement the methods affectSolution()
, which come with two signatures,
depicted in the previous example.
In the first version, affectSolution()
takes two keys and check whether the
exchange of key1
by key2
could change the solution. If so, the method
returns true
. The second version takes two iterators for two chromosomes
and checks block_size
keys from those iterators. The idea is, instead to
check only individual keys, we check an entire block of keys. This is very
usual for path relinks that exchange blocks of keys instead individual ones.
For instance, suppose that the alleles/keys are used as threshold such that
values > 0.5 activate a feature. Suppose we have chromosome1 = [0.3, 0.4,
0.1, 0.8]
and chromosome2 = [0.6, 0.1, 0.2, 0.9]
. If the key blocks start
on the first keys, and block_size = 2
, affectSolution()
will return
true
since the very first keys have different activation value. However, if
we start from the 3rd keys and exchange blocks of 2 keys ([0.4, 0.1]
by
[0.1, 0.2]
), nothing changes since both values have the same activation
level (< 0.5). The blocks can hold only one key/allele, sequential key blocks,
or even the whole chromosome.
Note
Note that affectSolution()
is crucial to the IPR performance since this
function helps to avoid exploring regions already surveyed. Also, note that
affectSolution()
can incorporate some problem knowledge.
Note
The current implementation of permutation path relink does not make use of
affectSolution()
. However, pathRelink()
requires the function.
Therefore, we can implement simple constant methods:
1 virtual bool affectSolution(const double, const double) override {
2 return true;
3 }
4
5 virtual bool affectSolution(std::vector<double>::const_iterator,
6 std::vector<double>::const_iterator,
7 const std::size_t) override {
8 return true;
9 }
block_size
defines the number of keys / size of the chromosome block to be
exchanged during the direct path relink. This parameter is also critical for
IPR performance since it avoids too many exchanges during the path building.
Usually, we can compute this number based on the size of the chromosome by some
factor (
BrkgaParams::alpha_block_size
in the configuration file), chosen by you.
Again, details in [ATGR21].
Note
Experiments have shown that a good choice is \(block\_size = alpha\_block\_size \times \sqrt{size~of~chromosome}\)
The last two parameters are stopping criteria. The algorithm stops either when
max_time
seconds is reached or percentage%
of the path is built.
Warning
IPR is a very time-intensive process. You must set the stopping criteria accordingly.
Let’s see how can we call IPR. As example, take the TSP for which we use the permutation-based IPR, and the Kendall Tau distance functions.
1std::shared_ptr<DistanceFunctionBase> dist_func {new KendallTauDistance};
2
3auto result = algorithm.pathRelink(
4 brkga_params.pr_type,
5 brkga_params.pr_selection,
6 dist_func,
7 brkga_params.pr_number_pairs,
8 1, // block_size doesn't not matter for permutation.
9 max_time - elapsedFrom(start_time),
10 brkga_params.pr_percentage
11);
Note that most parameters come from brkga_params
. The maximum IPR time is
set to the remaining time for optimization (global maximum_time
minus the
elapsed time).
pathRelink()
returns a
BRKGA::PathRelinking::PathRelinkingResult
object which defines the status of the IPR optimization. Four situation may
happen:
TOO_HOMOGENEOUS
The chromosomes among the populations are too homogeneous and the path relink will not generate improveded solutions. This status is directly linked to the chosen distance function and minimum distance. If the minimum distance is too large, IPR may not be able to find a pair of chromosomes far enough for path relink;NO_IMPROVEMENT
Path relink was done but no improveded solution was found;ELITE_IMPROVEMENT
An improved solution among the elite set was found, but the best solution was not improved;BEST_IMPROVEMENT
The best solution was improved.
If the found solution is the best solution found so far, IPR replaces the worst
solution by it. Otherwise, IPR computes the distance between the found solution
and all other solutions in the elite set, and replaces the worst solution by it
if and only if the found solution is, at least, minimum_distance
from all
them.
Important notes about IPR
IPR will call decode()
function always with writeback = false
. The
reason is that if the decoder rewrites the chromosome, the path between
solutions is lost and inadvertent results may come up. Note that at the end of
the path relinking, the method calls the decoder with writeback = true
in
the best chromosome found to guarantee that this chromosome is re-written to
reflect the best solution found.
Warning
Make sure your decoder does not rewrite the chromosome when called with the
argument writeback = false
.
BRKGA-MP-IPR pathRelink()
implementation is multi-threaded. Instead of to
build and decode each chromosome one at a time, the method builds a list of
candidates, altering the alleles/keys according to the guide solution, and then
decode all candidates in parallel. Note that
\(O(chromosome\_size^2~/~block\_size)\) additional memory is necessary to
build the candidates, which can be costly if the chromosome_size
is very
large.
Warning
As it is in method evolve()
, the decoding is done in parallel using
threads, and the user must guarantee that the decoder is THREAD-SAFE.
If such property cannot be held, we suggest using a single thread for
optimization.
Shaking
Sometimes, BRKGA gets stuck, converging to local maxima/minima, for several
iterations. When such a situation happens, it is a good idea to perturb the
population, or even restart from a new one completely new. BRKGA-MP-IPR offers
BRKGA_MP_IPR::shake()
,
an improved variation of the original version proposed in
[ASP19]. This is the signature:
1void shake(
2 unsigned intensity,
3 ShakingType shaking_type,
4 unsigned population_index = std::numeric_limits<unsigned>::infinity()
5)
shake()
,
method gets an intensity
parameter that measures how many times
the perturbation is applied on the elite set for a given population_index
(if not given, all populations are shaken). This method offers two
canonial/generic
BRKGA::ShakingType
’s.
With
CHANGE
,
direct modifications are done in the keys/alleles. This kind of
shaking is recommended when the chromosome uses direct or threshold
representations.
SWAP
exchanges keys/alleles inducing new permutations.
For representational definitions, please refer to
[ATGR21]. For instance, the following code shakes
all populations using 10 swap moves:
algorithm.shake(10, BRKGA::ShakingTypeSWAP);
Sometimes, the provided shaking are not appropriated to the problem being solved. In this case, the user can provide a custom shaking procedure. Please, take a look on section Providing custom shake procedure.
Resetting
Sometimes, even shaking the populations does not help to escape from local
maxima/minima. So, we need a drastic measure, restarting from scratch the role
population. This can be easily accomplished with
BRKGA::BRKGA_MP_IPR::reset()
.
algorithm.reset();
Note
When using reset()
, all warm-start solutions provided by
setInitialPopulation()
are discarded. You may use injectChromosome()
to insert those solutions again.
Warning
Again, the decoding of each chromosome is done in parallel if multi-thread is enabled. Therefore, we must guarantee that the decoder is THREAD-SAFE. If such property cannot be held, we suggest using a single thread..
Multi-population and migration
Multi-population or island model was introduced in genetic algorithms in [WRH98]. The idea is to evolve parallel and independent populations and, once a while, exchange individuals among these populations. In several scenarios, this approach is very beneficial for optimization.
BRKGA-MP-IPR is implemented using such island idea from the core. If you read
the guide until here, you may notice that several methods take into account
multiple populations. To use multiple populations, you must set
BrkgaParams::num_independent_populations
with 2 or more populations, and build the BRKGA algorithm from such parameters.
The immigration process is implemented by method
BRKGA_MP_IPR::exchangeElite()
which has the following signature:
void exchangeElite(unsigned num_immigrants);
BRKGA_MP_IPR::exchangeElite()
copies num_immigrants
from one population to another, replacing the worst
num_immigrants
individuals from the recipient population. Note that the
migration is done for all pairs of populations. For instance, the following
code exchanges 3 best individuals from each population:
algorithm.exchangeElite(3);
Simulating the standard BRKGA
Sometimes, it is a good idea to test how the standard BRKGA algorithm performs for a problem. You can use BRKGA-MP-IPR framework to quickly implement and test a standard BRKGA.
First, you must guarantee that, during the crossover, the algorithm chooses
only one elite individual and only one non-elite individual. This is easily
accomplished setting num_elite_parents = 1
and total_parents = 2
. Then,
you must set up a bias function that ranks the elite and no-elite individual
according to the original BRKGA bias parameter \(\rho\) (rho).
You can use
BRKGA_MP_IPR::setBiasCustomFunction()
for that task. The given function should receive the index of the chromosome and
returns a ranking for it. Such ranking is used in the roulette method to choose
the individual from which each allele comes to build the new chromosome. Since
we have one two individuals for crossover in the standard BRKGA, the bias
function must return the probability to one or other individual. In the
following code, we do that with a simple if...else
lambda function.
// Create brkga_params by hand or reading from a file, // then set the following by hand. brkga_params.num_elite_parents = 1; brkga_params.total_parents = 2; // This is the original parameter rho form the vanilla BRKGA. const double rho = 0.75; algorithm.setBiasCustomFunction!( [&](const unsigned x) { return x == 1 ? rho : 1.0 - rho; } );
Here, we first set the num_elite_parents = 1
and total_parents = 2
as
explained before. Following, we set a variable rho = 0.75
. This is the
\(\rho\) from standard BRKGA, and you may set it as you wish. Then, we set
the bias function as a very simple lambda function (note that we must use
[&]
to capture rho
in the outside context). So, if the index of the
chromosome is 1 (elite individual), it gets a 0.75 rank/probability. If the
index is 2 (non-elite individual), the chromosome gets 0.25 rank/probability.
Warning
Note that we consider the index 1 as the elite individual instead index 0,
and index 2 to the non-elite individual opposed to index 1. The reason
for this is that, internally, BRKGA always pass r > 0
to the bias
function to avoid division-by-zero exceptions.
Reading and writing parameters
Although we can build the BRKGA algorithm data by set up a
BrkgaParams
object
manually, the easiest way to do so is to read such parameters from a
configuration file. For this, we can use
BRKGA::readConfiguration()
that reads a simple plain text file and returns a tuple of
BRKGA::BrkgaParams
and
BRKGA::ControlParams
objects. For instance,
auto [brkga_params, control_params] = BRKGA::readConfiguration("tuned_conf.txt");
The configuration file must be plain text such that contains pairs of parameter name and value. In examples folder, we have config.conf that looks like this:
1# BRKGA and IPR parameters
2population_size 2000
3elite_percentage 0.30
4mutants_percentage 0.15
5num_elite_parents 2
6total_parents 3
7bias_type LOGINVERSE
8num_independent_populations 3
9pr_number_pairs 0
10pr_minimum_distance 0.15
11pr_type DIRECT
12pr_selection BESTSOLUTION
13pr_distance_function_type KENDALLTAU
14alpha_block_size 1.0
15pr_percentage 1.0
16num_exchange_individuals 1
17shaking_type SWAP
18shaking_intensity_lower_bound 0.25
19shaking_intensity_upper_bound 0.75
20
21# Control parameters
22maximum_running_time 60
23exchange_interval 100
24ipr_interval 200
25shake_interval 300
26reset_interval 500
27stall_offset 100
It does not matter whether we use lower or upper cases. Blank lines and lines
starting with #
are ignored. The order of the parameters should not
matter either. And, finally, this file should be readable for both C++, Julia,
and Python framework versions (when all come to the same version number).
In some cases, you define some of the parameters at the running time, and you
may want to save them for debug or posterior use. To do so, you can use
BRKGA::writeConfiguration()
,
call upon a BrkgaParams
object.
1BRKGA::writeConfiguration("test.conf", brkga_params);
2//or
3BRKGA::writeConfiguration ("test.conf", brkga_params, control_params);
If control_params
is not given, default values are used in its place.
Note
BRKGA::writeConfiguration()
rewrites the given file. So, watch out to not lose previous configurations.
Using BRKGA-MP-IPR on multi-objective mode
As stated in the introduction, BRKGA-MP-IPR also deals with multiple objectives in a lexicographical or priority dominance order. Differing from classical non-dominance order (using Pareto frontiers), the lexicographical order defines a strict preference order among the objective functions. This leads us to a partial ordering of the values of the solutions (composed of several values, each one from one objective function). So, we have the following definition (abusing a little bit of notation).
Definition
Let \(A = (f_1, f_2, \ldots, f_n)\) and \(A' = (f'_1, f'_2, \ldots, f'_n)\) be two vectors for \(n\) functions \(f_1, f_2, \ldots, f_n\). \(A\) is lexicographical smaller than \(A'\), i.e., \(A < A'\) if and only if \(f_1 < f'_1\), or \(f_1 = f'_1\) and \(f_2 < f'_2\), or \(\ldots, f_1 = f'_1, \ldots, f_{n-1} = f'_{n-1}\) and \(f_n < f'_n\).
For instance, let’s assume we have three minimizing objective functions and four solutions described in the following table:
Solution |
\(f_1\) |
\(f_2\) |
\(f_3\) |
---|---|---|---|
A |
50 |
30 |
30 |
B |
30 |
55 |
40 |
C |
30 |
20 |
50 |
D |
30 |
20 |
25 |
Note that Solution B is better than Solution A because \(f_1(A) < f_1(B),\) even though A has much better values for \(f_2\) and \(f_3\). Now, Solution C is better B because, although \(f_1(B) = f_1(C),\) we have that \(f_2(B) < f_2(C),\) regardless of the value of \(f_3.\) Solution D has the best value for all objective functions. Therefore \(D < C < B < A.\)
In many problems in the real-life, the users usually require a particular priority order among several objective functions, and therefore, the lexicographical approach is very appropriate. However, if the objective functions have no apparent order in your scenario, you may need to use a non-dominated approach.
Warning
If you really want an algorithm to produce a non-dominated set of solutions (Pareto frontier), this is not the right algorithm for you. We recommend taking a look at the NSGA-II and MOAB.
Note that we could use the single-objective version of the BRKGA, by doing a linear (or affine) comnination of the objective function like this:
where \(n\) is the number of objective functions. In the minimization case, to guarantee that \(f_i\) is more important than \(f_{i + 1}\), we must choose \(\alpha_i > \sup(D_{i + 1})\), the supremum of set \(D_{i+1}\) which is the image of \(f_{i+1}\), i.e., the value \(f_{i+1}\) can generate. In other words, \(\alpha_i\) must be larger than the highest value \(f_{i+1}\) can take.
For instance, suppose we have two objective integer functions \(f_1\) and \(f_2\), for a minimization problem. Function \(f_1\) values vary from 0 to 10, and \(f_2\) varies from 100 to 500. To guarantee that a single unit change in \(f_1\) is more important than all \(f_2\), we must make alpha1 = 501. Therefore, if we reduce \(f_1\) off one unit, this impact in the combined objective function will be 501 units, which is larger than the largest possible value of \(f_2\).
The problem with this approach is that it can lead to numerical issues too fast and too frequent for lots of practical applications. Depending on the domain of our functions, precision errors and overflow can occur, impairing the optimization process at all. Second, this is harder to debug and gets more work from the developer to break the total value on every single objective value.
That said, to use BRKGA-MP-IPR in the native multi-objective mode,
we first must set
BRKGA::fitness_t
according to the number of objectives we want. For that, we must
change the file
fitness_type.hpp
to reflect such a thing. In this example, we use the standard
std::tuple:
1namespace BRKGA {
2using fitness_t = std::tuple<double, double>;
3} // end namespace BRKGA_MP_IPR
In theory, we can use any custom structure or class, providing it implements the comparison operators (operator<, operator>, and operator==), and the streaming operator std::ostream& operator<<(std::ostream& os, const CustomFitness& fitness) where CustomFitness is your fitness structure.
Internally, BRKGA-MP-IPR doesn’t use operator==
directly. BRKGA implements
the custom function
BRKGA::close_enough()
.
For fundamental numerical types, it compares the absolute difference with a given
BRKGA::EQUALITY_THRESHOLD
i.e., two numbers \(a\) and \(b\) equal if \(|a - b| <
EQUALITY\_THRESHOLD\). For all other types (except std::tuple
), we use
operator==
. For std::tuple
, we have a specialized close_enough()
that iterates over each element of the tuples calling the base
close_enough()
.
Once defined your fitness_t
, you must also provide
BRKGA::FITNESS_T_MIN
and
BRKGA::FITNESS_T_MAX
,
if your fitness_t
is not a fundamental type or a tuple of
fundamental types. The following is an example of a custom fitness_t
should
looks like.
1class MyCrazyFitness {
2public:
3 constexpr MyCrazyFitness() = default;
4
5
6 constexpr MyCrazyFitness(const double _main_part,
7 const double _secondary_part,
8 const double _threshold):
9 main_part(_main_part),
10 secondary_part(_secondary_part),
11 threshold(_threshold)
12 {}
13
14 bool operator<(const MyCrazyFitness& second) const {
15 if(this->main_part < (second.main_part - threshold))
16 return true;
17 if(this->secondary_part < second.secondary_part)
18 return true;
19 return false;
20 }
21
22 bool operator>(const MyCrazyFitness& second) const {
23 if(this->main_part > (second.main_part + threshold))
24 return true;
25 if(this->secondary_part > second.secondary_part)
26 return true;
27 return false;
28 }
29
30 bool operator==(const MyCrazyFitness& second) const {
31 return !((*this < second) || (*this > second));
32 }
33
34public:
35 double main_part;
36 double secondary_part;
37 double threshold;
38};
39
40std::ostream& operator<<(std::ostream& os, const MyCrazyFitness& fitness) {
41 os
42 << "(main_part: " << fitness.main_part << ", "
43 << "secondary_part: " << fitness.secondary_part << ", "
44 << "threshold: " << fitness.threshold << ")";
45 return os;
46}
47
48// The following three definitions are mandatory!
49using fitness_t = MyCrazyFitness;
50
51static constexpr fitness_t FITNESS_T_MIN {-100.0, -10.0, 1e-16};
52
53static constexpr fitness_t FITNESS_T_MAX {1000.0, 100.0, 1e-1};
Warning
Unless we really need a custom fitness_t
, you should use std::tuple
.
(Probable Valuable) Tips
Algorithm warmup
While in Julia framework version is primordial to do a dry-run to precompile all functions (and a good idea on Python version), in C++ and Python this warmup is not necessary. However, few dry-runs can help the OS/processor with cache locality and give some speedup.
Besides the dry-runs, I would recommend the pre-allocation of all resource/memory that you need, if you know in advance how much will be necessary. This pre-allocation speeds the decoding process dramatically. There is some argument. whether or not we should pre-allocate some data structures since it might incur false sharing. Therefore, more experimentation and fine-tuning are needed in this space.
Complex decoders and timing
Some problems require complex decoders while for others, the decoder contains local search procedures, that can be time-consuming. In general, the decoding is the most time-expensive component of a BRKGA algorithm, and it may skew some stopping criteria based on running time. Therefore, if your decoder is time-consuming, it is a good idea to implement a timer or chronometer kind of thing inside the decoder.
Testing for stopping time uses several CPU cycles, and you need to be careful when/where to test it, otherwise, you spend all the optimization time doing system calls to the clock.
IMHO, the most effective way to do it is to test time at the very begining of
the decoding. If the current time is larger than the maximum time allowed,
simple return Inf
or -Inf
according to your optimization direction. In
this way, we make the solution invalid since it violates the maximum time
allowed. The BRKGA framework takes care of the rest.
Multi-thread decoding
Since Moore’s law is not holding its status anymore, we, simple mortals, must appeal to the wonders of multi-threading. This paradigm can be tricky to code, and Amdahl’s law plays against us. Several genetic algorithms, and in particular, BRKGA, can use parallel solution evaluation (or decoding), which makes the use of multi-threading relatively straightforward. BRKGA-MP-IPR C++ framework is not different, and it uses OpenMP capabilities to do so.
First, as commented several times in this guide, the decoder must be THREAD-SAFE. So, each thread must have its own read/write data structures and may share other read-only data. The simplest way to do it is to create those structures inside the decoder (like most people do). But be aware, this strategy slows down the algorithm significantly depending on the size and format of the structures, and I do not recommend it.
IMHO, the best way to do that is to preallocate the data structure per thread and pass them to the decoder through the problem instance. Then, inside the decoder, you can use omp_get_thread_num() and recover the memory you want to use.
Let’s see a simple example considering the TSP example. TSP_Decode
uses a
single array to create the permutation of nodes. Let’s pre-allocate its memory
per thread. To keep things separeted and easy to understand, we created a new
class
TSP_Decoder_pre_allocating
so that we allocate, for each thread, a vector to hold the permutation during
the decoding:
1class TSP_Decoder_pre_allocating {
2public:
3 TSP_Decoder_pre_allocating(const TSP_Instance& instance,
4 const unsigned num_threads = 1);
5
6 double decode(BRKGA::Chromosome & chromosome, bool rewrite = true);
7
8public:
9 const TSP_Instance& instance;
10
11protected:
12 using Permutation = std::vector<std::pair<double, unsigned>>;
13 std::vector<Permutation> permutation_per_thread;
14};
Note that the constructor has one more argument indicating how many threads we
are using. We also define a type Permutation
with is a simple vector of
key-node pairs. The important structure is permuration_per_thread
which is
a simple vector of the size of the number of threads where we allocate the
permutation vectors for each thread.
Then, in the implementation, we allocate all memory in the constructor
(RAII principle).
In decode
, we use omp_get_thread_num()
to identify which thread called the decoder, and retrieve the respective data
strucuture.
1#include <omp.h>
2
3TSP_Decoder_pre_allocating::TSP_Decoder_pre_allocating(
4 const TSP_Instance& _instance, const unsigned num_threads):
5 instance(_instance),
6 // Pre-allocate space for permutations for each thread.
7 permutation_per_thread(num_threads, Permutation(instance.num_nodes))
8{}
9
10double TSP_Decoder_pre_allocating::decode(Chromosome& chromosome,
11 bool /* not-used */) {
12 // If you have OpenMP available, get the allocated memory per thread ID.
13 #ifdef _OPENMP
14 auto& permutation = permutation_per_thread[omp_get_thread_num()];
15 #else
16 auto& permutation = permutation_per_thread[0];
17 #endif
18
19 for(unsigned i = 0; i < instance.num_nodes; ++i)
20 permutation[i] = make_pair(chromosome[i], i);
21
22 sort(permutation.begin(), permutation.end());
23
24 double cost = instance.distance(permutation.front().second,
25 permutation.back().second);
26
27 for(unsigned i = 0; i < instance.num_nodes - 1; ++i)
28 cost += instance.distance(permutation[i].second,
29 permutation[i + 1].second);
30
31 return cost;
32}
Note
Pre-allocation and multi-threading only make sense for large data structures and time-consuming decoders. Otherwise, the code spends too much time on context switching and system calls.
Warning
Multi-threading consumes many resources of the machine and have diminishing returns. I recommend using at most the number of physical cores (may -1) to avoid racing and too much context switching. You must test which is the best option for your case. In my experience, complex decoders benefit more from multi-threading than simple and fast decoders.
Multi-thread mating
One of the nice additions to BRKGA-MP-IPR 2.0 is the capability of performing
the mating in parallel. Such capability speeds up the algorithm substantially,
mainly for large chromosomes and large populations. However, when performing
parallel mating, we have some points regarding reproducibility described below.
The parallel mating is controlled in compilation time by the preprocessing flags
MATING_FULL_SPEED
,
MATING_SEED_ONLY
,
and
MATING_SEQUENTIAL
,
Compiling your code with
MATING_SEQUENTIAL
,
enabled will remove the parallel mating at all, and the algorithm will behave
like the previous versions. This option can be very slow for large chromosomes
and large populations. But it makes debugging easier. Following we have a quick
example using the TSP decoder with 4 threads, for 1,000 generations. We only
run the evolutionary portion, disabling all other features
(see main_maximum_iterations.cpp).
Note that the average total time (real time) is 4m 46s.
$ make
...
...
g++ -DMATING_SEQUENTIAL -std=c++20 -O3 -fomit-frame-pointer -funroll-loops \
-ftracer -fpeel-loops -fprefetch-loop-arrays -pthread -fopenmp \
-I. -I./brkga_mp_ipr -c main_maximum_iterations.cpp -o main_maximum_iterations.o
...
$ for i in 1 2 3; do time ./main_maximum_iterations 270001 config.conf 1000 ../../instances/rd400.dat > /dev/null; done
real 4m48.649s
user 6m58.208s
sys 0m0.432s
real 4m42.132s
user 6m48.032s
sys 0m0.418s
real 4m46.317s
user 6m53.796s
sys 0m0.243s
Setting
MATING_SEED_ONLY
,
BRKGA will perform the mating in parallel, however, in
a more controlled way. On each evolutionary step, the algorithm creates a
sequence of seeds (one per to-be-generated offspring). Such seeds are fed to a
pseudo-random number generator (RNG) for each thread: when starting the mating,
the RNG is seeded before being used. This occurs in each iteration. In this way,
all generators have their state controlled by the main generator, and therefore,
the unique seed supplied by the user. This is another example, running the
same machine and also 4 threads. The average total time is 1m 58s, a whooping
reduction of ~59%.
$ make
...
...
g++ -DMATING_SEED_ONLY -std=c++20 -O3 -fomit-frame-pointer -funroll-loops \
-ftracer -fpeel-loops -fprefetch-loop-arrays -pthread -fopenmp \
-I. -I./brkga_mp_ipr -c main_maximum_iterations.cpp -o main_maximum_iterations.o
...
$ for i in 1 2 3; do time ./main_maximum_iterations 270001 config.conf 1000 ../../instances/rd400.dat > /dev/null; done
real 1m58.493s
user 7m34.826s
sys 0m2.335s
real 1m57.311s
user 7m38.627s
sys 0m0.941s
real 1m57.403s
user 7m47.691s
sys 0m0.950s
Finally,
MATING_FULL_SPEED
,
enables parallel mating at full speed. In this case,
each thread has a unique RNG previously seeded at the beginning of the algorithm
(BRKGA constructor). This initialization is done in a chain: the first RNG is
seeded with the seed provided by the user. For the following, the algorithm uses
the previous RNG state as seed. In this way, we guarantee that all RNGs have
different states, but all depend on the user seed. However, in this case, the
reproducibility depends on both the seed given by the user and the number of
threads. This is because to guarantee reproducibility, we must ensure that the
same chromosome region is handled by the same thread (id) since we have an RNG
per thread. In other words, the random states must be the same for each thread,
on different runs. When we increase or decrease the number of threads,
different threads will handle the chromosomes. However, this is the fastest
method. Here is another example, using the same conditions as before.
The average here is 1m 56s, a marginal improve regarding to
MATING_SEED_ONLY
,
$ make ... ... g++ -DMATING_FULL_SPEED -std=c++20 -O3 -fomit-frame-pointer -funroll-loops \ -ftracer -fpeel-loops -fprefetch-loop-arrays -pthread -fopenmp \ -I. -I./brkga_mp_ipr -c main_maximum_iterations.cpp -o main_maximum_iterations.o ... $ for i in 1 2 3; do time ./main_maximum_iterations 270001 config.conf 1000 ../../instances/rd400.dat > /dev/null; done real 1m55.840s user 7m40.191s sys 0m1.382s real 1m56.824s user 7m44.251s sys 0m0.845s real 1m56.293s user 7m41.350s sys 0m2.114s
Warning
Remember, when using
MATING_FULL_SPEED
,
the results depend on both seed and number of threads.
Multiple runs with the same seed and number of threads should produce the
same results. Changing one or other, the results will change.
Of course, these results depend on the problem and decoder implementation. But you can see that, or simple/fast decoders, both parallel options represent a very significant improvement over the serial version.
Note
We thank Alberto F. Kummer very much for the first version of the parallel mating and the fruitful discussions about it and other topics. Please, also consider citing his paper [NA20].
Known issues
Since BRKGA-MP-IPR is header-only, we may have compilation/linking issues if we
include the BRKGA-MP-IPR header in different modules (translation units on C++
jargon) and compile them separately (which normally we do). For example,
suppose we have two pieces of code, module_a.cpp and module_b.cpp, such that we
include BRKGA in both (i.e., #include "brkga_mp_ipr"
in both files.
File module_a.cpp:
1#include "brkga_mp_ipr.hpp"
2int main() {
3 auto params = BRKGA::readConfiguration("config.conf");
4 return 0;
5}
File module_b.cpp:
1#include "brkga_mp_ipr.hpp"
2void test() {
3 auto params = BRKGA::readConfiguration("config.conf");
4}
Let’s compile each one with GCC and link them:
1$ g++ -std=c++20 -I../brkga_mp_ipr -c module_a.cpp -o module_a.o
2
3$ g++ -std=c++20 -I../brkga_mp_ipr -c module_b.cpp -o module_b.o
4
5$ g++ module_a.o module_b.o -o test
6
7duplicate symbol '__ZN5BRKGA6EnumIOINS_13PathRelinking9SelectionEE10enum_namesB5cxx11Ev' in:
8 module_a.o
9 module_b.o
10duplicate symbol '__ZN5BRKGA6EnumIOINS_5SenseEE10enum_namesB5cxx11Ev' in:
11 module_a.o
12 module_b.o
13duplicate symbol '__ZN5BRKGA6EnumIOINS_16BiasFunctionTypeEE10enum_namesB5cxx11Ev' in:
14 module_a.o
15 module_b.o
16duplicate symbol '__ZN5BRKGA6EnumIOINS_13PathRelinking20DistanceFunctionTypeEE10enum_namesB5cxx11Ev' in:
17 module_a.o
18 module_b.o
19duplicate symbol '__ZN5BRKGA6EnumIOINS_11ShakingTypeEE10enum_namesB5cxx11Ev' in:
20 module_a.o
21 module_b.o
22duplicate symbol '__ZN5BRKGA6EnumIOINS_13PathRelinking4TypeEE10enum_namesB5cxx11Ev' in:
23 module_a.o
24 module_b.o
25duplicate symbol '__ZN5BRKGA17readConfigurationERSiRSo' in:
26 module_a.o
27 module_b.o
28duplicate symbol '__ZN5BRKGA17readConfigurationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERSo' in:
29 module_a.o
30 module_b.o
31duplicate symbol '__ZN5BRKGAlsERSoRKNS_15AlgorithmStatusE' in:
32 module_a.o
33 module_b.o
34duplicate symbol '__ZN5BRKGAlsERSoRKNS_13ControlParamsE' in:
35 module_a.o
36 module_b.o
37duplicate symbol '__ZN5BRKGA18writeConfigurationERSoRKNS_11BrkgaParamsERKNS_13ControlParamsE' in:
38 module_a.o
39 module_b.o
40duplicate symbol '__ZN5BRKGA18writeConfigurationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_11BrkgaParamsERKNS_13ControlParamsE' in:
41 module_a.o
42 module_b.o
43duplicate symbol '__ZN5BRKGAlsERSoRKNS_11BrkgaParamsE' in:
44 module_a.o
45 module_b.o
46ld: 13 duplicate symbols for architecture x86_64
47collect2: error: ld returned 1 exit status
Now, we try with Clang:
1$ clang++ -std=c++20 -fopenmp -I../brkga_mp_ipr -c module_a.cpp -o module_a.o
2
3$ clang++ -std=c++20 -fopenmp -I../brkga_mp_ipr -c module_b.cpp -o module_b.o
4
5$ clang++ -std=c++20 -fopenmp module_a.o module_b.o -o test
6
7duplicate symbol 'BRKGA::EnumIO<BRKGA::PathRelinking::Selection>::enum_names()' in:
8 module_a.o
9 module_b.o
10duplicate symbol 'BRKGA::EnumIO<BRKGA::Sense>::enum_names()' in:
11 module_a.o
12 module_b.o
13duplicate symbol 'BRKGA::EnumIO<BRKGA::BiasFunctionType>::enum_names()' in:
14 module_a.o
15 module_b.o
16duplicate symbol 'BRKGA::EnumIO<BRKGA::PathRelinking::DistanceFunctionType>::enum_names()' in:
17 module_a.o
18 module_b.o
19duplicate symbol 'BRKGA::EnumIO<BRKGA::ShakingType>::enum_names()' in:
20 module_a.o
21 module_b.o
22duplicate symbol 'BRKGA::EnumIO<BRKGA::PathRelinking::Type>::enum_names()' in:
23 module_a.o
24 module_b.o
25duplicate symbol 'BRKGA::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char>>&, BRKGA::AlgorithmStatus const&)' in:
26 module_a.o
27 module_b.o
28duplicate symbol 'BRKGA::writeConfiguration(std::__1::basic_ostream<char, std::__1::char_traits<char>>&, BRKGA::BrkgaParams const&, BRKGA::ControlParams const&)' in:
29 module_a.o
30 module_b.o
31duplicate symbol 'BRKGA::writeConfiguration(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, BRKGA::BrkgaParams const&, BRKGA::ControlParams const&)' in:
32 module_a.o
33 module_b.o
34duplicate symbol 'BRKGA::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char>>&, BRKGA::ControlParams const&)' in:
35 module_a.o
36 module_b.o
37duplicate symbol 'BRKGA::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char>>&, BRKGA::BrkgaParams const&)' in:
38 module_a.o
39 module_b.o
40duplicate symbol 'BRKGA::readConfiguration(std::__1::basic_istream<char, std::__1::char_traits<char>>&, std::__1::basic_ostream<char, std::__1::char_traits<char>>&)' in:
41 module_a.o
42 module_b.o
43duplicate symbol 'BRKGA::readConfiguration(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_ostream<char, std::__1::char_traits<char>>&)' in:
44 module_a.o
45 module_b.o
46duplicate symbol 'operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char>>&, std::__1::chrono::duration<double, std::__1::ratio<1l, 1l>> const&)' in:
47 module_a.o
48 module_b.o
49ld: 14 duplicate symbols for architecture x86_64
50clang: error: linker command failed with exit code 1 (use -v to see invocation)
So, note that we have several duplicated symbols (which varies a little bit
per compiler), including the EnumIO utilities, output stream operators,
BRKGA::readConfiguration()
,
BRKGA::writeConfiguration()
,
even the definitions of
BRKGA::AlgorithmStatus
,
BRKGA::BrkgaParams
,
and
BRKGA::ControlParams
.
To avoid such a situation, we have to inline
these functions and template
specializations. We can do that passing the compiler directive
BRKGA_MULTIPLE_INCLUSIONS
,
which inlines the functions and template specializations properly.
$ g++ -std=c++20 -I../brkga_mp_ipr -DBRKGA_MULTIPLE_INCLUSIONS -c module_a.cpp -o module_a.o
$ g++ -std=c++20 -I../brkga_mp_ipr -DBRKGA_MULTIPLE_INCLUSIONS -c module_b.cpp -o module_b.o
$ g++ module_a.o module_b.o -o test
However, we have two side-effects. First, such inlining can make the object code grows large since we include several copies of the same functions (which are I/O functions which already are large by their nature). Second, the compiler may complain about too many inline functions, if you are already using several inline functions.
Warning
Avoid including brkga_mp_ip.hpp
in several files/translation units.
If unavoidable, use the compiler directive BRKGA_MULTIPLE_INCLUSIONS
.
But now, suppose we must use multiple inclusions of BRKGA header, and our compiler finds issues on inline such functions. The last resource is to move the implementation of such functions to a separated translation unit (.cpp) and compile them isolated, adding to the linking stage.
References
Carlos E. Andrade, Lucia S. Pessoa, and Slawomir Stawiarski. The physical cell identity assignment problem: a practical optimization approach. IEEE Transactions on Evolutionary Computation, ():1–1, 2022. To appear. doi:10.1109/TEVC.2022.3185927.
Carlos E. Andrade, Thuener Silva, and Luciana S. Pessoa. Minimizing flowtime in a flowshop scheduling problem with a biased random-key genetic algorithm. Expert Systems with Applications, 128:67–80, Aug 2019. doi:10.1016/j.eswa.2019.03.007.
Carlos E. Andrade, Rodrigo F. Toso, José F. Gonçalves, and Mauricio G. C. Resende. The multi-parent biased random-key genetic algorithm with implicit path-relinking and its real-world applications. European Journal of Operational Research, 289(1):17–30, 2021. doi:10.1016/j.ejor.2019.11.037.
Carlos E. Andrade, Rodrigo F. Toso, Mauricio G. C. Resende, and Flávio K. Miyazawa. Biased random-key genetic algorithms for the winner determination problem in combinatorial auctions. Evolutionary Computation, 23:279–307, 2015. doi:10.1162/EVCO_a_00138.
William Higino, Antônio Augusto Chaves, and Vinicius Veloso de Melo. Biased random-key genetic algorithm applied to the vehicle routing problem with private fleet and common carrier. 2018 IEEE Congress on Evolutionary Computation (CEC), pages 1–8, 2018. doi:10.1109/CEC.2018.8477905.
Mauro C. Lopes, Carlos E. Andrade, Thiago A. Queiroz, Mauricio G. C. Resende, and Flávio K. Miyazawa. Heuristics for a hub location-routing problem. Networks, 68(1):54–90, 2016. doi:10.1002/net.21685.
Makoto Matsumoto, Isaku Wada, Ai Kuramoto, and Hyo Ashihara. Common defects in initialization of pseudorandom number generators. ACM Trans. Model. Comput. Simul., 17(4):15–es, sep 2007. doi:10.1145/1276927.1276928.
Alberto F. Kummer N. and Luciana S. Buriol Olinto C. B. Araújo. A biased random key genetic algorithm applied to the vrptw with skill requirements and synchronization constraints. In Proceedings of the 2020 Genetic and Evolutionary Computation Conference, GECCO '20, 717–724. New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3377930.3390209.
Luciana S. Pessoa and Carlos E. Andrade. Heuristics for a flowshop scheduling problem with stepwise job objective function. European Journal of Operational Research, 266(3):950–962, 2018. doi:10.1016/j.ejor.2017.10.045.
Darrell Whitley, Soraya Rana, and Robert B. Heckendorn. The island model genetic algorithm: On separability, population size and convergence. Journal of Computing and Information Technology, 7:33–47, 1998. URL: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.7225.