Simulated annealing from scratch in Python

[translated from: Simulated Annealing From Scratch in Python]

[Translator's note: I enjoy Jason Brownlee PhD's articles very much, so I do some translation and learning practice in my spare time. This is the practice record for the corresponding article; I hope it helps those who need it!]

Simulated annealing is a stochastic global search optimization algorithm. This means it uses randomness as part of the search process, which makes the algorithm appropriate for nonlinear objective functions where other local search algorithms do not perform well. Like the stochastic hill climbing local search algorithm, it modifies a single solution and searches the relatively local area of the search space until the local optimum is located. Unlike the hill climbing algorithm, it may accept worse solutions as the current working solution. The likelihood of accepting worse solutions is high at the beginning of the search and decreases as the search progresses, giving the algorithm the opportunity to first locate the region of the global optimum, escaping local optima, and then climb to the optimum itself.

In this tutorial, you will discover the simulated annealing optimization algorithm for function optimization. After completing this tutorial, you will know:

Simulated annealing is a stochastic global search algorithm for function optimization.
How to implement the simulated annealing algorithm from scratch in Python.
How to use the simulated annealing algorithm and inspect the results of the algorithm.


Tutorial overview

This tutorial is divided into three parts; they are:

Simulated Annealing
Implement Simulated Annealing
Simulated Annealing Worked Example

Simulated Annealing

Simulated annealing is a stochastic global search optimization algorithm. The algorithm is inspired by the annealing process in metallurgy, where a metal is heated quickly to a high temperature and then cooled slowly, which increases its strength and makes it easier to work with. The annealing process works by first exciting the atoms in the material at a high temperature, allowing the atoms to move around a lot, then slowly reducing their excitement, allowing the atoms to fall into a new, more stable configuration.

The simulated annealing optimization algorithm can be regarded as a modified version of stochastic hill climbing. Stochastic hill climbing maintains a single candidate solution and takes steps of a random but constrained size from the candidate in the search space. If the new point is better than the current point, the current point is replaced with the new point. This process continues for a fixed number of iterations. Simulated annealing performs the search in the same way. The main difference is that new points that are not as good as the current point (worse points) are sometimes accepted. A worse point is accepted probabilistically, where the likelihood of accepting a solution worse than the current solution is a function of the temperature of the search and how much worse the solution is than the current solution.
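The distinction between the two acceptance rules can be sketched as a pair of small functions (a minimal illustration; the function names are made up for this sketch, and `diff` is the candidate's objective value minus the current one, so negative means an improvement):

```python
from math import exp
from random import random

def accept_hill_climbing(diff):
	# stochastic hill climbing: only ever accept improvements
	return diff < 0

def accept_annealing(diff, t):
	# simulated annealing: always accept improvements, and
	# sometimes accept worse points, depending on temperature t
	return diff < 0 or random() < exp(-diff / t)
```

At a high temperature `t`, `accept_annealing()` accepts even clearly worse points with high probability; as `t` falls toward zero, it behaves more and more like `accept_hill_climbing()`.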

The initial temperature for the search is provided as a hyperparameter and decreases as the search progresses. Although the temperature is commonly calculated as a function of the number of iterations, a number of different schemes (annealing schedules) may be used to decrease the temperature during the search from the initial value to a very low value. A popular example of calculating the temperature is the so-called "fast simulated annealing", calculated as follows:

temperature = initial_temperature / (iteration_number + 1)
We add 1 to the iteration number because iteration numbers start at zero, to avoid a divide-by-zero error. The acceptance of worse solutions uses the temperature together with the difference between the objective function evaluation of the worse solution and that of the current solution. A value between 0 and 1 is calculated using this information, indicating the likelihood of accepting the worse solution. The distribution is then sampled using a random number which, if less than the value, means the worse solution is accepted.
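As a quick numeric check of the fast annealing schedule above (using an arbitrary initial temperature of 10), the temperature for the first few iterations can be printed directly:

```python
# temperatures for the first few iterations of fast simulated annealing
initial_temp = 10
for i in range(5):
	t = initial_temp / float(i + 1)
	print('iteration %d: temperature %.3f' % (i, t))
# prints 10.000, 5.000, 3.333, 2.500, 2.000
```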

This is called the Metropolis acceptance criterion and, for minimization, is calculated as follows:

criterion = exp(-(objective(new) - objective(current)) / temperature)
Where exp() is e (the mathematical constant) raised to the power of the provided argument, and objective(new) and objective(current) are the objective function evaluations of the new (worse) and current candidate solutions. The effect is that worse solutions have a higher likelihood of being accepted early in the search and a lower likelihood later in the search. The intent is that the high temperature at the beginning of the search will help the search locate the basin of the global optimum, while the low temperature later in the search will help the algorithm refine the global optimum.
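To make the criterion concrete, consider a candidate whose objective value is worse than the current solution by 1.0 (a minimal sketch; the temperature values are arbitrary):

```python
from math import exp

# a candidate that is worse than the current solution by 1.0
diff = 1.0
# early in the search: high temperature, acceptance is likely
print(exp(-diff / 10.0))  # about 0.905
# late in the search: low temperature, acceptance is very unlikely
print(exp(-diff / 0.1))   # about 0.000045
```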

Now that we are familiar with the simulated annealing algorithm, let's see how to implement it from scratch.

Implement Simulated Annealing

In this section, we will explore how to implement the simulated annealing optimization algorithm from scratch. First, we must define our objective function and the bounds on each input variable of the objective function. The objective function is just a Python function, which we will name objective(). The bounds will be a 2D array with one dimension for each input variable that defines the minimum and maximum for the variable. For example, a one-dimensional objective function and bounds would be defined as follows:

from numpy import asarray

# objective function
def objective(x):
	return 0

# define range for input
bounds = asarray([[-5.0, 5.0]])

Next, we can generate our initial point as a random point within the bounds of the problem, then evaluate it using the objective function.

# generate an initial point
best = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
# evaluate the initial point
best_eval = objective(best)

We need to maintain the "current" solution, which is the focus of the search and can be replaced by a better solution.

# current working solution
curr, curr_eval = best, best_eval

Now we can loop over a predefined number of algorithm iterations defined as "n_iterations", such as 100 or 1,000.

# run the algorithm
for i in range(n_iterations):

The first step of the algorithm iteration is to generate a new candidate solution from the current working solution, e.g. take a step. This requires a predefined "step_size" parameter, which is relative to the bounds of the search space. We will take a random step with a Gaussian distribution where the mean is our current point and the standard deviation is defined by the "step_size". That means that about 99 percent of the steps taken will be within 3 * step_size of the current point.

# take a step
candidate = curr + randn(len(bounds)) * step_size

We don't have to take steps in this way. You may wish to use a uniform distribution between 0 and the step size. For example:

# take a step
candidate = curr + rand(len(bounds)) * step_size

Next, we need to evaluate it.

# evaluate candidate point
candidate_eval = objective(candidate)

Then we need to check whether the evaluation of this new point is as good as or better than the current best point, and if it is, replace our current best point with this new point. This is separate from the current working solution, which is the focus of the search.

# check for new best solution
if candidate_eval < best_eval:
	# store new best point
	best, best_eval = candidate, candidate_eval
	# report progress
	print('>%d f(%s) = %.5f' % (i, best, best_eval))

Next, we need to prepare to replace the current working solution. The first step is to calculate the difference between the objective function evaluation of the candidate solution and that of the current working solution.

# difference between candidate and current point evaluation
diff = candidate_eval - curr_eval

Next, we need to calculate the current temperature using the fast annealing schedule, where "temp" is the initial temperature provided as an argument.

# calculate temperature for current epoch
t = temp / float(i + 1)

We can then calculate the likelihood of accepting a solution that performs worse than our current working solution.

# calculate metropolis acceptance criterion
metropolis = exp(-diff / t)

Finally, we can accept the new point as the current working solution if it has a better objective function evaluation (the difference is negative) or if the objective function evaluation is worse but we probabilistically decide to accept it.

# check if we should keep the new point
if diff < 0 or rand() < metropolis:
	# store the new current point
	curr, curr_eval = candidate, candidate_eval

And that's it. We can implement the simulated annealing algorithm as a reusable function that takes the name of the objective function, the bounds of each input variable, the total number of iterations, the step size, and the initial temperature as arguments, and returns the best solution found and its evaluation.

# simulated annealing algorithm
def simulated_annealing(objective, bounds, n_iterations, step_size, temp):
	# generate an initial point
	best = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
	# evaluate the initial point
	best_eval = objective(best)
	# current working solution
	curr, curr_eval = best, best_eval
	# run the algorithm
	for i in range(n_iterations):
		# take a step
		candidate = curr + randn(len(bounds)) * step_size
		# evaluate candidate point
		candidate_eval = objective(candidate)
		# check for new best solution
		if candidate_eval < best_eval:
			# store new best point
			best, best_eval = candidate, candidate_eval
			# report progress
			print('>%d f(%s) = %.5f' % (i, best, best_eval))
		# difference between candidate and current point evaluation
		diff = candidate_eval - curr_eval
		# calculate temperature for current epoch
		t = temp / float(i + 1)
		# calculate metropolis acceptance criterion
		metropolis = exp(-diff / t)
		# check if we should keep the new point
		if diff < 0 or rand() < metropolis:
			# store the new current point
			curr, curr_eval = candidate, candidate_eval
	return [best, best_eval]

Now that we know how to implement the simulated annealing algorithm in Python, let's see how to use it to optimize the objective function.

Simulated Annealing Worked Example

In this section, we will apply the simulated annealing optimization algorithm to an objective function. First, let's define our objective function. We will use a simple one-dimensional x^2 objective function with the bounds [-5, 5]. The example below defines the function, then creates a line plot of the response surface of the function for a grid of input values, and marks the optimum at f(0.0) = 0.0 with a red line.

# convex unimodal optimization function
from numpy import arange
from matplotlib import pyplot

# objective function
def objective(x):
	return x[0]**2.0

# define range for input
r_min, r_max = -5.0, 5.0
# sample input range uniformly at 0.1 increments
inputs = arange(r_min, r_max, 0.1)
# compute targets
results = [objective([x]) for x in inputs]
# create a line plot of input vs result
pyplot.plot(inputs, results)
# define optimal input value
x_optima = 0.0
# draw a vertical line at the optimal input
pyplot.axvline(x=x_optima, ls='--', color='red')
# show the plot
pyplot.show()

Running the example will create a line chart of the objective function and clearly mark the optimal value of the function.

Before applying the optimization algorithm to the problem, let's take a moment to understand the acceptance criterion a little better. First, the fast annealing schedule is a nonlinear function of the number of iterations. We can make this clear by creating a plot of the temperature for each algorithm iteration. We will use an initial temperature of 10 and 100 algorithm iterations, both chosen arbitrarily. The complete example is listed below.

# explore temperature vs algorithm iteration for simulated annealing
from matplotlib import pyplot
# total iterations of algorithm
iterations = 100
# initial temperature
initial_temp = 10
# array of iterations from 0 to iterations - 1
iterations = [i for i in range(iterations)]
# temperatures for each iterations
temperatures = [initial_temp/float(i + 1) for i in iterations]
# plot iterations vs temperatures
pyplot.plot(iterations, temperatures)
pyplot.xlabel('Iteration')
pyplot.ylabel('Temperature')
pyplot.show()

The running example will calculate the temperature of each algorithm iteration and create a graph of algorithm iteration (x-axis) and temperature (y-axis).

We can see that the temperature drops rapidly and nonlinearly rather than linearly, such that after 20 iterations the temperature is below 1 and stays low for the remainder of the search.

Next, we can get a better idea of how the Metropolis acceptance criterion changes over time with the temperature. Recall that the criterion is a function of temperature, but it is also a function of how different the objective evaluation of the new point is from the current working solution. As such, we will plot the criterion for a few different "differences in objective function value" to see the effect each has on the acceptance probability. The complete example is listed below.

# explore metropolis acceptance criterion for simulated annealing
from math import exp
from matplotlib import pyplot
# total iterations of algorithm
iterations = 100
# initial temperature
initial_temp = 10
# array of iterations from 0 to iterations - 1
iterations = [i for i in range(iterations)]
# temperatures for each iterations
temperatures = [initial_temp/float(i + 1) for i in iterations]
# metropolis acceptance criterion
differences = [0.01, 0.1, 1.0]
for d in differences:
	metropolis = [exp(-d/t) for t in temperatures]
	# plot iterations vs metropolis
	label = 'diff=%.2f' % d
	pyplot.plot(iterations, metropolis, label=label)
# finalize plot
pyplot.xlabel('Iteration')
pyplot.ylabel('Metropolis Criterion')
pyplot.legend()
pyplot.show()

Running the example calculates the Metropolis acceptance criterion for each algorithm iteration, using the temperature for each iteration (shown in the previous section). The plot has three lines for the three differences between the new worse solution and the current working solution. We can see that the worse the solution (the larger the difference), the less likely the model is to accept the worse solution regardless of the algorithm iteration, as we might expect. We can also see that in all cases the likelihood of accepting worse solutions decreases with algorithm iteration.

Now that we are more familiar with the behavior of the temperature and the acceptance criterion over time, let's apply simulated annealing to our test problem. First, we will seed the pseudorandom number generator. This is not required in general, but in this case I want to ensure we get the same results (same sequence of random numbers) each time we run the algorithm, so we can plot the results later.

# seed the pseudorandom number generator
seed(1)

Next, we can define the configuration of the search. In this case, we will search for 1,000 iterations of the algorithm and use a step size of 0.1. Given that we are using a Gaussian function to generate the step, this means that about 99 percent of all steps taken will be within a distance of (0.1 * 3) of a given point, e.g. three standard deviations. We will also use an initial temperature of 10.0. The search procedure is more sensitive to the annealing schedule than to the initial temperature, so initial temperature values are almost arbitrary.

# define the total iterations
n_iterations = 1000
# define the maximum step size
step_size = 0.1
# initial temperature
temp = 10
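The three-standard-deviation claim above can be checked empirically by drawing a large sample of Gaussian steps and measuring the fraction that land within 3 * step_size (a standalone sketch, not part of the search itself):

```python
from numpy.random import seed, randn

seed(1)
step_size = 0.1
# draw a large sample of Gaussian steps scaled by the step size
steps = randn(100000) * step_size
# fraction of steps that land within three standard deviations
within = (abs(steps) <= 3 * step_size).mean()
print('fraction within 3 * step_size: %.4f' % within)  # close to 0.9973
```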

Next, we can perform the search and report the results.

# perform the simulated annealing search
best, score = simulated_annealing(objective, bounds, n_iterations, step_size, temp)
print('Done!')
print('f(%s) = %f' % (best, score))

Complete examples are as follows:

# simulated annealing search of a one-dimensional objective function
from numpy import asarray
from numpy import exp
from numpy.random import randn
from numpy.random import rand
from numpy.random import seed

# objective function
def objective(x):
	return x[0]**2.0

# simulated annealing algorithm
def simulated_annealing(objective, bounds, n_iterations, step_size, temp):
	# generate an initial point
	best = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
	# evaluate the initial point
	best_eval = objective(best)
	# current working solution
	curr, curr_eval = best, best_eval
	# run the algorithm
	for i in range(n_iterations):
		# take a step
		candidate = curr + randn(len(bounds)) * step_size
		# evaluate candidate point
		candidate_eval = objective(candidate)
		# check for new best solution
		if candidate_eval < best_eval:
			# store new best point
			best, best_eval = candidate, candidate_eval
			# report progress
			print('>%d f(%s) = %.5f' % (i, best, best_eval))
		# difference between candidate and current point evaluation
		diff = candidate_eval - curr_eval
		# calculate temperature for current epoch
		t = temp / float(i + 1)
		# calculate metropolis acceptance criterion
		metropolis = exp(-diff / t)
		# check if we should keep the new point
		if diff < 0 or rand() < metropolis:
			# store the new current point
			curr, curr_eval = candidate, candidate_eval
	return [best, best_eval]

# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0]])
# define the total iterations
n_iterations = 1000
# define the maximum step size
step_size = 0.1
# initial temperature
temp = 10
# perform the simulated annealing search
best, score = simulated_annealing(objective, bounds, n_iterations, step_size, temp)
print('Done!')
print('f(%s) = %f' % (best, score))

The running example will report the search progress, including the number of iterations, function input, and the response of the objective function each time an improvement is detected.

At the end of the search, the best solution is found and its evaluation is reported.

Note: your results may be different due to the randomness of the algorithm or evaluation program, or the difference of numerical accuracy. Consider running the example several times and comparing the average results.

In this case, we can see about 20 improvements over the 1,000 iterations of the algorithm, and a solution that is very close to the optimal input of 0.0, which evaluates to f(0.0) = 0.0.

>34 f([-0.78753544]) = 0.62021
>35 f([-0.76914239]) = 0.59158
>37 f([-0.68574854]) = 0.47025
>39 f([-0.64797564]) = 0.41987
>40 f([-0.58914623]) = 0.34709
>41 f([-0.55446029]) = 0.30743
>42 f([-0.41775702]) = 0.17452
>43 f([-0.35038542]) = 0.12277
>50 f([-0.15799045]) = 0.02496
>66 f([-0.11089772]) = 0.01230
>67 f([-0.09238208]) = 0.00853
>72 f([-0.09145261]) = 0.00836
>75 f([-0.05129162]) = 0.00263
>93 f([-0.02854417]) = 0.00081
>144 f([0.00864136]) = 0.00007
>149 f([0.00753953]) = 0.00006
>167 f([-0.00640394]) = 0.00004
>225 f([-0.00044965]) = 0.00000
>503 f([-0.00036261]) = 0.00000
>512 f([0.00013605]) = 0.00000
Done!
f([0.00013605]) = 0.000000

It can be interesting to review the progress of the search as a line plot showing the change in the evaluation of the best solution each time there is an improvement.

We can update simulated_annealing() to keep track of the objective function evaluation each time there is an improvement and return this list of scores.

# simulated annealing algorithm
def simulated_annealing(objective, bounds, n_iterations, step_size, temp):
	# generate an initial point
	best = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
	# evaluate the initial point
	best_eval = objective(best)
	# current working solution
	curr, curr_eval = best, best_eval
	scores = list()
	# run the algorithm
	for i in range(n_iterations):
		# take a step
		candidate = curr + randn(len(bounds)) * step_size
		# evaluate candidate point
		candidate_eval = objective(candidate)
		# check for new best solution
		if candidate_eval < best_eval:
			# store new best point
			best, best_eval = candidate, candidate_eval
			# keep track of scores
			scores.append(best_eval)
			# report progress
			print('>%d f(%s) = %.5f' % (i, best, best_eval))
		# difference between candidate and current point evaluation
		diff = candidate_eval - curr_eval
		# calculate temperature for current epoch
		t = temp / float(i + 1)
		# calculate metropolis acceptance criterion
		metropolis = exp(-diff / t)
		# check if we should keep the new point
		if diff < 0 or rand() < metropolis:
			# store the new current point
			curr, curr_eval = candidate, candidate_eval
	return [best, best_eval, scores]

Then, we can create a line chart of these scores to see the relative changes of each improved objective function found in the search process.

# line plot of best scores
pyplot.plot(scores, '.-')
pyplot.xlabel('Improvement Number')
pyplot.ylabel('Evaluation f(x)')
pyplot.show()

Taken together, the following is a complete example of performing a search and plotting the objective function score of an improved solution during the search.

# simulated annealing search of a one-dimensional objective function
from numpy import asarray
from numpy import exp
from numpy.random import randn
from numpy.random import rand
from numpy.random import seed
from matplotlib import pyplot

# objective function
def objective(x):
	return x[0]**2.0

# simulated annealing algorithm
def simulated_annealing(objective, bounds, n_iterations, step_size, temp):
	# generate an initial point
	best = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
	# evaluate the initial point
	best_eval = objective(best)
	# current working solution
	curr, curr_eval = best, best_eval
	scores = list()
	# run the algorithm
	for i in range(n_iterations):
		# take a step
		candidate = curr + randn(len(bounds)) * step_size
		# evaluate candidate point
		candidate_eval = objective(candidate)
		# check for new best solution
		if candidate_eval < best_eval:
			# store new best point
			best, best_eval = candidate, candidate_eval
			# keep track of scores
			scores.append(best_eval)
			# report progress
			print('>%d f(%s) = %.5f' % (i, best, best_eval))
		# difference between candidate and current point evaluation
		diff = candidate_eval - curr_eval
		# calculate temperature for current epoch
		t = temp / float(i + 1)
		# calculate metropolis acceptance criterion
		metropolis = exp(-diff / t)
		# check if we should keep the new point
		if diff < 0 or rand() < metropolis:
			# store the new current point
			curr, curr_eval = candidate, candidate_eval
	return [best, best_eval, scores]

# seed the pseudorandom number generator
seed(1)
# define range for input
bounds = asarray([[-5.0, 5.0]])
# define the total iterations
n_iterations = 1000
# define the maximum step size
step_size = 0.1
# initial temperature
temp = 10
# perform the simulated annealing search
best, score, scores = simulated_annealing(objective, bounds, n_iterations, step_size, temp)
print('Done!')
print('f(%s) = %f' % (best, score))
# line plot of best scores
pyplot.plot(scores, '.-')
pyplot.xlabel('Improvement Number')
pyplot.ylabel('Evaluation f(x)')
pyplot.show()

Running the example performs the search and reports the results as before. A line plot is created showing the objective function evaluation for each improvement during the search. We can see about 20 changes to the objective function evaluation during the search, with large changes initially and very small to imperceptible changes towards the end of the search as the algorithm converges on the optimum.


Tags: Machine Learning

Posted by randomthinker on Sun, 17 Apr 2022 20:02:40 +0930