The acceptance probability in simulated annealing is a crucial mechanism that allows the algorithm to explore the solution space effectively and escape local optima. It defines the likelihood of transitioning to a new candidate state, balancing exploration and exploitation throughout the optimization process.
Specifically, the acceptance probability for a new state in step k is determined by the change in the objective function and a temperature parameter. The objective function f is typically minimized in simulated annealing.
Understanding the Acceptance Formula
The probability P(accept new)
for a new state, given an old state, is defined as follows:
Condition | Acceptance Probability P(accept new) |
---|---|
If Δ = f(new) - f(old) < 0 (new state is better) |
1 (100% accepted) |
If Δ = f(new) - f(old) ≥ 0 (new state is worse) |
exp(−Δ/Tk) |
Where:
Δ
(Delta) represents the change in the objective function, calculated asf(new) - f(old)
.f(new)
is the objective function value of the proposed new state.f(old)
is the objective function value of the current (old) state.Tk
is the "temperature" at step k, which is a strictly decreasing positive sequence. This sequence typically starts at a high value and gradually approaches zero (lim k→∞ Tk = 0
) as the simulation progresses.
How the Acceptance Probability Works
This formulation ensures a strategic balance in the search for an optimal solution:
- Accepting Better Solutions: If a proposed new state results in a lower objective function value (i.e.,
Δ < 0
), it is always accepted with a probability of1
. This ensures that the algorithm always moves towards improving solutions. - Accepting Worse Solutions: If a proposed new state results in a higher objective function value (i.e.,
Δ ≥ 0
), it is accepted with a probabilityexp(−Δ/Tk)
. This is the key feature that distinguishes simulated annealing from simpler local search algorithms:- Dependence on
Δ
: A larger positiveΔ
(meaning the new state is significantly worse) results in a lower acceptance probability. - Dependence on
Tk
(Temperature):- High Temperature (Early Stages): When
Tk
is high, the term−Δ/Tk
is smaller (less negative), makingexp(−Δ/Tk)
closer to 1. This means even significantly worse states have a relatively high chance of being accepted. This promotes wide exploration of the search space, allowing the algorithm to jump out of potential local minima. - Low Temperature (Later Stages): As
Tk
decreases, the term−Δ/Tk
becomes larger (more negative), makingexp(−Δ/Tk)
closer to 0. This means worse states are increasingly unlikely to be accepted. The algorithm becomes more focused on exploitation, refining the solution within promising regions, similar to a greedy search as it "cools down."
- High Temperature (Early Stages): When
- Dependence on
Analogy and Practical Insights
The concept of "temperature" (Tk
) is derived from the physical process of annealing metals, where a material is heated and then slowly cooled to alter its physical properties.
- "Melting" Phase (High
Tk
): At high temperatures, the particles in a metal have enough energy to move freely and explore various configurations. Similarly, in simulated annealing, the highTk
allows the algorithm to accept unfavorable moves, enabling it to explore diverse regions of the solution space and avoid getting trapped in suboptimal local solutions. - "Crystallization" Phase (Low
Tk
): As the metal cools, particles settle into a low-energy, stable crystal structure. In simulated annealing, asTk
decreases, the algorithm becomes less likely to accept worse solutions, gradually converging towards a globally or near-globally optimal solution.
The choice of the cooling schedule (how Tk
decreases over time) is critical for the performance of simulated annealing. A slow cooling schedule can lead to better solutions but requires more computational time, while a fast schedule might converge quickly but risk settling in a local minimum.