What is the Acceptance Probability of Simulated Annealing?

The acceptance probability in simulated annealing is a crucial mechanism that allows the algorithm to explore the solution space effectively and escape local optima. It defines the likelihood of transitioning to a new candidate state, balancing exploration and exploitation throughout the optimization process.

Specifically, the acceptance probability for a new state in step k is determined by the change in the objective function and a temperature parameter. The objective function f is typically minimized in simulated annealing.

Understanding the Acceptance Formula

The probability P(accept new) for a new state, given an old state, is defined as follows:

Condition	Acceptance Probability `P(accept new)`
If `Δ = f(new) - f(old) < 0` (new state is better)	`1` (100% accepted)
If `Δ = f(new) - f(old) ≥ 0` (new state is worse)	`exp(−Δ/Tk)`

Where:

Δ (Delta) represents the change in the objective function, calculated as f(new) - f(old).
f(new) is the objective function value of the proposed new state.
f(old) is the objective function value of the current (old) state.
Tk is the "temperature" at step k, which is a strictly decreasing positive sequence. This sequence typically starts at a high value and gradually approaches zero (lim k→∞ Tk = 0) as the simulation progresses.

How the Acceptance Probability Works

This formulation ensures a strategic balance in the search for an optimal solution:

Accepting Better Solutions: If a proposed new state results in a lower objective function value (i.e., Δ < 0), it is always accepted with a probability of 1. This ensures that the algorithm always moves towards improving solutions.
Accepting Worse Solutions: If a proposed new state results in a higher objective function value (i.e., Δ ≥ 0), it is accepted with a probability exp(−Δ/Tk). This is the key feature that distinguishes simulated annealing from simpler local search algorithms:
- Dependence on Δ: A larger positive Δ (meaning the new state is significantly worse) results in a lower acceptance probability.
- Dependence on Tk (Temperature):
  - High Temperature (Early Stages): When Tk is high, the term −Δ/Tk is smaller (less negative), making exp(−Δ/Tk) closer to 1. This means even significantly worse states have a relatively high chance of being accepted. This promotes wide exploration of the search space, allowing the algorithm to jump out of potential local minima.
  - Low Temperature (Later Stages): As Tk decreases, the term −Δ/Tk becomes larger (more negative), making exp(−Δ/Tk) closer to 0. This means worse states are increasingly unlikely to be accepted. The algorithm becomes more focused on exploitation, refining the solution within promising regions, similar to a greedy search as it "cools down."

Analogy and Practical Insights

The concept of "temperature" (Tk) is derived from the physical process of annealing metals, where a material is heated and then slowly cooled to alter its physical properties.

"Melting" Phase (High Tk): At high temperatures, the particles in a metal have enough energy to move freely and explore various configurations. Similarly, in simulated annealing, the high Tk allows the algorithm to accept unfavorable moves, enabling it to explore diverse regions of the solution space and avoid getting trapped in suboptimal local solutions.
"Crystallization" Phase (Low Tk): As the metal cools, particles settle into a low-energy, stable crystal structure. In simulated annealing, as Tk decreases, the algorithm becomes less likely to accept worse solutions, gradually converging towards a globally or near-globally optimal solution.

The choice of the cooling schedule (how Tk decreases over time) is critical for the performance of simulated annealing. A slow cooling schedule can lead to better solutions but requires more computational time, while a fast schedule might converge quickly but risk settling in a local minimum.