In our previous installment, we discussed the data structures that represent the graph we will
use for pathfinding on the terrain, as well as the pre-processing needed to populate that graph
with the information that our pathfinding algorithm relies on. Now, we are ready to implement
the pathfinding algorithm itself.
We’ll be using A*, probably the most commonly used graph search algorithm for
pathfinding.
A* is popular in games because it is fast, flexible, and relatively simple to implement. A* was
originally a refinement of Dijkstra’s graph search algorithm. Dijkstra’s algorithm is guaranteed
to find the shortest path between any two nodes in a graph whose edge costs are non-negative.
However, because Dijkstra’s algorithm only takes into account the cost of reaching an
intermediate node from the start node, it tends to consider many nodes that are not on the
optimal path. An alternative to
Dijkstra’s algorithm is Greedy Best-First search. Best-First uses a heuristic
function to estimate the cost of reaching the goal from a given intermediate node, without
reference to the cost of reaching the current node from the start node. This means that
Best-First tends to consider far fewer nodes than Dijkstra, but is not guaranteed to produce the
shortest path in a graph which includes obstacles that are not predicted by the heuristic.
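
To make the difference concrete, here is a minimal Python sketch of how each algorithm would
choose the next node from the same frontier. The `Candidate` type, the two function names, and
the sample numbers are hypothetical and chosen only for illustration; they are not part of the
code for this series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A frontier node; hypothetical type used only for this illustration."""
    name: str
    cost_from_start: float   # known cost of reaching this node from the start
    estimate_to_goal: float  # heuristic guess at the remaining cost to the goal


def dijkstra_pick(frontier):
    # Dijkstra ranks nodes only by the cost already paid to reach them.
    return min(frontier, key=lambda c: c.cost_from_start)


def greedy_pick(frontier):
    # Greedy Best-First ranks nodes only by the estimated remaining cost.
    return min(frontier, key=lambda c: c.estimate_to_goal)


frontier = [
    Candidate("near_start", cost_from_start=1.0, estimate_to_goal=9.0),
    Candidate("near_goal", cost_from_start=6.0, estimate_to_goal=2.0),
]

print(dijkstra_pick(frontier).name)  # near_start
print(greedy_pick(frontier).name)    # near_goal
```

Given the same two candidates, Dijkstra keeps working outward from the start, while Greedy
Best-First jumps toward whatever looks closest to the goal.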
A* blends these two approaches by using a cost function, f(x) = g(x) + h(x), to evaluate
each node based on both the cost from the start node (g(x)) and the estimated
cost to the goal (h(x)). Given a suitable heuristic, this allows A* to find the shortest
path while considering fewer nodes than pure Dijkstra’s algorithm. The number of
intermediate nodes expanded by A* depends on the characteristics of the heuristic
function used. There are generally three cases of heuristics that can be used to control
A*, which result in different performance characteristics:
- When h(x) underestimates the true cost of reaching the goal from the current node, A* will
expand more nodes than strictly necessary, but is guaranteed to find the shortest path.
- When h(x) is exactly the true cost of reaching the goal, A* will only expand nodes along
the shortest path, meaning that it runs very fast and produces the optimal path.
- When h(x) overestimates the true cost of reaching the goal from the current node, A* will
expand fewer intermediate nodes. Depending on how much h(x) overestimates the true cost,
this may result in paths that are not the true shortest path; however, this does allow the
algorithm to complete more quickly.
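
The three cases above can be illustrated with a small Python sketch. It assumes an obstacle-free,
4-connected grid with a step cost of 1, so that the Manhattan distance is exactly the true cost;
the `scaled_heuristic` helper and its `weight` parameter are hypothetical names introduced only
for this example.

```python
def manhattan(a, b):
    """Number of steps between two cells on an obstacle-free 4-connected grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def scaled_heuristic(weight):
    """Build h(x) = weight * manhattan(x, goal); `weight` is a hypothetical tuning knob."""
    def h(node, goal):
        return weight * manhattan(node, goal)
    return h


start, goal = (0, 0), (3, 4)
true_cost = manhattan(start, goal)  # 7 steps when nothing blocks the way

underestimate = scaled_heuristic(0.5)(start, goal)  # 3.5: first case
exact = scaled_heuristic(1.0)(start, goal)          # 7.0: second case
overestimate = scaled_heuristic(1.5)(start, goal)   # 10.5: third case
```

On real terrain with obstacles and varying movement costs, the same distance function would no
longer be exact, so which case a given weight falls into depends on the map.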
For games, we will generally use heuristics of the third class. It is important that we
generate good paths when doing pathfinding for our units, but it is generally not necessary that
they be mathematically perfect; they just need to look good enough, and the speed savings are
very important when we are trying to cram all of our rendering and update code into just a few
tens of milliseconds, in order to hit 30-60 frames per second.
A* uses two sets to keep track of the nodes that it is operating on. The first set is
the closed set, which contains all of the nodes that A* has previously considered; this is
sometimes called the interior of the search. The other set is the open set, which contains
those nodes which are adjacent to nodes in the closed set, but which have not yet been processed
by the A* algorithm. The open set is generally sorted by the calculated cost of the node
(f(x)), so that the algorithm can easily select the most promising new node to
consider. Because of this, we usually treat the open set as a priority queue.
The particular implementation of this priority queue has a large impact on the speed of A*; for
best performance, we need to have a data structure that supports fast membership checks (is a
node in the queue?), fast removal of the best element in the queue, and fast insertions into the
queue. Amit Patel provides a good overview of the pros and cons of different data
structures for the priority queue on his A* page; I will be using a priority queue derived from
Blue Raja’s Priority Queue class, which is essentially a binary heap. For our closed
set, the primary operations that we will perform are insertions and membership tests, which makes
the .NET HashSet<T> class a good choice.
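
As a rough illustration of how the open and closed sets interact, here is a minimal Python sketch
of the search loop. It is not the priority queue class used in this series: Python's `heapq`
binary heap has no membership test or priority update, so this sketch pushes duplicate entries
and skips stale ones when they are popped. The names `a_star`, `grid_neighbors`, and `manhattan`,
and the small test grid, are assumptions made for the example.

```python
import heapq


def a_star(neighbors, start, goal, heuristic):
    """Return (path, cost) from start to goal, or (None, inf) if unreachable.

    neighbors(node) yields (next_node, step_cost) pairs and heuristic(node, goal)
    estimates the remaining cost; both are supplied by the caller.
    """
    open_heap = [(heuristic(start, goal), 0.0, start)]  # entries are (f, g, node)
    best_g = {start: 0.0}  # cheapest known cost to every discovered node
    came_from = {}         # back-pointers used to rebuild the path
    closed = set()         # nodes that have already been expanded

    while open_heap:
        _, g, node = heapq.heappop(open_heap)  # most promising node: lowest f
        if node in closed:
            continue  # stale duplicate of a node that was already expanded
        if node == goal:
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1], g
        closed.add(node)
        for nxt, step in neighbors(node):
            if nxt in closed:
                continue
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                came_from[nxt] = node
                f = new_g + heuristic(nxt, goal)  # f(x) = g(x) + h(x)
                heapq.heappush(open_heap, (f, new_g, nxt))
    return None, float("inf")


def grid_neighbors(walls, width, height):
    """Neighbor function for a 4-connected grid with unit step cost (hypothetical helper)."""
    def neighbors(node):
        x, y = node
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in walls:
                yield (nx, ny), 1.0
    return neighbors


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# A 5x3 grid with a wall in column 2 that leaves a gap only at (2, 2).
walls = {(2, 0), (2, 1)}
path, cost = a_star(grid_neighbors(walls, 5, 3), (0, 0), (4, 0), manhattan)
print(cost)  # 8.0: the path has to detour through the gap at (2, 2)
```

Because nodes in the closed set are never reopened, this loop only guarantees the shortest path
when the heuristic does not overestimate in an inconsistent way; with an overestimating heuristic
it trades path quality for speed, as discussed above.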