In our previous installment, we discussed the data structures that we will use to represent the graph which we will use for pathfinding on the terrain, as well as the initial pre-processing that was necessary to populate that graph with the information that our pathfinding algorithm will make use of. Now, we are ready to actually implement our pathfinding algorithm. We’ll be using A*, probably the most commonly used graph search algorithm for pathfinding.
A* is one of the most commonly used pathfinding algorithms in games because it is fast, flexible, and relatively simple to implement. A* was originally a refinement of Dijkstra’s graph search algorithm. Dijkstra’s algorithm is guaranteed to determine the shortest path between any two nodes in a directed graph, however, because Dijkstra’s algorithm only takes into account the cost of reaching an intermediate node from the start node, it tends to consider many nodes that are not on the optimal path. An alternative to Dijkstra’s algorithm is Greedy Best-First search. Best-First uses a heuristic function to estimate the cost of reaching the goal from a given intermediate node, without reference to the cost of reaching the current node from the start node. This means that Best-First tends to consider far fewer nodes than Dijkstra, but is not guaranteed to produce the shortest path in a graph which includes obstacles that are not predicted by the heuristic.
A* blends these two approaches, by using a cost function (f(x)) to evaluate each node based on both the cost from the start node (g(x)) and the estimated cost to the goal (h(x)). This allows A* to both find the optimum shortest path, while considering fewer nodes than pure Dijkstra’s algorithm. The number of intermediate nodes expanded by A* is somewhat dependent on the characteristics of the heuristic function used. There are generally three cases of heuristics that can be used to control A*, which result in different performance characteristics:
- When h(x) underestimates the true cost of reaching the goal from the current node, A* will expand more nodes, but is guaranteed to find the shortest path.
- When h(x) is exactly the true cost of reaching the goal, A* will only expand nodes along the shortest path, meaning that it runs very fast and produces the optimal path.
- When h(x) overestimates the true cost of reaching the goal from the current node, A* will expand fewer intermediate nodes. Depending on how much h(x) underestimates the true cost, this may result in paths that are not the true shortest path; however, this does allow the algorithm to complete more quickly.
For games, we will generally use heuristics of the third class. It is important that we generate good paths when doing pathfinding for our units, but it is generally not necessary that they be mathematically perfect; they just need to look good enough, and the speed savings are very important when we are trying to cram all of our rendering and update code into just a few tens of milliseconds, in order to hit 30-60 frames per second.
A* uses two sets to keep track of the nodes that it is operating on. The first set is the closed set, which contains all of the nodes that A* has previously considered; this is sometimes called the interior of the search. The other set is the open set, which contains those nodes which are adjacent to nodes in the closed set, but which have not yet been processed by the A* algorithm. The open set is generally sorted by the calculated cost of the node (f(x)), so that the algorithm can easily select the most promising new node to consider. Because of this, we usually consider the open list to be a priority queue. The particular implementation of this priority queue has a large impact on the speed of A*; for best performance, we need to have a data structure that supports fast membership checks (is a node in the queue?), fast removal of the best element in the queue, and fast insertions into the queue. Amit Patel provides a good overview of the pros and cons of different data structures for the priority queue on his A* page; I will be using a priority queue derived from Blue Raja’s Priority Queue class, which is essentially a binary heap. For our closed set, the primary operations that we will perform are insertions and membership tests, which makes the .Net HashSet<T> class a good choice.