The article doesn't explicitly state it in this manner in one concise place, but the way I would always think about A* from a "practical/easy-to-remember" perspective back when I was doing competitive programming is that they're all the same algorithm, but with different priorities on the priority queue:
Breadth-first Search: Priority is order of discovery of edges (that is, no priority queue/just a regular queue)
Dijkstra: Priority is distance so far + next edge distance
A*: Priority is distance so far + next edge distance + estimate of distance to target node.
This also helps me remember whether the estimate must over- or under-estimate: Since Dijkstra is making the estimate "0", clearly the "admissible heuristic" criteria must be an under-estimation.
Another way I like to think about it is that every graph traversal can be represented as a white set of unknown nodes, a grey set of known but unvisited nodes, and a black set of visited nodes. The data structure used to represent the grey set defines the algorithm:
DFS = queue
BFS = stack
Dijstra's = priority queue keyed by edge weight
A* = priority queue with heuristic function
Beam search = bounded priority queue with heuristic function
Topological sort = priority queue keyed by number of unvisited inbound edges
Copying garbage collector = pointer address
Mark & sweep garbage collector = dirty bit on the object pointed to
Generational garbage collector = multi-level grey set represented by the write barriers between each generation.
A* = priority queue with heuristic function _with specific properties_ ... in particular it must be an "admissible" hueristic that never overestimates the true value.
It'll still find a path with an inadmissible heuristic, as the page explains. It's just not guaranteed to be the shortest path in that case. This is commonly done.
More specifically Dijkstra's is a priority queue. A* is a priority queue with an added estimate to the cost function to prioritize searching nodes closer to the destination.
Likewise, you can represent a queue as a priority queue with key = i, where i is an integer monotonically increasing at insertion time. And you can represent a stack as a priority queue where key = -i.
This is the insight behind the decorate-sort-undecorate pattern; it's just heapsort, with a different key function allowing you to represent several different algorithms.
In (theoretical) computer science, we write "#xs" to denote "number of xs".
My sentence was supposed to be read as "g(n) = number of edges", and implicitly, of course (since we're talking about BFS), that means number of edges seen up until now, from ns perspective. And yes, n usually denotes the size of the graph, however, in the context of A*, we usually write n to denote the current node (as per AI:MA).
I take full responsibility. (Disclaimer: I'm a CS professor teaching BFS and Dijkstra's algorithm every semester and A* every 2nd year.)
I think it is true, although "#edges" needs to be understood as "the number of the edges in the path from the starting point to the node", which was not one of my first three candidate interpretations.
> This also helps me remember whether the estimate must over- or under-estimate: Since Dijkstra is making the estimate "0", clearly the "admissible heuristic" criteria must be an under-estimation.
You're thinking too hard. :-) Just think of a map the same way a 10-year-old would.
Straight-line (Euclidean) distance is the most obvious heuristic on a map for estimating distance, and it's admissible.
Straight lines minimize distances i.e. they never overestimate. Which is enough to remind you that you want an underestimate.
> the way I would always think about A* from a "practical/easy-to-remember" perspective back when I was doing competitive programming is that they're all the same algorithm, but with different priorities on the priority queue
A much less obvious but much more fascinating (IMO) way to look at A* is that A* is actually Dijkstra but with a modified graph, where you adjust the heuristic delta between each edge's vertices to the edge's weight.
To remember the sign of the adjustment with this method, just imagine the next vertex getting super close to the destination, and then work out whether the weight needs to increase or decrease significantly in that case. (It needs to decrease, given you're getting closer to the destination.)
Yes, but it doesn't come right away. When preferring deeper nodes for a moment you have a loop-safe Depth-first traversal, and then when simplifying things in case the Graph is a Tree you get your regular stack-based Depth-first traversal, in which if you settle for the first goal you get back a tail-call optimised DFS.
Breadth-first Search: Priority is order of discovery of edges (that is, no priority queue/just a regular queue)
Dijkstra: Priority is distance so far + next edge distance
A*: Priority is distance so far + next edge distance + estimate of distance to target node.
This also helps me remember whether the estimate must over- or under-estimate: Since Dijkstra is making the estimate "0", clearly the "admissible heuristic" criteria must be an under-estimation.