Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

ii.) I'd noticed a while ago that all graph traversal algorithms follow the same pattern. There is a white set of undiscovered nodes, a grey set of discovered-but-unvisited nodes, and a black set of visited nodes. To execute the graph traversal, you push a start node into the grey set, and then recursively pop a node off the grey set, add any children of it that are currently in the white set to the grey set, and repeat until everything is in the black set. The structure of the grey set determines the particular traversal algorithm:

1. If it is a stack, then you have a depth-first search, which visits all children first and then moves on to siblings.

2. If it is a queue, then you have a breadth-first search, which visits all siblings first and then starts recursing into children.

3. If it is a priority queue keyed by edge weights, then you have Dijkstra's algorithm, which will find the shortest path to any node in the graph.

4. If it is a priority queue keyed by in-degree of nodes, then you have a topological sort.

5. If it is a priority queue keyed by the lowest edge weight such that the endpoint of the edge is in the black set, then you have Prim's algorithm for minimum spanning trees.

You can have more exotic data structures too, eg.:

6. If it is a linked-list encoding of a stack done through pointer reversal, you have a mark & sweep garbage collector.

7. If it is a test on whether the pointer points to from-space or to-space, you have a Cheney-style breadth-first copying garbage collector.

I'd never seen a formalism that described this - it was just something I figured out when I was implementing my 3rd or so topological sort. I assumed that something like that must exist though - the notion of white/grey/black sets for graph traversal is well established in at least the garbage collection literature, and from there it's a short leap to figure out what structure of the grey set corresponds to which graph algorithm.

I just took a quick look at the A* entry in Wikipedia - I've heard of it but never had a reason to implement it myself - but it seems like it's a generalization of Dijkstra's algorithm where edge weights can be arbitrary functions. In this case, it's the same as case #3, but the keys to the priority queue are a function of the node instead of a straight list of edge weights.



Very nice presentation!

In A* if you use a heuristic to select an edge (using a priority queue, i.e., your #3) it gives a best-first search. But it is is possible to just use a stack/queue an forget about the heuristic, then you get a nice generic way of tree (graph) traversal.

Regarding your use of white/black/grey lists: in a tree (which is acylic), you would only need a list of nodes to visit, however, in a general graph you need to keep a list of nodes that you have already seen/visited so as not to get into a cycle when the graph is cyclic. In tracing garbage collection algorithms this is often used because the live program data generated by the mutator is cyclic, or at least potentially can be. Thanks for mentioning this, though I know my way around gc stuff, too, I would have never come up with this link here! Probably it is a good advice to anybody interested to spend some time reading into garbage collection algorithms, they contain many interesting algorithms on graphs, such as the Deutsch-Schorr-Waite algorithm for tree traversing without requiring an additional stack. The definitive book on garbage collection is 1996's Jones and Lins: http://www.amazon.com/Garbage-Collection-Algorithms-Automati...


Hi nostrademons, do you mind if I put this observation onto Wikipedia if it is not already there? I would add the cyclic/acyclic observation of sb's reply also. http://en.wikipedia.org/wiki/Graph_traversal is just a stub, it could nearly go there.


I don't mind, but I'm guessing it'll get tagged with [citation needed], and I don't know what an appropriate citation for it is.


The extent of my experience on the topic is in half-comprehending hobbyist's awe and the inclusion of okmij on my "personal heroes and demigods" list. The only word I could half-heartedly associate with this structure is the incredibly lame "hylomorphism"... which actually kinda generally expresses the pattern of much of computing. It's 5 AM and I'm going to sleep, but I'll leave you with the the most plausible leaves of a very brief search towards finding some recognized formal name for this structure:

http://blog.sigfpe.com/2009/07/monad-for-combinatorial-searc... http://spivey.oriel.ox.ac.uk/mike/search-jfp.pdf http://www.cs.au.dk/~gerth/papers/jfp96.pdf

If you really don't like yourself, try digging here:

http://hackage.haskell.org/packages/archive/pointless-haskel... http://comonad.com/reader/2009/recursion-schemes/


It's a nice theory but there is a second class which basically consists of two shades of White, Grey, Black, and Red.

Let's say you’re building a system to quickly find the degree of separation between two people on a network. For simplicity assume each node connects to 4 random other nodes. If you start from one end and do a breadth first search you would need to try ~(4^N) nodes to find the link. However, if you start from both ends and do a breadth first search you can try ~2 * (4^(N/2)) nodes to find that same connection.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: