Interestingly, the updates are done on the basis of only two external inputs [1]: the bird's height on the screen and the height of the aperture in the next pipe. Using only these two parameters, the neural network decides whether to flap or not.
I would have expected at least the horizontal distance to the next pipe as well...
This isn't my field, but given the simplicity of the inputs and network, and the way commenters are seeing the demo achieve perfect play after anything between 2 and 200 generations, it makes me wonder if this isn't more of a brute-force search than actual learning?
That is, it smells like there's a "correct" set of neuron values - where any genome within some tolerance of those values wins forever, and any other genome dies quickly. If that's the case, the system can't really evolve towards a solution, can it? It would just cycle randomly through lots of genomes that die immediately, until by pure chance one lives forever. I only tried the demo a few times but that's what it looked like it was doing.
"All" evolutionary algorithms are basically a local search with smart heuristics. Where a local search is a brute-force where you move in small directions based on feedback on where you are in the solution space.
I understand how the algorithms work. What I'm suggesting is that the demo seems to behave like a hill-climbing algorithm that's been unleashed on a terrain that's flat everywhere except the solution.
Not really: if you add new individuals at random then you're doing some global search as well (or you can just use a higher mutation rate, but that has its own problems).
Given the perfect play reported elsewhere, this suggests that the holes are just tall enough to accommodate a single flap, so the network essentially just has to learn the less-than function and then tune the threshold.
edit: With sigmoid or hard-threshold activation, this function is really simple to implement. If we want to flap iff bird is lower than pipe, we can do that with no hidden layers and a weight vector of [1 -1]. I'd be curious to see someone fork and implement this.
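A minimal sketch of what that comment describes, not taken from the linked repo: a single sigmoid neuron with no hidden layers, weight vector [1, -1], and zero bias, deciding from the two inputs whether to flap. Function names and the screen-coordinate convention (y grows downward, so a larger y means lower on screen) are assumptions for illustration.

```javascript
// Assumed convention: screen coordinates, y grows downward.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// One output neuron, no hidden layers: activation = sigmoid(1*birdY - 1*holeY).
// Names are illustrative, not from the FlappyLearning codebase.
function shouldFlap(birdY, holeY) {
  const weights = [1, -1];
  const activation = sigmoid(weights[0] * birdY + weights[1] * holeY);
  // sigmoid(x) > 0.5 exactly when x > 0, i.e. when the bird is below the hole.
  return activation > 0.5;
}
```

With a hard-threshold activation the sigmoid disappears entirely and the comparison `birdY - holeY > threshold` is all that remains, which is the "less-than function plus a tuned threshold" the parent comment mentions.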
> Given the perfect play reported elsewhere, this suggests that the holes are just tall enough to accommodate a single flap, so the network essentially just has to learn the less-than function and then tune the threshold.
Yes, slightly disappointingly, the neural network can be replaced by a single line of code.
The fact that the pipes are evenly spaced doesn't make horizontal distance go away; the distance to the next pipe is still a variable at each flap decision.
What does make the distance irrelevant is that the holes are tall enough to flap safely while inside them, so you never have to time a leap through the gap, a luxury not afforded to players of the original game, if I recall correctly.
Yes. And the optimal policy is trivial given those inputs, as long as the aperture between pipes is larger than the "jump" height of the bird: if bird y + bird y velocity > bottom pipe y, jump. So finding that with a neural network or genetic algorithm is fairly silly.
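The trivial policy described above can be written out directly. This is a sketch under assumed conventions (screen coordinates, y growing downward, `bottomPipeY` being the y of the gap's lower edge); the names are illustrative, not from the repo:

```javascript
// Jump when the bird's next position would fall below the bottom of the gap.
// Assumes screen coordinates: y grows downward, velocity is positive when falling.
function optimalPolicy(birdY, birdVelocityY, bottomPipeY) {
  return birdY + birdVelocityY > bottomPipeY; // true means "jump"
}
```

Note this only works when the gap is taller than a single jump, as the parent says; otherwise the jump would need to be timed against the horizontal distance too.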
If the aperture is smaller than the jump height, then you need to do something smart to time your jumps.
[1] https://github.com/xviniette/FlappyLearning/blob/gh-pages/ga...