Excuse me, another followup question (can't edit on mobile): can you ELI5 how do...

atschantz · on Nov 21, 2018

As a general answer, the theory suggests that organisms maximize a quantity known as model evidence, which is just a way of saying 'how much evidence does some data provide for my model of the world?'

There are two complementary ways to maximize this - change your model or change your world.

If we now grant that actions also maximize model evidence, then actions can either be conducted to sample data that improve the model's fit to the world (exploration), or they can be conducted to sample observations that are consistent with the current model (exploitation).

snrji · on Nov 22, 2018

And the optimization process itself would determine whether updating the model or changing the world is optimal, I guess. Thanks.

eli_gottlieb · on Nov 29, 2018

The equation for free-energy/ELBO has two terms, an energy and an entropy. You can rewrite it as "log-likelihood minus KL from prior". If you write your model in a certain way, you can then read it as, "Fit to the data, minus cost" (second formulation) or "Accuracy + exploitation + exploration" (first formulation).
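
To make the two readings concrete, here is a minimal numerical sketch, assuming a toy model with one discrete latent variable and a single observed data point. The numbers and function names are made up for illustration; the point is only that "energy plus entropy" and "log-likelihood minus KL from prior" give the same value, and that this value is a lower bound on the log model evidence.

```python
import numpy as np

# Toy discrete model (all numbers are illustrative assumptions).
prior = np.array([0.5, 0.3, 0.2])        # p(z)
likelihood = np.array([0.9, 0.2, 0.1])   # p(x | z) for the one observed x
q = np.array([0.6, 0.3, 0.1])            # approximate posterior q(z)

def elbo_energy_entropy(q, prior, likelihood):
    # First formulation: expected log joint (negative energy) plus entropy of q.
    expected_log_joint = np.sum(q * np.log(prior * likelihood))
    entropy = -np.sum(q * np.log(q))
    return expected_log_joint + entropy

def elbo_fit_minus_cost(q, prior, likelihood):
    # Second formulation: expected log-likelihood minus KL(q || prior).
    fit = np.sum(q * np.log(likelihood))
    cost = np.sum(q * np.log(q / prior))
    return fit - cost

# Log model evidence log p(x) and the exact posterior p(z | x).
log_evidence = np.log(np.sum(prior * likelihood))
posterior = prior * likelihood / np.sum(prior * likelihood)

print(elbo_energy_entropy(q, prior, likelihood))
print(elbo_fit_minus_cost(q, prior, likelihood))
print(log_evidence)
```

Plugging the exact posterior in for `q` closes the gap, so the bound equals `log_evidence` in that case.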

paraschopra · on Nov 21, 2018

In formulations of FEP, there are two terms: cost and ambiguity. Minimisation of this combined term happens in a Bayesian optimal way. So you don't have to explicitly code weights for exploration and exploitation.

Although what you do have to code is prior preferences, and since it is a distribution, you implicitly code the range of those preferences. But once you do that, the FEP algorithm figures out when to collect more data to build a better model and when to use the existing model to get near the prior preferences.
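
A rough sketch of what that looks like in a discrete setting, assuming the common decomposition of expected free energy into a cost term (KL between predicted observations and prior preferences) and an ambiguity term (expected entropy of the observation likelihood). The matrix `A`, the `preferred` distribution, the policy names and all numbers are hypothetical, chosen only to show that the two terms are summed without any hand-tuned weight.

```python
import numpy as np

# p(o | s): rows are observations, columns are hidden states (assumed values).
A = np.array([[0.9, 0.5],
              [0.1, 0.5]])

# Prior preferences, coded as a distribution over observations (assumed values).
preferred = np.array([0.8, 0.2])

# Predicted hidden-state distribution under each hypothetical policy.
predicted_states = {
    "policy_a": np.array([0.9, 0.1]),
    "policy_b": np.array([0.1, 0.9]),
}

def expected_free_energy(qs, A, preferred):
    qo = A @ qs                                          # predicted observations
    cost = np.sum(qo * np.log(qo / preferred))           # KL(qo || preferred)
    entropy_per_state = -np.sum(A * np.log(A), axis=0)   # H[p(o | s)] for each s
    ambiguity = entropy_per_state @ qs                   # expected under qs
    return cost + ambiguity                              # no explicit trade-off weight

G = {name: expected_free_energy(qs, A, preferred)
     for name, qs in predicted_states.items()}

# Policies with lower expected free energy get higher probability (softmax of -G).
scores = np.array(list(G.values()))
policy_probs = np.exp(-scores) / np.sum(np.exp(-scores))

print(G)
print(policy_probs)
```

Here `policy_a` ends up with the lower expected free energy, because it leads both to observations closer to the preferences and to states whose observations are less ambiguous.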

snrji · on Nov 22, 2018

I see. Much more elegant than explicitly coding the trade-off, actually :)