gwern on Feb 3, 2017 \| on: An Introduction to Counterfactual Regret Minimizat...

The averaging part makes it sound like the usual RL self-play against regular checkpoints of oneself.