Reinforcement Learning from Human Feedback | Hacker News

		Reinforcement Learning from Human Feedback (rlhfbook.com)
		133 points by onurkanbkrc 28 days ago | 5 comments
		https://arxiv.org/abs/2504.12501

verdverm 28 days ago

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

leggerss 27 days ago

You could say he's also learning from human feedback

dang 27 days ago

Related. Others?

RLHF Book - https://hackertimes.com/item?id=42902936 - Feb 2025 (37 comments)

klelatti 28 days ago

Web version with links, etc:

https://rlhfbook.com/

dang 27 days ago

Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.
