By onurkanbkrc, 9 hours ago

URL: rlhfbook.com

4 comments

By dang, 4 hours ago

Related. Others?

By verdverm, 8 hours ago

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

By leggerss, 6 hours ago

You could say he's also learning from human feedback

By klelatti, 9 hours ago

Web version with links, etc:

By dang, 4 hours ago

Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.

Reinforcement Learning from Human Feedback