Reinforcement Learning from Human Feedback


Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

You could say he's also learning from human feedback
