Reinforcement Learning from Human Feedback (RLHF) in Notebooks

72 points | by ash_at_hny 3 days ago

3 comments