Reinforcement Learning with Human Feedback(RLHF) is a technique combined with Reinforcement learning and human feedback to better align the LLMs with human preferences. This blog covers most of the concepts that i learnt.
Before diving deep, let's un...