Disclaimer: I recently started implementing GRPO for a small project that involves balancing a pendulum. These are fresh thoughts, and I still have a lot to learn on the subject. Hopefully later posts will be more insightful and better referenced.
Reminder
...