Nov 28, 2023 Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF
Oct 16, 2023 Rethinking the Role of PPO in RLHF – The Berkeley Artificial Intelligence Research Blog