-
How I topped the OpenAI Parameter Golf challenge, twice
How I built Hive, an agentic system, to top the OpenAI Parameter Golf challenge, twice.
-
Training Any Agentic Program without Code Changes
-
Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF
-
Rethinking the Role of PPO in RLHF – The Berkeley Artificial Intelligence Research Blog
The BAIR Blog