Tianhao Wu


I am a third-year Ph.D. student at UC Berkeley, jointly advised by Prof. Jiantao Jiao and Prof. Kannan Ramchandran. As an undergraduate, I worked with Prof. Liwei Wang and graduated with a B.S. degree in Mathematics :mortar_board:

Currently, my research focuses on fine-tuning LLMs using reinforcement learning from human feedback (RLHF). Building on this, my ambition is to construct an AI agent with the inherent ability to self-evolve without human supervision. I envision a paradigm where computational resources can be translated directly into the intelligence an AI possesses. Such an AI would vastly outpace human intelligence :scream_cat:

I’m also quite intrigued by the idea of forming an AI Society. Imagine a future in which each person has a personalized AI agent :eyes:. These agents could link together in a modular fashion to form a more capable collective intelligence, while remaining flexible enough to be added to or removed from the collective. This decentralized paradigm could mitigate the huge memory and computing demands that limit centralized AI systems today.


Jan 17, 2024 I’ll be joining Meta as a research intern in summer 2024.
Nov 28, 2023 We released Starling-7B, an open-source large language model leveraging RLAIF.
Sep 12, 2023 A paper on RLHF is coming by the end of September :sparkles: :smile:

latest posts

selected publications


  1. Starling-7B: Improving LLM Helpfulness & Harmlessness with RLAIF
    Banghua Zhu, Evan Frick, Tianhao Wu, Hanlin Zhu, and 1 more author
    Nov 2023
  2. Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
    Tianhao Wu, Banghua Zhu, Ruoyu Zhang, Zhaojin Wen, and 2 more authors
    Nov 2023