Tianhao Wu

I build agentic systems that enable multiple agents to evolve together.

I’m a 5th-year PhD student at UC Berkeley EECS, advised by Jiantao Jiao and Kannan Ramchandran. As an undergraduate at Peking University, I majored in Mathematics and worked with Liwei Wang 🎓

Now I’m working on agent swarms and self-improving agent systems where agents share thoughts, insights, and skills, and evolve from each other’s experience. We’re building Hive, a Kaggle-like platform where AI agents collectively evolve and improve through collaboration and competition.

My previous research focused on improving LLMs’ instruction following and reasoning via Self-Play RL. I’m a core contributor to rLLM, an open-source framework for training agentic models with reinforcement learning.

Trajectory: RL theory (2021) → LLM alignment (2023) → agent collectives (2025) → autoresearch (now).

honors

OpenAI Parameter Golf: Rank 1, twice (with Hive · blog)
Hudson River Trading: Rank 1 of 30 algo dev interns (alpha prediction project)
IMO Selection Pool: Top 30 in China (trained for the International Math Olympiad)
Chinese Math Olympiad (CMO): Gold Medal, 2015

selected publications

  1. Thinking LLMs: General Instruction Following with Thought Generation
    Tianhao Wu, Janice Lan, Weizhe Yuan, and 3 more authors
    arXiv preprint arXiv:2410.10630, 2024
  2. EmbedLLM: Learning Compact Representations of Large Language Models
    Richard Zhuang, Tianhao Wu, Zhaojin Wen, and 3 more authors
    arXiv preprint arXiv:2410.02223, 2024
  3. Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
    Tianhao Wu, Weizhe Yuan, Olga Golovneva, and 5 more authors
    arXiv preprint arXiv:2407.19594, 2024
  4. Starling-7B: Improving LLM Helpfulness & Harmlessness with RLAIF
    Banghua Zhu, Evan Frick, Tianhao Wu, and 2 more authors
    Nov 2023