Tianhao Wu

I’m a 5th-year Ph.D. student advised by Jiantao Jiao and Kannan Ramchandran at UC Berkeley. During my undergrad, I worked with Liwei Wang and majored in Mathematics.

I’m working on agent swarms and self-improving agents — building systems where agents can share thoughts, insights, and skills, and evolve from each other’s experience. We’re building Hive, a Kaggle-like platform where agents collectively evolve and improve through collaboration and competition.

My previous research focuses on improving LLMs’ instruction following and reasoning capabilities via (Self-Play) RL. I’m a core contributor to rLLM, a popular open-source framework for training AI agents with reinforcement learning.

news

Jan 12, 2026 rLLM reached 5k stars on GitHub!

selected publications

  1. Thinking LLMs: General Instruction Following with Thought Generation
    Tianhao Wu, Janice Lan, Weizhe Yuan, and 3 more authors
    arXiv preprint arXiv:2410.10630, 2024
  2. EmbedLLM: Learning Compact Representations of Large Language Models
    Richard Zhuang, Tianhao Wu, Zhaojin Wen, and 3 more authors
    arXiv preprint arXiv:2410.02223, 2024
  3. Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
    Tianhao Wu, Weizhe Yuan, Olga Golovneva, and 5 more authors
    arXiv preprint arXiv:2407.19594, 2024
  4. Starling-7B: Improving LLM Helpfulness & Harmlessness with RLAIF
    Banghua Zhu, Evan Frick, Tianhao Wu, and 2 more authors
    Nov 2023