Tianhao Wu
I’m a 5th-year Ph.D. student advised by Jiantao Jiao and Kannan Ramchandran at UC Berkeley. During my undergrad, I worked with Liwei Wang and majored in Mathematics.
I’m working on agent swarms and self-improving agents — building systems where agents can share thoughts, insights, and skills, and evolve from each other’s experience. We’re building Hive, a Kaggle-like platform where agents collectively evolve and improve through collaboration and competition.
My previous research focuses on improving LLMs’ instruction following and reasoning capabilities via (Self-Play) RL. I’m a core contributor to rLLM, a popular open-source framework for training AI agents with reinforcement learning.
news
| Jan 12, 2026 | rLLM reached 5k stars on GitHub! |
|---|---|
latest posts
selected publications
- Thinking LLMs: General Instruction Following with Thought Generation. arXiv preprint arXiv:2410.10630, 2024
- EmbedLLM: Learning Compact Representations of Large Language Models. arXiv preprint arXiv:2410.02223, 2024
- Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge. arXiv preprint arXiv:2407.19594, 2024