Tianhao Wu

I’m a 3rd-year Ph.D. student at UC Berkeley, advised by Jiantao Jiao and Kannan Ramchandran. As an undergraduate, I majored in Mathematics and worked with Liwei Wang.

My research focuses on improving LLMs’ instruction following and reasoning capabilities via (Self-Play) RL. My ambition is to construct large-scale models that can solve complex tasks requiring multi-step reasoning.

I’m also working on AI Society, a group of agents that can link together in a modular fashion to form a more capable collective intelligence. This decentralized paradigm could mitigate the computing demands that limit centralized AI systems today.

news

Jan 17, 2024	I’ll be joining Meta as a research intern in summer 2024.

latest posts

Nov 28, 2023	Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF
Oct 16, 2023	Rethinking the Role of PPO in RLHF – The Berkeley Artificial Intelligence Research Blog

selected publications

Thinking LLMs: General Instruction Following with Thought Generation

Tianhao Wu, Janice Lan, Weizhe Yuan, and 3 more authors

arXiv preprint arXiv:2410.10630, 2024

EmbedLLM: Learning Compact Representations of Large Language Models

Richard Zhuang, Tianhao Wu, Zhaojin Wen, and 3 more authors

arXiv preprint arXiv:2410.02223, 2024

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge

Tianhao Wu, Weizhe Yuan, Olga Golovneva, and 5 more authors

arXiv preprint arXiv:2407.19594, 2024

Starling-7B: Improving LLM Helpfulness & Harmlessness with RLAIF

Banghua Zhu, Evan Frick, Tianhao Wu, and 2 more authors

Nov 2023
