Kaiyan Zhang
PhD Candidate, Tsinghua University
I am a third-year PhD student at the Department of Electronic Engineering, Tsinghua University, under the guidance of Professor Bowen Zhou. I previously earned a Master’s degree in Computer Science and Technology in 2022 from the Harbin Institute of Technology (HIT), where I was supervised by Weinan Zhang and Ting Liu in the HIT-SCIR lab.
My research focuses on the alignment and collaboration of large language models (LLMs), with the aim of developing trustworthy and scalable collaborative intelligence systems. I am currently developing an LLM-based multi-agent reinforcement training framework to advance reasoning capabilities beyond basic levels (R1 and O1). My research also explores improving collaboration among agents in real-world agentic tasks.
I am open to collaborations and discussions on these topics (such as multi-agent systems in COLM 2024 / ACL 2024 / Arxiv 2406, reinforcement learning in Arxiv 2412 / Arxiv 2502, test-time scaling in ICLR 2025 / Arxiv 2502 / Arxiv 2503 / Arxiv 2504, and more) and would be glad to connect with others who share these interests.
news
Date | News |
---|---|
Mar 31, 2025 | We released collections of RL recipes (see Awesome-RL-Reasoning-Recipes). |
Mar 24, 2025 | Video-T1 was released, the first work to evaluate test-time scaling (TTS) on video generation (see Video-T1). |
Feb 10, 2025 | We explored compute-optimal test-time scaling (see compute-optimal-tts). |
Jan 23, 2025 | One first-author paper has been accepted to ICLR 2025 (see OpenPRM). |
Dec 24, 2024 | One paper has been accepted to AAAI 2025 (Congrats to Xinwei). |
Sep 27, 2024 | One first-author paper has been accepted to the NeurIPS 2024 D&B Track (see UltraMedical). |
Sep 20, 2024 | One paper has been accepted to EMNLP 2024 (see LPA). |
Jul 10, 2024 | One co-first-author paper has been accepted to COLM 2024 (see LLM4BioHypoGen). |
May 16, 2024 | Two papers have been accepted to ACL 2024 (one first-author; see CoGenesis). |
Mar 13, 2024 | One paper has been accepted to NAACL 2024 (see PAD). |
Oct 06, 2023 | One first-author paper has been accepted to EMNLP 2023 (see CRaSh). |
selected publications
- ICLR 2025. OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees. The Thirteenth International Conference on Learning Representations, 2025
- Arxiv