Kaiyan Zhang (张开颜)
CTO, Frontis.AI · Ph.D., Tsinghua University
I’m currently serving as the CTO of Frontis.AI, where I work on agent self-evolution, recursive self-improvement, and AI for AI. I earned my Ph.D. (2026) from the Department of Electronic Engineering, Tsinghua University, under the guidance of Professor Bowen Zhou. Before that, I earned B.S. (2020) and M.S. (2022) degrees in Computer Science and Technology from the Harbin Institute of Technology (HIT), where I was supervised by Weinan Zhang and Ting Liu in the HIT-SCIR lab.
We are hiring interns! If you are passionate about agent self-evolution, recursive self-improvement, and AI for AI, feel free to reach out. We publish papers and release open-source work.
My mission is to build AI that improves itself — shifting from human-supervised training toward agents that learn from their own experience and recursively bootstrap stronger successors (the ExpertAGI vision). This pursuit runs along two intertwined threads.
The first is the learning machinery for self-improvement: scalable and test-time reinforcement learning, multi-agent training, and reward modeling that let models supervise and improve other models — work like TTRL, SSRL, and MARTI, surveyed in our overview of RL for large reasoning models. The second is putting self-improving agents to work in high-value settings such as AI for AI — charted in our survey on self- to meta-evolution — alongside rigorous benchmarks like NatureBench and EnterpriseClawBench that keep our claims of capability honest.
news
| Date | News |
|---|---|
| Jun 28, 2026 | We release a survey on self- and meta-evolution of self-improving agents: Awesome-Self-Improving-Agents. |
| Jun 24, 2026 | We release two agentic benchmarks: NatureBench (AI for AI) and EnterpriseClawBench (real-world enterprise tasks). |
| Jun 18, 2026 | Two papers are accepted to ECCV 2026, congrats to the collaborators. |
| Apr 04, 2026 | One paper is accepted to ACL 2026, congrats to the collaborators. |
| Jan 26, 2026 | Five papers are accepted to ICLR 2026, congrats to the collaborators. |
| Sep 19, 2025 | TTRL |
| Sep 11, 2025 | Excited to share our new survey paper on RL for Large Reasoning Models. |
| Aug 21, 2025 | One paper is accepted to EMNLP 2025 (see ReviewRL). |
| Aug 15, 2025 | We investigate agentic search RL that does not rely on an external search engine while maintaining strong sim2real generalization (see SSRL). |
| Jun 26, 2025 | Two papers are accepted to ICCV 2025, congrats to the collaborators. |
| May 27, 2025 | We are very excited to release MARTI: A framework for LLM-based Multi-Agent Reinforced Training and Inference (see MARTI). |
| May 16, 2025 | Two papers are accepted to ACL 2025 Main, congrats to the collaborators. |
| May 14, 2025 | Just shared our latest work on TTS, RL and TTRL at QingkeTalk. |
| May 02, 2025 | Four papers are accepted to ICML 2025, congrats to the collaborators. |
| Apr 23, 2025 | We release Test-time Reinforcement Learning (TTRL), which investigates Reinforcement Learning (RL) on data without explicit labels for reasoning tasks in LLMs (see TTRL). |
| Mar 31, 2025 | We release a collection of RL recipes (see Awesome-RL-Reasoning-Recipes). |
| Mar 24, 2025 | Video-T1 is released, the first work to evaluate TTS on video generation (see Video-T1). |
| Feb 10, 2025 | We explore compute-optimal test-time scaling (see compute-optimal-tts). |
| Jan 23, 2025 | One first-author paper is accepted to ICLR 2025 (see OpenPRM). |
| Dec 24, 2024 | One paper is accepted to AAAI 2025 (Congrats to Xinwei). |
| Sep 27, 2024 | One first-author paper is accepted to NeurIPS 2024 D&B Track (see UltraMedical). |
| Sep 20, 2024 | One paper is accepted to EMNLP 2024 (see LPA). |
| Jul 10, 2024 | One co-first-author paper is accepted to COLM 2024 (see LLM4BioHypoGen). |
| May 16, 2024 | Two papers are accepted to ACL 2024 (one first-author; see CoGenesis). |
| Mar 13, 2024 | One paper is accepted to NAACL 2024 (see PAD). |
| Oct 06, 2023 | One first-author paper is accepted to EMNLP 2023 (see CRaSh). |
selected publications
- TIR-Agent: Training an Explorative and Efficient Agent for Image Restoration. European Conference on Computer Vision (ECCV), 2026.
- Self-Improving Agents in the Era of Experience: A Survey of Self- to Meta-Evolution. arXiv preprint, 2026.
- OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees. The Thirteenth International Conference on Learning Representations (ICLR), 2025.