About me

I am a Postdoctoral Fellow at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Wei Wang, and a Researcher at Alibaba Group, working on AI infrastructure and distributed systems for large language models and world models. I received my Ph.D. from Nanyang Technological University (NTU), supervised by Prof. Yonggang Wen and co-supervised by Prof. Tianwei Zhang. I received my B.Eng. from Beihang University in 2019.

My current research focuses on RL post-training systems, agentic AI infrastructure, and world model training and serving, with broader interests in scheduling and system optimization across the full lifecycle of large model workloads. I am actively working on efficient infrastructure for world model training and serving at scale and on robust systems for agentic RL. More broadly, I aim to build efficient, robust, and scalable infrastructure for large-scale AI.

I am a core contributor to ROLL, an open-source reinforcement learning post-training framework (3K+ GitHub stars).

Research Interests:

RL Post-Training  /  Agent Infrastructure  /  World Model Training  /  MLSys  /  LLM Training  /  Cluster Scheduling

📧 Email: csgaowei@ust.hk    Google Scholar    GitHub

News

  • [2026.05] Preprint DisagFusion released on arXiv.
  • [2026.05] Preprint ROSE released on arXiv.
  • [2026.05] Two papers conditionally accepted to OSDI 2026: RollArt and Weave.
  • [2026.04] Preprint Crab released on arXiv.
  • [2026.03] Paper ResiHP accepted to HPDC 2026 (corresponding author).
  • [2025.12] Paper RollPacker accepted to NSDI 2026.
  • [2025.12] Technical report ROME released on arXiv.
  • [2025.10] Technical report ROLL Flash released on arXiv.
  • [2025.06] Technical report ROLL released on arXiv.
  • [2025.04] Paper Rethinking KV Cache Compression accepted to MLSys 2025.
  • [2025.01] Paper IceFrog accepted to IEEE TPDS 2025. [CCF-A]

Selected Publications

[RollArt] Wei Gao*, Yuheng Zhao* et al., “RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure”, OSDI 2026. [CCF-A]

[RollPacker] Wei Gao*, Yuheng Zhao* et al., “RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching”, NSDI 2026. [CCF-A]

[ResiHP] Tenghui Ma, Jihu Guo, Wei Gao† et al., “ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism”, HPDC 2026. [CCF-A]

[KV Cache] Wei Gao*, Xinyu Zhou* et al., “Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving”, MLSys 2025.

[IceFrog] Wei Gao et al., “IceFrog: A Layer-Elastic Scheduling System for Deep Learning Training in GPU Clusters”, IEEE TPDS 2025. [CCF-A]

(* Equal contribution  /  † Corresponding author)   See full list on the Publications page.