About me

I am a Postdoctoral Fellow at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Wei Wang, and a Researcher at Alibaba Group, working on AI infrastructure and distributed systems for large language models and world models. I received my Ph.D. from Nanyang Technological University (NTU), where I was supervised by Prof. Yonggang Wen and co-supervised by Prof. Tianwei Zhang. Before that, I received my B.Eng. from Beihang University in 2019.

My current research focuses on RL post-training systems, agentic AI infrastructure, and world model training and serving, with broader interests in scheduling and system optimization across the full lifecycle of large model workloads. I am actively building efficient infrastructure for training and serving world models at scale, as well as robust systems for agentic RL. More broadly, I aim to make infrastructure for large-scale AI efficient, robust, and scalable.

I am a core contributor to ROLL, an open-source reinforcement learning post-training framework (3K+ GitHub stars).

📧 Email: csgaowei@ust.hk

Google Scholar

GitHub

News

[2026.07] Paper OctoPipe accepted to SC 2026 (corresponding author).
[2026.06] Preprints SpecGen and Spotlight released on arXiv.
[2026.05] Preprints DisagFusion and ROSE released on arXiv.
[2026.04] Preprint Crab released on arXiv.
[2026.04] Paper ResiHP accepted to HPDC 2026 (corresponding author).
[2026.03] Two papers conditionally accepted to OSDI 2026: RollArt and Weave.
[2025.12] Paper RollPacker accepted to NSDI 2026.
[2025.12] Technical report ROME released on arXiv.
[2025.10] Technical report ROLL Flash released on arXiv.
[2025.06] Technical report ROLL released on arXiv.
[2025.04] Paper Rethinking KV Cache Compression accepted to MLSys 2025.

Selected Publications

[RollArt] Wei Gao*, Yuheng Zhao* et al., “RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure”, OSDI 2026. [CCF-A]

[RollPacker] Wei Gao*, Yuheng Zhao* et al., “RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching”, NSDI 2026. [CCF-A]

[ResiHP] Tenghui Ma, Jihu Guo, Wei Gao† et al., “ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism”, HPDC 2026. [CCF-A]

[OctoPipe] Jihu Guo, Tenghui Ma, Wei Gao† et al., “OctoPipe: Reducing Pipeline Bubbles for Heterogeneous Models via Co-Optimizing Partitioning, Placement, and Scheduling”, SC 2026. [CCF-A]

[KV Cache] Wei Gao*, Xinyu Zhou* et al., “Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving”, MLSys 2025.

(* Equal contribution / † Corresponding author) See the full list on the Publications page.