🍼 About Me
I am a third-year Ph.D. student (combined Master's-Ph.D. program) at the School of Artificial Intelligence, Nanjing University. I am a member of the LAMDA Group (State Key Laboratory for Novel Software Technology), advised by Associate Professor Han-Jia Ye (叶翰嘉) and Professor De-Chuan Zhan (詹德川).
My research currently focuses on LLM RL Training and LLM Inference Routing.
🥟 Research & Publications
LLM RL Training
What I did: Studied how to build a new kind of Value Model during RL training, and how to learn stably under Off-policy Guidance.
V0: A Generalist Value Model for Any Policy at State Zero
V0.5: Generalist Value Model as a Prior for Sparse RL Rollouts
LongCat-Flash-Thinking-2601 (Contributor)
Spot Me: Bridging the Intention-Execution Gap with Expert-Guided Reinforcement Fine-tuning
LLM Inference Routing [Demo: http://lambda-router.org]
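What I did: Studied how to route instructions at deployment time across open-source/closed-source and small/large models. Our work spans the entire development of Routing.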
Model Spider: Learning to Rank Pre-Trained Models Efficiently
NeurIPS 2023 (Spotlight)
Other Related Applications
[Multimodal LLM Data Engine] ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs
ICLR 2025
[Multimodal LLM Architecture] Wings: Learning Multimodal LLMs without Text-only Forgetting
NeurIPS 2024
[Stable Training in CV] Learning Debiased Representations via Conditional Attribute Interpolation
CVPR 2023
🧩 Internship Experience
Meituan - LongCat Life Agent Team
2025.10 - Present
Alibaba Tongyi - Agent Team
2025.08 - 2025.10
Xiaomi - MiMo-Embodied Team
2025.05 - 2025.08
Alibaba International - Ovis Multimodal LLM Team
2024.03 - 2025.05
🍚 Education Background
Nanjing University, School of Artificial Intelligence
Ph.D. in Computer Science and Technology (Enrolled as Master in 2021)
2023.09 - Present
Nanjing University, Computer Science and Technology Department
B.Sc. in Computer Science and Technology (Minor in Math & Statistics)
2017.09 - 2021.07
🍰 Selected Awards
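National Scholarship 2022
Nanjing University Outstanding Graduate Student Model 2023
Challenge Cup National Bronze Award - Team Leader 2023
Huawei Outstanding Contribution Award 2023
Scholarships from Industrial Bank, Bank of Jiangsu, and others Multiple Years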
🍞 Service Work
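President of the Graduate Student Union, School of Artificial Intelligence, Nanjing University
Standing Graduate Student Representative, Nanjing University
Captain of the School Table Tennis Team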