Welcome to my website! I am an Applied Scientist at Amazon AGI, working on LLM and VLM post-training (SFT+RL). I completed my PhD in the Machine Learning Program at the Georgia Institute of Technology, advised by Prof. James Rehg and co-advised by Prof. Zsolt Kira. During my PhD, I was also a visiting student in the CS department at UIUC. Before that, I received my Master's degree in ECE and my Bachelor's degree in Information Engineering from Shanghai Jiao Tong University, where I worked with Prof. Ya Zhang during my Master's studies.
I interned at Meta GenAI (now Meta Superintelligence Labs) three times, working on generative models spanning LLMs, unified understanding and generation, image/video diffusion models, and fundamental research on tokenizers. Please check my employment experience and publications below for details.
My expertise and interests lie in two threads: (1) LLM/VLM post-training, including SFT and RL; (2) image/video generation with diffusion and autoregressive architectures. I also have extensive engineering experience with large-scale codebases, distributed training, and data pipelines.
My career goal is to build omni multimodal systems that can understand, reason, and generate across text, image, video, and audio -- by integrating LLM planning/reasoning agents and high-fidelity diffusion backends into one autoregressive architecture.
I'm always open to academic collaboration with self-motivated graduate and undergraduate students. Don't hesitate to reach out if you are interested in my research.
News
Feb 2026: 🎉 Two papers were accepted by CVPR2026. Please check out FreqWarm (first author) and Omni-MMSI (mentor). See you @Denver!⛰️
Feb 2026: 🎉 The work Online-MMSI that I mentored was accepted by TMLR.
Sep 2025: 🎉 One paper was accepted by NeurIPS2025.
Aug 2025: 👨‍💻 I'm on the job market now! Seeking a position starting in Dec. 2025 or Jan. 2026.
Jun 2025: 🏅 Our LEGO paper was recognized as a Distinguished Paper by EgoVis at CVPR2025.
May 2025: I started my internship at Meta AGI Foundations in Seattle working with Dr. Ishan Misra.
Mar 2025: 🎉 I successfully passed my thesis proposal. Thanks to all committee members (James Rehg, Zsolt Kira, James Hays, Judy Hoffman)! I'll be on the job market in September 2025. 👨‍💻
Feb 2025: 🎉 I have one first-author paper (InstaManip) and two co-author papers (VideoMindPalace and SocialGesture) accepted by CVPR 2025. Thanks to all collaborators. See you @Nashville.🎸
Oct 2024: 🔍 We released a thorough survey on action anticipation. Please check it out if you are interested in this field.
Oct 2024: 🏅 Our LEGO paper was nominated as a Best Paper Finalist @ECCV2024. Congratulations to all co-authors!
Aug 2024: 🎤 Our LEGO paper was selected for an Oral presentation.
Jul 2024: 🎉 Two first-author papers were accepted by ECCV! Please check out our latest work: LEGO (for action generation) and CSTS (for gaze forecasting). Thanks to all the co-authors!
May 2024: 👨‍💻👨‍💻 I started my second internship at GenAI, Meta in the Bay Area.
Mar 2024: 🎉 One co-author paper was accepted by CVPR (Oral). See you in Seattle!
Jul 2023: 🎉 Our extension of our prior work GLC was accepted by IJCV!
May 2023: 👨‍💻 I started my internship at GenAI, Meta in the Bay Area!
Apr 2023: 🎉 I successfully passed the qualifying exam.
Mar 2023: 🎉 One paper was accepted to the Findings of ACL2023. Please check out our new dataset for social understanding: Werewolf Among Us.
Nov 2022: 🏅 We won the Best Student Paper Prize at BMVC. Thanks to all co-authors!
Sep 2022: 🎉 Our work GLC was accepted by BMVC 2022!
Jan 2022: I started working with Prof. James Rehg at Georgia Tech.