Bolin Lai

Hi! I am a third-year PhD student in the Machine Learning Program at Georgia Institute of Technology, advised by Prof. James Rehg and co-advised by Prof. Zsolt Kira. Prior to starting my PhD, I received my Master's degree in ECE and my Bachelor's degree in Information Engineering from Shanghai Jiao Tong University, where I worked with Prof. Ya Zhang during my master's studies.

Email  /  Google Scholar  /  Github  /  LinkedIn  /  Twitter

I'm looking for self-motivated graduate/undergraduate students to collaborate with. Don't hesitate to reach out if you are interested.


Research Interests

My research interests lie in Multi-Modal Learning, Generative Models and Video Understanding. Currently, I'm focusing on advancing multi-modal understanding and generation through the integration of Large Language Models (LLMs) and Diffusion Models (DMs), aiming to connect and leverage the latent representation spaces of these two model architectures.

News

  • Jul 2024: Two first-author papers were accepted by ECCV! Please check out our latest work: LEGO (for action generation) and CSTS (for gaze forecasting). Thanks to all the co-authors!
  • May 2024: I started my second internship at GenAI, Meta in the Bay Area.
  • Mar 2024: One co-authored paper was accepted by CVPR (Oral). See you in Seattle!
  • Jul 2023: Our extension of our prior work GLC was accepted by IJCV!
  • May 2023: I started my internship at GenAI, Meta in the Bay Area!
  • Apr 2023: I successfully passed the qualifying exam.
  • Mar 2023: One paper was accepted to the Findings of ACL 2023. Please check out our new dataset for social understanding: Werewolf Among Us.
  • Nov 2022: We won the Best Student Paper Prize at BMVC. Thanks to all co-authors!
  • Sep 2022: Our work GLC was accepted by BMVC 2022!
  • Jan 2022: I started working with Prof. James Rehg at Georgia Tech.

Publications

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu
ECCV, 2024
Webpage / Paper / Code / Dataset / Supplementary

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu*, James M. Rehg*
ECCV, 2024
Webpage / Paper / Code / Data Split / Supplementary

Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations

Sangmin Lee, Bolin Lai, Fiona Ryan, Bikram Boote, James M. Rehg
CVPR (Oral), 2024 [Acceptance Rate 0.8%]
Webpage / Paper / Code / Split & Annotations / Supplementary

Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduction Games

Bolin Lai*, Hongxin Zhang*, Miao Liu*, Aryan Pariani*, Fiona Ryan, Wenqi Jia, Shirley Anugrah Hayati, James M. Rehg, Diyi Yang
ACL Findings, 2023
Webpage / Paper / Code / Dataset / Video

In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation and Beyond

Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg
International Journal of Computer Vision (IJCV), 2023
Webpage / Paper / Code

In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg
BMVC, 2022 (Best Student Paper)
Webpage / Paper / Code / Data Split / Supplementary / Video / Poster

---------------- Research before my PhD, mainly about medical image analysis ----------------

Semi-supervised Vein Segmentation of Ultrasound Images for Autonomous Venipuncture

Yu Chen, Yuxuan Wang, Bolin Lai, Zijie Chen, Xu Cao, Nanyang Ye, Zhongyuan Ren, Junbo Zhao, Xiao-Yun Zhou, Peng Qi
IROS, 2021
[Paper]

Hetero-Modal Learning and Expansive Consistency Constraints for Semi-Supervised Detection from Multi-Sequence Data

Bolin Lai, Yuhsuan Wu, Xiao-Yun Zhou, Peng Wang, Le Lu, Lingyun Huang, Mei Han, Jing Xiao, Heping Hu, Adam P. Harrison
Machine Learning in Medical Imaging, 2021
[Paper]

Liver Tumor Localization and Characterization from Multi-phase MR Volumes Using Key-Slice Prediction: A Physician-Inspired Approach

Bolin Lai*, Yuhsuan Wu*, Xiaoyu Bai*, Xiao-Yun Zhou, Peng Wang, Jinzheng Cai, Yuankai Huo, Lingyun Huang, Yong Xia, Jing Xiao, Le Lu, Heping Hu, Adam P. Harrison
International Workshop on PRedictive Intelligence In MEdicine, 2021
[Paper]

Spatial Regularized Classification Network for Spinal Dislocation Diagnosis

Bolin Lai, Shiqi Peng, Guangyu Yao, Ya Zhang, Xiaoyun Zhang, Yanfeng Wang, Hui Zhao
Machine Learning in Medical Imaging, 2019
[Paper]

Service

Reviewer for
- Computer Vision and Pattern Recognition Conference (CVPR)
- European Conference on Computer Vision (ECCV)
- The Association for Computational Linguistics (ACL)
- Empirical Methods in Natural Language Processing (EMNLP)
- International Journal of Computer Vision (IJCV)
- Association for the Advancement of Artificial Intelligence (AAAI)
- International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)
- Journal of Biomedical and Health Informatics (JBHI)
- IEEE Signal Processing Letters (SPL)

Served as a teaching assistant for ECE4871 at Georgia Tech in 2021 and 2022.
