Bolin Lai

Hi! I am a third-year PhD student in the Machine Learning Program at the Georgia Institute of Technology, advised by Prof. James Rehg and co-advised by Prof. Zsolt Kira. Before starting my PhD, I received my Master's degree in ECE and my Bachelor's degree in Information Engineering from Shanghai Jiao Tong University, where I worked with Prof. Ya Zhang during my master's studies.

Email  /  Google Scholar  /  GitHub  /  LinkedIn  /  Twitter

I'm looking for self-motivated graduate/undergraduate students to collaborate with. Don't hesitate to reach out if you are interested.


Research Interests

My research interests lie in Multi-Modal Learning, Generative Models and Video Understanding. Currently, I'm focusing on joint representation learning of vision and language in videos as well as image/video generation using diffusion models.

News

  • Mar 2024: One co-authored paper was accepted to CVPR (Oral). See you in Seattle!
  • Jul 2023: Our expansion of prior work GLC was accepted by IJCV!
  • May 2023: I started my internship at GenAI, Meta in the Bay Area!
  • Apr 2023: I successfully passed the qualifying exam.
  • Mar 2023: One paper was accepted to the Findings of ACL 2023.
  • Nov 2022: We won the Best Student Paper Prize at BMVC. Thanks to all co-authors!
  • Sep 2022: Our work GLC was accepted by BMVC 2022!
  • Jan 2022: I started working with Prof. James Rehg at Georgia Tech.

Publications

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu
Preprint
Website / Paper / Code / Dataset / Supplementary
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu*, James M. Rehg*
Preprint
Website / Paper / Code / Data Split / Supplementary
Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations

Sangmin Lee, Bolin Lai, Fiona Ryan, Bikram Boote, James M. Rehg
CVPR (Oral), 2024 [Acceptance Rate 0.8%]
[Paper]
Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduction Games

Bolin Lai*, Hongxin Zhang*, Miao Liu*, Aryan Pariani*, Fiona Ryan, Wenqi Jia, Shirley Anugrah Hayati, James M. Rehg, Diyi Yang
ACL Findings, 2023
Website / Paper / Code / Dataset
In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation and Beyond

Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg
International Journal of Computer Vision (IJCV), 2023
(Expansion of the BMVC work)
Website / Paper / Code
In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg
BMVC, 2022 (Best Student Paper)
Website / Paper / Code / Data Split / Supplementary / Video / Poster
---------------- Research before my PhD, mainly about medical image analysis ----------------
Semi-supervised Vein Segmentation of Ultrasound Images for Autonomous Venipuncture

Yu Chen, Yuxuan Wang, Bolin Lai, Zijie Chen, Xu Cao, Nanyang Ye, Zhongyuan Ren, Junbo Zhao, Xiao-Yun Zhou, Peng Qi
IROS, 2021
[Paper]
Hetero-Modal Learning and Expansive Consistency Constraints for Semi-Supervised Detection from Multi-Sequence Data

Bolin Lai, Yuhsuan Wu, Xiao-Yun Zhou, Peng Wang, Le Lu, Lingyun Huang, Mei Han, Jing Xiao, Heping Hu, Adam P. Harrison
Machine Learning in Medical Imaging, 2021
[Paper]
Liver Tumor Localization and Characterization from Multi-phase MR Volumes Using Key-Slice Prediction: A Physician-Inspired Approach

Bolin Lai*, Yuhsuan Wu*, Xiaoyu Bai*, Xiao-Yun Zhou, Peng Wang, Jinzheng Cai, Yuankai Huo, Lingyun Huang, Yong Xia, Jing Xiao, Le Lu, Heping Hu, Adam P. Harrison
International Workshop on PRedictive Intelligence In MEdicine, 2021
[Paper]
Spatial Regularized Classification Network for Spinal Dislocation Diagnosis

Bolin Lai, Shiqi Peng, Guangyu Yao, Ya Zhang, Xiaoyun Zhang, Yanfeng Wang, Hui Zhao
Machine Learning in Medical Imaging, 2019
[Paper]

Service

Reviewer for
- Computer Vision and Pattern Recognition Conference (CVPR)
- European Conference on Computer Vision (ECCV)
- Annual Meeting of the Association for Computational Linguistics (ACL)
- International Journal of Computer Vision (IJCV)
- AAAI Conference on Artificial Intelligence (AAAI)
- International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)
- Journal of Biomedical and Health Informatics (JBHI)
- IEEE Signal Processing Letters (SPL)

Served as a teaching assistant for ECE4871 at Georgia Tech in 2021 and 2022.
