📌 My research interests are multi-modal learning, large language/vision models, and 3D computer vision.
✉️ I anticipate graduating in 2025 and am open to both academic and industrial research positions in North America and Asia. If you are interested, please feel free to contact me.
✉️ I'm also looking for self-motivated undergraduate and graduate students for research collaboration.
Education
[2017-2021] 🎉 I received my B.E. degree from Peking University, awarded Outstanding Graduate (Top 5%).
[2020-2021] I was a visiting student at the University of Pennsylvania, supervised by Prof. Jianbo Shi.
[2021-Now] 💪 I'm pursuing my Ph.D. at MMLab, CUHK, supervised by Prof. Hongsheng Li and Prof. Xiaogang Wang.
[2021-2024] I worked as a research intern at Shanghai AI Lab, supervised by Dr. Peng Gao.
[2024-Now] I'm working as a research intern at ByteDance, Seattle, supervised by Dr. Chunyuan Li.
News
[2024-07] Four papers accepted by ECCV 2024
[2024-05] Three papers accepted by ICML 2024
[2024-03] Seven papers accepted by CVPR 2024, two of which were selected as Highlights 🎉
[2024-02] One paper accepted by ICRA 2024
[2024-01] Four papers accepted by ICLR 2024
[2023-12] Four papers accepted by AAAI 2024
[2023-09] One paper accepted by NeurIPS 2023
[2023-08] Two papers accepted by IJCV 2023
[2023-07] Five papers accepted by ICCV 2023
[2023-04] One paper accepted by IJCAI 2023
[2023-02] Six papers accepted by CVPR 2023
[2022-11] Two papers accepted by AAAI 2023, one of which won the Best Student Paper award 🎉
[2022-09] One paper accepted by NeurIPS 2022
[2022-07] Three papers accepted by ECCV 2022
[2022-03] One paper accepted by CVPR 2022
[2021-09] One paper accepted by NeurIPS 2021
Selected Preprints and Projects
* Equal contribution, # Corresponding author
LLaVA-OneVision: Easy Visual Task Transfer
B Li, Y Zhang, D Guo, R Zhang, F Li, H Zhang, K Zhang, Y Li, Z Liu, C Li
🔥 The new generation of LLaVA models for multi-modal learning
🔥 Code [1.4k+ Stars 🌟]
MAVIS: Mathematical Visual Instruction Tuning
R Zhang*, X Wei*, D Jiang, Y Zhang, Z Guo, C Tong, J Liu, A Zhou, B Wei, S Zhang, P Gao, H Li
📐 The first public large-scale multi-modal mathematical dataset for tuning large models
🔥 Code [50+ Stars 🌟]
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines
D Jiang*, R Zhang*#, Z Guo, Y Wu, J Lei, P Qiu, P Lu, Z Chen, G Song, Y Liu, P Gao, C Li, H Li
🔥 The first multi-modal search engine pipeline and benchmark, surpassing Perplexity Pro
🔥 Code [350k+ Stars 🌟]
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
Z Guo*, R Zhang*#, X Zhu, Y Tang, X Ma, J Han, K Chen, P Gao, X Li#, H Li, P Heng
🧠 A 3D multi-modal model for general 3D learning, Point-Bind, and the first 3D large language model, Point-LLM
🔥 Code [300+ Stars 🌟]
Selected Publications
* Equal contribution, # Corresponding author
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
R Zhang*, D Jiang*, Y Zhang*, H Lin, Z Guo, P Qiu, A Zhou, P Lu, KW Chang, P Gao, H Li
📐 The first benchmark to evaluate the real capabilities of MLLMs for visual mathematical reasoning
ECCV 2024, 🔥 Code [100k+ Stars 🌟]
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
R Zhang*, X Hu*, B Li, S Huang, H Deng, H Li, Y Qiao, P Gao#
🚀 Combines large models (GPT, CLIP, DINO, DALL-E) for image understanding
CVPR 2023, Code [300+ Stars 🌟]
Selected Awards
[2021-06] Outstanding Graduate, Peking University (Top 5%)
[2020-09] Academic Excellence Scholarship (Ranked 1st/73)
[2020-09] Merit Student PaceSetter, Peking University (Ranked 1st/73)
[2019-09] Academic Excellence Scholarship (Ranked 4th/73)
[2019-09] Merit Student, Peking University (Ranked 4th/73)
[2016-07] China Youth Technology Innovation Award (sole recipient in the province)
[2016-10] 1st Prize in Provincial Chinese Physics Olympiad (Ranked 18th in Province)
[2015-10] 2nd Prize in The Chinese 15th Awarding Program for Future Scientist (Ranked 1st in Province)
[2013-03] 1st Prize in Provincial China Adolescent Robotics Competition (Ranked 1st in Province)
Hobbies
Soccer ⚽️, Movies 🎬, Singing 🎤, Piano 🎹, Violin 🎻, Games 🎮, Snorkeling 🤿, HotToys 🦸‍♂️