Chenhui Gou
PhD Student @ Monash University / ByteDance Seed Edge
Monash University
Melbourne, Australia
I am a PhD student at Monash University, also working with ByteDance Seed Edge. My research focuses on AI Agents, Large Language Models (LLMs), Vision-Language Models (VLMs), and Generative AI.
Designing evolving agents while evolving myself.
AI Agents · LLMs · VLMs · Generative AI
Data · Model · Agent · Evaluation
news
| Date | News |
|---|---|
| Mar 01, 2026 | Two papers accepted at CVPR 2026: VQ-VA World and An Empirical Study on How Video-LLMs Answer Video Questions. |
| Jan 01, 2025 | BAGEL (Emerging Properties in Unified Multimodal Pretraining) released, reaching 5.5k GitHub stars. |
selected publications
- [Tech Report] BAGEL (Emerging Properties in Unified Multimodal Pretraining). Core contributor. 5.5k GitHub stars. 2025.
- [CVPR] VQ-VA World: Towards High-Quality Visual Question-Visual Answering. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
- [CVPR] DrVideo: Document Retrieval Based Long Video Understanding. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
- [CVPR] An Empirical Study on How Video-LLMs Answer Video Questions. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
- [NeurIPS] RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer. In Advances in Neural Information Processing Systems (NeurIPS), 2022. Spotlight Presentation.