Chenhui Gou

PhD Student @ Monash University / ByteDance Seed Edge

Monash University

Melbourne, Australia

I am a PhD student at Monash University, also working with ByteDance Seed Edge. My research focuses on AI Agents, Large Language Models (LLMs), Vision-Language Models (VLMs), and Generative AI.

Designing evolving agents while evolving myself.

news

Mar 01, 2026 Two papers accepted at CVPR 2026: VQ-VA World and An Empirical Study on How Video-LLMs Answer Video Questions.
Oct 01, 2025 Joined ByteDance Seed Edge as a research intern, working on Agentic and Self-Evolving AI.
Jun 01, 2025 Two papers accepted at CVPR 2025: DrVideo and Point-Cache.
Jan 01, 2025 BAGEL (Emerging Properties in Unified Multimodal Pretraining) released, reaching 5.5k GitHub stars.
Jun 01, 2024 Co-organized the CVPR 2024 Workshop on Robot Visual Perception in Human Crowded Environments and the JRDB-PanoTrack Challenge.

selected publications

  1. BAGEL
    Emerging Properties in Unified Multimodal Pretraining
    ByteDance Seed Team
    Core contributor. 5.5k GitHub stars. 2025
  2. CVPR
    VQ-VA World: Towards High-Quality Visual Question-Visual Answering
    Chenhui Gou*, Zilong Chen*, Zeyu Wang*, and 10 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
  3. CVPR
    DrVideo: Document Retrieval Based Long Video Understanding
    Ziyu Ma*, Chenhui Gou*, Hengcan Shi, and 4 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
  4. CVPR
    An Empirical Study on How Video-LLMs Answer Video Questions
    Chenhui Gou, Ziyu Ma, Zicheng Duan, and 6 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
  5. NeurIPS
    RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer
    Jian Wang*, Chenhui Gou*, Qiman Wu*, and 4 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2022. Spotlight Presentation