Chenhui Gou

PhD Student @ Monash University / ByteDance Seed Edge


Monash University

Melbourne, Australia

I am a PhD student at Monash University, also working with ByteDance Seed Edge. My research focuses on AI Agents, Large Language Models (LLMs), Vision-Language Models (VLMs), and Generative AI.

Designing evolving agents while evolving myself.
AI Agents · LLMs · VLMs · Generative AI
AI Agents · Foundation LLM/VLM · Generative Model

news

Mar 01, 2026 Two papers accepted at CVPR 2026: VQ-VA World and An Empirical Study on How Video-LLMs Answer Video Questions.
Jan 01, 2025 BAGEL (Emerging Properties in Unified Multimodal Pretraining) released.

🚀 experience

2020-12 – 2021-06
Aibee Inc
Research Intern
Beijing, China

2021-06 – 2021-12
NIO Inc, Autonomous Driving
Research Intern
Beijing, China

2021-12 – 2022-08
Baidu, Vision Technology Department
Research Intern
Beijing, China

2022-08 – 2022-10
University of Technology Sydney
Research Project
Sydney, Australia

2022-07 – 2023-07
Australian National University
Research Project
Canberra, Australia

2022-12 – 2023-03
Sensetime Inc
Research Intern
Shenzhen, China

2023-06 – 2024-12
Vision-CAIR Group, KAUST
Research Intern

2024-12 – 2025-10
ByteDance Seed VLM-BAGEL Group
Research Intern

2025-10 – Now
ByteDance Seed Edge
Research Intern

selected publications

  1. Tech Report
    Seed1.5-VL Technical Report
    ByteDance Seed Team
    Contributor, 2025
  2. BAGEL
    Emerging Properties in Unified Multimodal Pretraining
    ByteDance BAGEL Team
    Core contributor, 2025
  3. CVPR
    VQ-VA World: Towards High-Quality Visual Question-Visual Answering
    Chenhui Gou*, Zilong Chen*, Zeyu Wang*, and 10 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
  4. CVPR
    DrVideo: Document Retrieval Based Long Video Understanding
    Ziyu Ma*, Chenhui Gou*, Hengcan Shi, and 4 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
  5. CVPR
    An Empirical Study on How Video-LLMs Answer Video Questions
    Chenhui Gou, Ziyu Ma, Zicheng Duan, and 6 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
  6. NeurIPS
    RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer
    Jian Wang*, Chenhui Gou*, Qiman Wu*, and 4 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2022. Spotlight Presentation.
