Shaoguang Wang
MPhil Student in Artificial Intelligence
The Hong Kong University of Science and Technology (HKUST), Guangzhou
Advised by Prof. Hui Xiong and Prof. Xuming Hu
My research centers on Multimodal Large Language Models (MLLMs). It spans efficient long video understanding (query-aware keyframe selection, token-efficient prompting, and multimodal reasoning for Video-QA), Vision-Language-Action models for embodied AI, and AI for Science. My recent work has been published at CVPR and NeurIPS.
Selected Publications
- Less is More: Token-Efficient Video-QA via Adaptive Frame-Pruning and Semantic Graph Integration — CVPR Findings 2026. PDF · Code
- Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification — NeurIPS 2025. PDF · Code
For the complete, up-to-date publication list, see my Google Scholar profile.
Contact
Email: shaoguangwang9@gmail.com
Google Scholar: profile
GitHub: @shaoguangwang