Hi, I am a second-year PhD student in the Department of Computer Science and Engineering at The Hong Kong University of Science and Technology, where I am fortunate to be advised by Prof. Junxian He. Before that, I received my bachelor's degree in Computer Science from Shanghai Jiao Tong University in 2023.
Research Interests
My research primarily focuses on large language models, particularly on advancing their reasoning capabilities and multimodal understanding. To this end, my interests lie in:
- Enhancing reasoning and planning abilities through self-improvement and reinforcement learning (RL) techniques. (B-STaR, simpleRL)
- Developing reliable evaluation methods for language models. (C-Eval, LLM-Compression-Intelligence)
- Improving the architectures and training methods of multimodal models to strengthen cross-modal understanding.
I am open to any collaboration 🤗
Publications
See my most recent publications on Google Scholar.
* denotes co-first authors
7B Model and 8K Examples: Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient
Weihao Zeng *, Yuzhen Huang *, Wei Liu, Keqing He, Qian Liu, Zejun Ma, Junxian He
Notion blog. [notion] [github] [Hugging Face]
- Trains a 7B model on only 8K MATH examples, achieving strong performance in complex mathematical reasoning.
- Demonstrates that a 7B model can develop long CoT and self-reflection through RL with a simple design.
- Outperforms methods that use over 50× more data and complex architectures.
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Kashun Shum *, Yuzhen Huang *, Hongjian Zou, Ding Qi, Yixuan Liao, Xiaoxin Chen, Qian Liu, Junxian He
arXiv 2025. [arxiv] [github] [dataset]
- Leverages compression efficiency to identify high-quality data that enhances downstream performance.
- Introduces PRESELECT, a lightweight data selection method based on predictive strength.
- Demonstrates a 10x reduction in compute requirements and significant performance improvements.
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Weihao Zeng *, Yuzhen Huang *, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He
ICLR 2025. [arxiv] [github]
- Quantitatively analyzes the dynamics of exploration and exploitation during self-improvement.
- Introduces B-STaR, a self-taught reasoning framework that autonomously adjusts its configurations across iterations.
- Balances exploration and exploitation, leading to superior performance.
Compression Represents Intelligence Linearly
Yuzhen Huang *, Jinghan Zhang *, Zifei Shan, Junxian He
COLM 2024. [arxiv] [github] [dataset]
- Investigates the linear correlation between compression and intelligence in LLMs.
- Provides evidence for the belief that superior compression is indicative of greater intelligence.
- Proposes that compression efficiency serves as an unsupervised and reliable metric for assessing LLMs' abilities (see the sketch below).
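For readers curious what this metric looks like in practice, below is a minimal sketch, assuming a Hugging Face causal LM, that scores a text corpus in bits per character (lower means better compression). It is my own illustration rather than the paper's released evaluation code; the model name and corpus are placeholders.

```python
# Minimal bits-per-character (BPC) sketch with a Hugging Face causal LM.
# Illustrative only: the model name and corpus below are placeholders.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def bits_per_character(model_name: str, texts: list[str]) -> float:
    """Average negative log-likelihood of `texts`, normalized to bits per character."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()
    total_nll_nats, total_chars = 0.0, 0
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt")
            out = model(**enc, labels=enc["input_ids"])
            # out.loss is the mean token-level NLL (natural log) over the
            # seq_len - 1 next-token predictions, so un-average before summing.
            n_predicted = enc["input_ids"].shape[1] - 1
            total_nll_nats += out.loss.item() * n_predicted
            total_chars += len(text)
    return total_nll_nats / math.log(2) / total_chars

# Example usage (placeholder model and corpus):
# print(bits_per_character("gpt2", ["The quick brown fox jumps over the lazy dog."]))
```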
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models
Yuzhen Huang *, Yuzhuo Bai *, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, Junxian He
NeurIPS 2023 (Datasets and Benchmarks track). [arxiv] [github] [website] [dataset]
- The first comprehensive Chinese evaluation suite for LLMs.
- Conducts a thorough evaluation of the most advanced LLMs.
- Over 9.8M downloads on Hugging Face and more than 100 models on the leaderboard.
Experiences
Academia
- 2024.02 - present PhD student, Department of CSE, HKUST, Hong Kong SAR, China.
- 2019.09 - 2023.06 Undergraduate, Computer Science, Shanghai Jiao Tong University, Shanghai, China.
Industry
- 2023.11 - 2024.01 Research Intern, WeChat, Tencent.
Service
Reviewer: NeurIPS 2024, ICLR 2025, ICML 2025
Invited Talks
- Mar 2025, Georgia Tech PAIR, Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient.
- Feb 2025, Apple AIML, Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient.
- May 2024, BAAI, Compression Represents Intelligence Linearly. [video]