Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao*, Sizhe Dang*,
Haishan Ye,
Guang Dai,
Yi Qian,
Ivor W. Tsang
ICLR 2025, github
HiZOO is a diagonal-Hessian-informed zeroth-order optimizer and the first work to leverage the
diagonal Hessian to enhance zeroth-order optimization for fine-tuning LLMs.
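As a rough illustration of the idea, here is a minimal NumPy sketch of a Hessian-informed zeroth-order step: a two-point finite-difference gradient estimate whose probe direction is rescaled by a running diagonal-Hessian estimate. The function names, the EMA-style Hessian update, and the hyper-parameters are assumptions for the sketch, not the paper's exact formulas.

```python
import numpy as np

def hizoo_style_step(loss_fn, theta, sigma_diag, lr=1e-3, mu=1e-3, beta=0.1):
    """One illustrative Hessian-informed zeroth-order update (sketch only)."""
    z = np.random.randn(*theta.shape)                  # Gaussian probe direction
    u = z / np.sqrt(sigma_diag)                        # precondition probe with the diagonal Hessian estimate
    loss_plus = loss_fn(theta + mu * u)
    loss_minus = loss_fn(theta - mu * u)
    loss_zero = loss_fn(theta)
    g_hat = (loss_plus - loss_minus) / (2 * mu) * u    # two-point gradient estimate along u
    # crude running estimate of the diagonal Hessian from the same probe losses (illustrative rule)
    h_probe = np.abs(loss_plus + loss_minus - 2 * loss_zero) / mu**2 * z * z
    sigma_diag = (1 - beta) * sigma_diag + beta * np.maximum(h_probe, 1e-8)
    theta = theta - lr * g_hat                         # descend along the preconditioned estimate
    return theta, sigma_diag

# toy usage on a quadratic loss
theta, sigma = np.zeros(10), np.ones(10)
theta, sigma = hizoo_style_step(lambda w: float(np.sum((w - 1.0) ** 2)), theta, sigma)
```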
|
A Differentiable Sparse Vector Quantization (SVQ) for Spatio-Temporal
Forecasting
Chao Chen*,
Tian Zhou*,
Yanjun Zhao,
Liang Sun,
Qian Yi,
Rong Jin
KDD 2025, github
SVQ leverages sparse regression for succinct representation, which is theoretically
and practically favored over classical clustering-based vector quantization methods.
|
GCformer: An Efficient Framework for Accurate and Scalable Long-Term Multivariate Time Series Forecasting
Yanjun Zhao*,
Ziqing Ma*,
Tian Zhou*,
Liang Sun,
Mengni Ye, Qian Yi
CIKM 2023, github
GCformer combines a structured global convolutional branch for processing long input sequences with a local Transformer-based branch for
capturing short, recent signals.
|
SABER: Switchable and Balanced Training for Efficient LLM Reasoning
Kai Zhao*,
Yanjun Zhao*, Jiaming Song, Shien He,
Lusheng Zhang, Qiang Zhang, Tianjiao Li
arXiv, 2025
We propose SABER (Switchable and Balanced Training for Efficient LLM Reasoning), a reinforcement learning framework
that endows LLMs with user-controllable, token-budgeted reasoning.
|
FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed
Sizhe Dang*, Yangyang Guo*,
Yanjun Zhao*,
Haishan Ye, Xiaodong Zheng, Guang Dai, Ivor Tsang
arXiv, 2025
FZOO reduces the total forward passes needed for convergence by employing batched one-sided estimates that adapt step sizes
based on the standard deviation of batch losses, while accelerating per-batch computation through Rademacher random vector perturbations.
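A minimal NumPy sketch of the flavor of such an update, assuming a generic loss function: Rademacher probes, one-sided differences against a single base loss, and a step size scaled by the spread of the perturbed-batch losses. The names, the averaging, and the exact scaling rule are illustrative assumptions, not FZOO's precise procedure.

```python
import numpy as np

def fzoo_style_step(loss_fn, theta, mu=1e-3, n_probes=8, eta=1e-2):
    """Illustrative one-sided batched zeroth-order update (sketch only)."""
    base_loss = loss_fn(theta)                                    # single unperturbed forward pass
    probes = [np.sign(np.random.randn(*theta.shape))              # Rademacher (+/-1) directions
              for _ in range(n_probes)]
    losses = np.array([loss_fn(theta + mu * u) for u in probes])  # one-sided perturbed losses
    g_hat = sum((l - base_loss) / mu * u                          # average one-sided estimates
                for l, u in zip(losses, probes)) / n_probes
    step = eta / (losses.std() + 1e-8)                            # step size adapted to the batch-loss spread
    return theta - step * g_hat
```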
|
Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting
Yanjun Zhao*,
Tian Zhou*,
Chao Chen,
Liang Sun,
Qian Yi,
Rong Jin
arXiv, 2024, github
Sparse-VQ works together with Reverse Instance Normalization (RevIN) to reduce the impact of noise
and capture sufficient statistics for forecasting, serving as an alternative
to the feed-forward network (FFN) layer in the Transformer architecture.
|