Kun Yuan

2026

Decentralized Optimization over Time-Varying Row-Stochastic Digraphs
L. Liang, Y. Song, and K. Yuan
arXiv preprint: 2512.24483
Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees
C. Chen, Y. He, P. Li, W. Jia, and K. Yuan
IEEE Transactions on Signal Processing (IEEE TSP)
CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure
B. Kong, J. Liang, Y. Liu, R. Deng, and K. Yuan
International Conference on Learning Representations (ICLR)

2025

A Convergence-Inspired Learning-to-Optimize Framework for Decentralized Optimization
Y. He, Q. Shang, X. Huang, J. Liu, and K. Yuan
IEEE Transactions on Signal Processing (IEEE TSP)
Decentralized Bilevel Optimization: A Perspective from Transient Iteration Complexity
B. Kong, S. Zhu, S. Lu, X. Huang, and K. Yuan
Journal of Machine Learning Research (JMLR)
Revisiting Gradient Normalization and Clipping for Nonconvex SGD under Heavy-Tailed Noise: Necessity, Sufficiency, and Acceleration
T. Sun, X. Liu, and K. Yuan
Journal of Machine Learning Research (JMLR)
Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity
Q. Shi, J. Peng, K. Yuan, X. Wang, and Q. Ling
Journal of Machine Learning Research (JMLR)
Clapping: Removing Per-sample Storage for Pipeline Parallel Distributed Optimization with Communication Compression
B. Kong, X. Huang, Y. Xu, Y. Liang, B. Wang, and K. Yuan
arXiv preprint: 2509.19029
MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization
R. Hu, Y. He, R. Yan, M. Sun, B. Yuan, and K. Yuan
The Conference on Neural Information Processing Systems (NeurIPS)
MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
Y. Liu, R. Deng, Y. He, X. Wang, T. Yao, and K. Yuan
The Conference on Neural Information Processing Systems (NeurIPS)
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Z. Wu, Y. Zhang, Y. Dong, C. Zhang, C. Fang, K. Yuan, and Z. Lin
The Conference on Neural Information Processing Systems (NeurIPS)
Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees
C. Chen, Y. He, P. Li, W. Jia, and K. Yuan
arXiv preprint: 2507.08784
On the Trade-off between Flatness and Optimization in Distributed Learning
Y. Cao, Z. Wu, K. Yuan, and A. H. Sayed
IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)
On the Linear Speedup of the Push-Pull Method for Decentralized Optimization over Digraphs
L. Liang, G. Luo, and K. Yuan
arXiv preprint: 2506.18075
Subspace Optimization for Large Language Models with Convergence Guarantees
Y. He, P. Li, Y. Hu, C. Chen, and K. Yuan
International Conference on Machine Learning (ICML)
A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models
Y. Chen, Y. Zhang, Y. Liu, K. Yuan, and Z. Wen
International Conference on Machine Learning (ICML)
Distributed Retraction-Free and Communication-Efficient Optimization on the Stiefel Manifold
Y. Song, P. Li, B. Gao, and K. Yuan
International Conference on Machine Learning (ICML)
Achieving Linear Speedup and Optimal Complexity for Decentralized Optimization over Row-stochastic Networks
L. Liang, G. Luo, X. Chen, and K. Yuan
International Conference on Machine Learning (ICML) Spotlight
Efficient Multi-Objective Learning under Preference Guidance: A First-Order Penalty Approach
L. Chen, Q. Xiao, E. H. Fukuda, X. Chen, K. Yuan, and T. Chen
International Conference on Machine Learning (ICML) Spotlight
Understanding the Influence of Digraphs on Decentralized Optimization: Effective Metrics, Lower Bound, and Optimal Algorithm
L. Liang, X. Huang, R. Xin, K. Yuan
SIAM Journal on Optimization
BEVHeight++: Toward Robust Visual Centric 3D Object Detection
L. Yang, T. Tang, J. Li, K. Yuan, K. Wu, P. Chen, L. Wang, Y. Huang, L. Li, X. Zhang, K. Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Language Models
G. Chen, Y. He, Y. Hu, K. Yuan, and B. Yuan
arXiv preprint: 2502.01378
Enhancing Zeroth-Order Fine-Tuning for Language Models with Low-Rank Structures
Y. Chen, Y. Zhang, L. Cao, K. Yuan, and Z. Wen
International Conference on Learning Representations (ICLR)

2024

Heavy-Tail phenomenon in decentralized SGD
M. Gurbuzbalaban, Y. Hu, U. Simsekli, K. Yuan, and L. Zhu
IISE Transactions
SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization
S. Zhu, B. Kong, S. Lu, X. Huang, and K. Yuan
The Conference on Neural Information Processing Systems (NeurIPS)
S3 Attention: Improving Long Sequence Attention with Smoothed Skeleton Sketching
X. Wang, T. Zhou, J. Zhu, J. Liu, K. Yuan, T. Yao, W. Yin, R. Jin, H. Cai
IEEE Journal of Selected Topics in Signal Processing
Distributed Bilevel Optimization with Communication Compression
Y. He, J. Hu, X. Huang, S. Lu, B. Wang, and K. Yuan
International Conference on Machine Learning (ICML)
Asynchronous Diffusion Learning with Agent Subsampling and Local Updates
E. Rizk, K. Yuan, and A. H. Sayed
The IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Momentum Benefits Non-IID Federated Learning Simply and Provably
Z. Cheng, X. Huang, P. Wu, and K. Yuan
International Conference on Learning Representations (ICLR)
Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity
B. Kong, S. Zhu, S. Lu, X. Huang, K. Yuan
arXiv preprint: 2402.03167

2023

Sharper Convergence Guarantees for Federated Learning with Partial Model Personalization
Y. Chen, L. Cao, K. Yuan, and Z. Wen
arXiv preprint: 2309.17409
Lower Bounds and Accelerated Algorithms in Distributed Stochastic Optimization with Communication Compression
Y. He, X. Huang, Y. Chen, W. Yin, and K. Yuan
arXiv preprint: 2305.07612
An Enhanced Gradient-Tracking Bound for Distributed Online Stochastic Convex Optimization
S. A. Alghunaim and K. Yuan
Signal Processing
Unbiased Compression Saves Communication in Distributed Optimization: When and How Much?
Y. He, X. Huang, and K. Yuan
The Conference on Neural Information Processing Systems (NeurIPS)
Removing data heterogeneity influence enhances network topology dependence of decentralized SGD
K. Yuan, S. A. Alghunaim, and X. Huang
Journal of Machine Learning Research (JMLR)
Achieving Linear Speedup with Network-Independent Learning Rates in Decentralized Stochastic Optimization
H. Yuan, S. A. Alghunaim, and K. Yuan
IEEE Conference on Decision and Control (CDC)
On the Performance of Gradient Tracking with Local Updates
E. D. H. Nguyen, S. A. Alghunaim, K. Yuan, and C. A. Uribe
IEEE Conference on Decision and Control (CDC)
DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm
L. Ding, K. Jin, B. Ying, K. Yuan, and W. Yin
The International Conference on Machine Learning (ICML)
[Code]
AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation
Y.-F. Zhang, X. Wang, K. Jin, K. Yuan, Z. Zhang, L. Wang, R. Jin, and T. Tan
The International Conference on Machine Learning (ICML)
[Code]
BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection
L. Yang, K. Yu, T. Tang, J. Li, K. Yuan, L. Wang, X. Zhang, and P. Chen
The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)
[Code]

2022

Revisiting optimal convergence rate for smooth and non-convex stochastic decentralized optimization
K. Yuan, X. Huang, Y. Chen, X. Zhang, Y. Zhang, and P. Pan
The Conference on Neural Information Processing Systems (NeurIPS)
Communication-efficient topologies for decentralized learning with O(1) consensus rate
Z. Song, W. Li, K. Jin, L. Shi, M. Yan, W. Yin, and K. Yuan
The Conference on Neural Information Processing Systems (NeurIPS)
[Code] [Poster] [5-min video presentation]
Lower bounds and nearly optimal algorithms in distributed learning with communication compression
X. Huang, Y. Chen, W. Yin, and K. Yuan
The Conference on Neural Information Processing Systems (NeurIPS)
A unified and refined convergence analysis for non-convex decentralized learning
S. A. Alghunaim and K. Yuan
IEEE Transactions on Signal Processing
A Byzantine-resilient dual subgradient method for vertical federated learning
K. Yuan, Z. Wu, and Q. Ling
The IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
CHEX: Channel exploration for CNN model compression
Z. Hou, M. Qin, F. Sun, X. Ma, K. Yuan, Y. Xu, Y.-K. Chen, R. Jin, Y. Xie, and S.-Y. Kung
The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)
[Code]
Effective model sparsification by scheduled Grow-and-Prune methods
X. Ma, M. Qin, F. Sun, Z. Hou, K. Yuan, Y. Xu, Y. Wang, Y.-K. Chen, R. Jin, and Y. Xie
The International Conference on Learning Representations (ICLR)
[Code]

2021

Exponential graph is provably efficient for decentralized deep training
B. Ying, K. Yuan, Y. Chen, H. Hu, P. Pan, and W. Yin
The Conference on Neural Information Processing Systems (NeurIPS)
[Code]
Improved analysis and rates for variance reduction under without-replacement sampling orders
X. Huang, K. Yuan, X. Mao, and W. Yin
The Conference on Neural Information Processing Systems (NeurIPS)
DecentLaM: Decentralized momentum SGD for large-batch deep training
K. Yuan, Y. Chen, X. Huang, Y. Zhang, P. Pan, Y. Xu, and W. Yin
The International Conference on Computer Vision (ICCV)
Accelerating gossip SGD with periodic global averaging
Y. Chen, K. Yuan, Y. Zhang, P. Pan, Y. Xu, and W. Yin
The International Conference on Machine Learning (ICML)

2020

Multiagent fully decentralized value function learning with linear convergence rates
L. Cassano, K. Yuan, and A. H. Sayed
IEEE Transactions on Automatic Control (short version appeared in ECC 2019)
Decentralized proximal gradient algorithms with linear convergence rates
S. A. Alghunaim, E. K. Ryu, K. Yuan, and A. H. Sayed
IEEE Transactions on Automatic Control
Walkman: A communication-efficient random-walk algorithm for decentralized optimization
X. Mao, K. Yuan, Y. Hu, Y. Gu, A. H. Sayed, and W. Yin
IEEE Transactions on Signal Processing
Can primal methods outperform primal-dual methods in decentralized dynamic optimization?
K. Yuan, W. Xu, and Q. Ling
IEEE Transactions on Signal Processing (short version appeared in Asilomar 2019)
On the influence of bias-correction on distributed stochastic optimization
K. Yuan, S. A. Alghunaim, B. Ying, and A. H. Sayed
IEEE Transactions on Signal Processing (short version appeared in CDC 2019)
Variance-reduced stochastic learning under random reshuffling
B. Ying, K. Yuan, and A. H. Sayed
IEEE Transactions on Signal Processing (short version appeared in ICASSP 2018)

2019

A proximal diffusion strategy for multiagent optimization with sparse affine constraints
S. A. Alghunaim, K. Yuan, and A. H. Sayed
IEEE Transactions on Automatic Control
Dynamic average diffusion with randomized coordinate updates
B. Ying, K. Yuan, and A. H. Sayed
IEEE Transactions on Signal and Information Processing over Networks
Decentralized dynamic ADMM with quantized and censored communications
Y. Liu, K. Yuan, G. Wu, Z. Tian, and Q. Ling
Asilomar Conference on Signals, Systems, and Computers
COVER: A cluster-based variance reduced method for online learning
K. Yuan, B. Ying, and A. H. Sayed
The International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
A linearly convergent proximal gradient algorithm for decentralized optimization
S. A. Alghunaim, K. Yuan, and A. H. Sayed
The Conference on Neural Information Processing Systems (NeurIPS)

2018

Dual coupled diffusion for distributed optimization with affine constraints
S. A. Alghunaim, K. Yuan, and A. H. Sayed
The IEEE Conference on Decision and Control (CDC)
Supervised learning under distributed features
B. Ying, K. Yuan, and A. H. Sayed
IEEE Transactions on Signal Processing (short version appeared in IEEE Data Science Workshop (DSW) 2018)
Stochastic learning under random reshuffling with constant step-sizes
B. Ying, K. Yuan, S. Vlaski, and A. H. Sayed
IEEE Transactions on Signal Processing
Exact diffusion for distributed optimization and learning – Part I: Algorithm development
K. Yuan, B. Ying, X. Zhao, and A. H. Sayed
IEEE Transactions on Signal Processing (short version appeared in EUSIPCO 2017)
Exact diffusion for distributed optimization and learning – Part II: Convergence analysis
K. Yuan, B. Ying, X. Zhao, and A. H. Sayed
IEEE Transactions on Signal Processing
Variance-reduced stochastic learning by networked agents under random reshuffling
K. Yuan, B. Ying, J. Liu, and A. H. Sayed
IEEE Transactions on Signal Processing (short version appeared in EUSIPCO 2018)

2017

Decentralized exact coupled optimization
S. A. Alghunaim, K. Yuan, and A. H. Sayed
Allerton Conference on Communication, Control, and Computing (Allerton)
Decentralized consensus optimization with asynchrony and delays
T. Wu, K. Yuan, Q. Ling, W. Yin, and A. H. Sayed
IEEE Transactions on Signal and Information Processing over Networks
On the performance of random reshuffling in stochastic learning
B. Ying, K. Yuan, S. Vlaski, and A. H. Sayed
Information Theory and Applications Workshop (ITA)

2016

Stochastic gradient descent with finite samples sizes
K. Yuan, B. Ying, S. Vlaski, and A. H. Sayed
IEEE Workshop on Machine Learning for Signal Processing (MLSP)
Online dual coordinate ascent learning
B. Ying, K. Yuan, and A. H. Sayed
European Signal Processing Conference (EUSIPCO)
On the influence of momentum acceleration on online learning
K. Yuan, B. Ying, and A. H. Sayed
Journal of Machine Learning Research
On the convergence of decentralized gradient descent
K. Yuan, Q. Ling, and W. Yin
SIAM Journal on Optimization
ICCM Distinguished Paper Award
Highly Cited Paper in SIAM Journal on Optimization

2015

A decentralised linear programming approach to energy-efficient event detection
K. Yuan, Q. Ling, and Z. Tian
International Journal of Sensor Networks

2014

Communication-efficient decentralized event monitoring in wireless sensor networks
K. Yuan, Q. Ling, and Z. Tian
IEEE Transactions on Parallel and Distributed Systems
On the linear convergence of the ADMM in decentralized consensus optimization
W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin
IEEE Transactions on Signal Processing
IEEE Signal Processing Society Young Author Best Paper Award [Awardee list] [News]
Highly Cited Paper in IEEE Transactions on Signal Processing

2013

A linearized Bregman algorithm for decentralized basis pursuit
K. Yuan, Q. Ling, W. Yin, and A. Ribeiro
European Signal Processing Conference (EUSIPCO)
Linearly convergent decentralized consensus optimization with the alternating direction method of multipliers
W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin
The IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)