BlueFog: A Decentralized Framework for Optimization and Deep Learning
Decentralized optimization algorithms are low-communication-overhead alternatives to traditional distributed algorithms that rely on a central node to compute a global average. However, the lack of an easy-to-use, efficient software package has kept most decentralized algorithms merely on paper. BlueFog is the first Python library for straightforward, high-performance implementations of diverse decentralized algorithms.
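The key idea above can be illustrated with a toy sketch (plain NumPy, not BlueFog's API): instead of a central node computing one global average, each worker repeatedly averages only with its graph neighbors, and the local values still converge to the global mean. The ring topology and the 1/3 mixing weights here are illustrative choices.

```python
import numpy as np

# Toy illustration of decentralized averaging (not BlueFog's API):
# 4 workers on a ring, each mixing its local value with its two
# neighbors using weight 1/3, instead of a global all-reduce.
values = np.array([1.0, 5.0, 9.0, 13.0])  # one local value per worker
n = len(values)

# Doubly stochastic mixing matrix for a ring: self + left + right neighbor.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0

# Repeated neighbor-only averaging converges to the global mean (7.0 here),
# while each step communicates only with two neighbors.
x = values.copy()
for _ in range(50):
    x = W @ x

print(np.round(x, 4))  # every worker approaches the global average 7.0
```

Because each worker talks to a constant number of neighbors per step, the per-iteration communication cost stays flat as the number of workers grows, which is the source of the low overhead mentioned above.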
The charts below show BlueFog's performance on the ResNet-50 benchmark. Each machine has 8 V100 GPUs (64GB memory) with NVLink enabled, and the inter-machine communication speed is 25 Gbps; the same hardware setup is available on AWS clusters. We test scaling efficiency with a batch size of 64 for a computation-intensive scenario and a batch size of 32 for a communication-intensive scenario.
In the figures, the black box represents ideal linear scaling. BlueFog achieves over 95% scaling efficiency, while Horovod (a state-of-the-art distributed deep learning training framework built by the Uber AI team) reaches around 66% scaling efficiency with batch size 64 on 128 GPUs. For the communication-intensive scenario with batch size 32, the scaling-efficiency gap between BlueFog and Horovod becomes even larger. For more details about the BlueFog benchmark, check out the performance page.
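For readers unfamiliar with the metric quoted above, scaling efficiency compares measured aggregate throughput against ideal linear scaling from a single GPU. The numbers below are illustrative placeholders, not BlueFog's benchmark measurements:

```python
# Scaling efficiency = measured aggregate throughput / ideal linear throughput.
# All numbers here are hypothetical, chosen only to illustrate the formula.
single_gpu_imgs_per_sec = 300.0   # hypothetical ResNet-50 throughput on 1 GPU
num_gpus = 128
measured_imgs_per_sec = 36_480.0  # hypothetical measured aggregate throughput

ideal = single_gpu_imgs_per_sec * num_gpus  # ideal linear scaling: 38,400 img/s
efficiency = measured_imgs_per_sec / ideal
print(f"scaling efficiency: {efficiency:.1%}")  # prints "scaling efficiency: 95.0%"
```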
Code
BlueFog is an open-source library hosted on GitHub. Jupyter notebook tutorials are available here.
Papers
The implemented algorithms and system design in BlueFog can be found in the following papers:
-
Bluefog: Make decentralized algorithms practical for optimization and deep learning
B. Ying, K. Yuan, H. Hu, Y. Chen, and W. Yin
-
Communicate then adapt: An effective decentralized adaptive method for deep training
B. Ying, K. Yuan, Y. Chen, H. Hu, Y. Zhang, P. Pan, and W. Yin
-
Exponential graph is provably efficient for decentralized deep training
B. Ying, K. Yuan, Y. Chen, H. Hu, P. Pan, and W. Yin. NeurIPS 2021
-
DecentLaM: Decentralized momentum SGD for large-batch deep training
K. Yuan, Y. Chen, X. Huang, Y. Zhang, P. Pan, Y. Xu, and W. Yin. ICCV 2021
-
On the influence of bias-correction on distributed stochastic optimization
K. Yuan, S. A. Alghunaim, B. Ying, and A. H. Sayed. IEEE Transactions on Signal Processing, 2021
Talks
The introduction and usage of BlueFog are presented in the following talks: