Practical Considerations for the End User

By Michael Catalano 

The NVIDIA DGX-1 is a state-of-the-art integrated system for deep learning and AI development, offering dramatic acceleration of deep learning algorithms over CPU-based hardware. This white paper presents best practices for getting the most out of the DGX-1, including how training efficiency depends on batch size, input image size, and model complexity.


Abstract

The NVIDIA DGX-1 is a state-of-the-art integrated system for deep learning and AI development. With eight interconnected NVIDIA Tesla V100 GPUs, the DGX-1 offers dramatic acceleration of deep learning algorithms over CPU-based hardware. In this paper, we highlight a few best practices that enable the DGX-1 end user to fully capitalize on its industry-leading performance. Benchmark testing was conducted with a common GPU workload, convolutional neural network (CNN) training, using the Keras deep learning API. We first examined the dependence of training efficiency on three factors: batch size, input image size, and model complexity. Next, we assessed the scalability of training speed using a multi-tower, data-parallel approach. Finally, we demonstrated the importance of scaling the learning rate when employing multiple GPU workers.
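
To make the multi-GPU approach described above concrete, the sketch below shows one way to set up data-parallel Keras CNN training in which the learning rate is scaled linearly with the number of GPU workers. It is an illustrative example only, not the benchmark code used in this paper: it relies on TensorFlow's tf.distribute.MirroredStrategy rather than an explicit multi-tower implementation, and the model architecture, batch size, and learning rate are placeholder values.

# Illustrative sketch (not this paper's benchmark code): data-parallel
# Keras CNN training with the learning rate scaled by the number of GPUs.
import tensorflow as tf

# MirroredStrategy replicates the model on each visible GPU and splits
# every batch across the replicas (data parallelism).
strategy = tf.distribute.MirroredStrategy()
num_gpus = strategy.num_replicas_in_sync

# Hypothetical settings for illustration only.
base_batch_size = 64          # per-GPU batch size
base_learning_rate = 0.01     # learning rate tuned for a single GPU
global_batch_size = base_batch_size * num_gpus
scaled_learning_rate = base_learning_rate * num_gpus  # linear scaling rule

with strategy.scope():
    # A small CNN stand-in; the models benchmarked in the paper varied in complexity.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=scaled_learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

# train_dataset is a placeholder tf.data.Dataset; batch it at the global batch size:
# model.fit(train_dataset.batch(global_batch_size), epochs=10)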
