Description: Learn how to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You’ll also learn an iterative style of CUDA development that will allow you to ship accelerated applications fast.
This workshop teaches the fundamental tools and techniques for accelerating C/C++ applications to run on massively parallel GPUs with CUDA®. You’ll learn how to write code to be executed by a GPU accelerator, configure code parallelization with CUDA, and optimize memory migration between the CPU and the GPU accelerator. You’ll then apply the workflow you’ve learned to a new task: accelerating a fully functional, but CPU-only, particle simulator to achieve observable, massive performance gains. At the end of the workshop, you’ll have access to additional resources to create new GPU-accelerated applications on your own.
At the end of the workshop, participants can obtain an official certificate from the NVIDIA Deep Learning Institute.
Workflow: The workshop is held remotely. Participants work in a web browser, on AWS cloud infrastructure.
Difficulty: Basic
Language: English
Prerequisite knowledge: Basic C/C++ competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations. No previous knowledge of CUDA programming is assumed.
Target audience: HPC developers using CUDA on networked or cloud infrastructure.
Skills to be gained: At the conclusion of the workshop, you’ll have an understanding of the fundamental tools and techniques for GPU-accelerating C/C++ applications with CUDA and be able to:
– Write code to be executed by a GPU accelerator
– Expose and express data and instruction-level parallelism in C/C++ applications using CUDA
– Utilize CUDA-managed memory and optimize memory migration using asynchronous prefetching
– Leverage command-line and visual profilers to guide your work
– Utilize concurrent streams for instruction-level parallelism
– Write GPU-accelerated CUDA C/C++ applications, or refactor existing CPU-only applications, using a profile-driven approach