Nvidia unveils Ampere-based GPU

May 18, 2020 //By Julien Happich
Ampere architecture
The Nvidia A100 GPU, based on the company’s Ampere GPU architecture, is fabricated on TSMC’s 7nm N7 manufacturing process. The device includes more streaming multiprocessors (SMs), larger and faster memory, and higher interconnect bandwidth via third-generation NVLink, delivering massive computational throughput.

The A100’s 40 GB of high-speed HBM2 memory (five stacks) has a bandwidth of 1.6 TB/s, over 1.7x that of the V100. The 40 MB L2 cache on the A100 is almost 7x larger than that of the Tesla V100 and provides over 2x the L2 cache-read bandwidth, the company claims. Nvidia also released CUDA 11, the latest version of its Compute Unified Device Architecture parallel computing platform. CUDA 11 provides new specialized L2 cache management and residency control APIs on the A100. The SMs in the A100 include a larger and faster combined L1 cache and shared memory unit (192 KB per SM), providing 1.5x the aggregate capacity of the Volta V100 GPU.
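The quoted ratios can be sanity-checked against the corresponding Volta figures. Note that the V100 numbers used below (900 GB/s HBM2 bandwidth, 6 MB L2 cache, 128 KB combined L1/shared memory per SM) are assumptions taken from Nvidia's published Volta specifications, not from the article itself:

```python
# V100 reference figures (assumed from Nvidia's Volta specs, not the article)
V100_BANDWIDTH_GBS = 900    # HBM2 bandwidth in GB/s
V100_L2_MB = 6              # L2 cache size in MB
V100_L1_SHARED_KB = 128     # combined L1/shared memory per SM in KB

# A100 figures as stated in the article
A100_BANDWIDTH_GBS = 1600   # 1.6 TB/s
A100_L2_MB = 40
A100_L1_SHARED_KB = 192

print(f"bandwidth: {A100_BANDWIDTH_GBS / V100_BANDWIDTH_GBS:.2f}x")   # ~1.78x, "over 1.7x"
print(f"L2 cache:  {A100_L2_MB / V100_L2_MB:.2f}x")                   # ~6.67x, "almost 7x"
print(f"L1/shared: {A100_L1_SHARED_KB / V100_L1_SHARED_KB:.2f}x")     # 1.50x
```

Each ratio lands where the article's claim says it should: ~1.78x bandwidth, ~6.7x L2 capacity, and exactly 1.5x per-SM L1/shared capacity.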

The A100 GPU comes equipped with specialized hardware units including third-generation Tensor Cores, more video decoder (NVDEC) units, JPEG decoders and optical flow accelerators. All of these are used by various CUDA libraries to accelerate HPC and AI applications. The Multi-Instance GPU (MIG) feature can physically divide a single A100 GPU into multiple isolated GPU instances. It enables multiple clients such as VMs, containers, or processes to run simultaneously while providing error isolation and advanced quality of service (QoS) between them. MIG could be used to improve GPU utilization, for example by renting out separate GPU instances, running multiple inference workloads in parallel, hosting multiple Jupyter notebook sessions for model exploration, or sharing the GPU among multiple internal users in an organization (single-tenant, multi-user).
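A minimal sketch of how MIG partitioning constrains what can run on one A100. The profile names and sizes below (e.g. `1g.5gb` meaning one compute slice and 5 GB of memory, out of seven compute slices and 40 GB total) are assumptions drawn from Nvidia's MIG documentation, not from the article:

```python
# MIG profile -> (compute slices out of 7, memory in GB out of 40)
# These profile definitions are assumptions from Nvidia's MIG docs.
MIG_PROFILES = {
    "1g.5gb":  (1, 5),
    "2g.10gb": (2, 10),
    "3g.20gb": (3, 20),
    "4g.20gb": (4, 20),
    "7g.40gb": (7, 40),
}

def fits_on_a100(instances):
    """Return True if the requested MIG instances fit on a single A100."""
    compute = sum(MIG_PROFILES[name][0] for name in instances)
    memory = sum(MIG_PROFILES[name][1] for name in instances)
    return compute <= 7 and memory <= 40

print(fits_on_a100(["3g.20gb", "3g.20gb"]))  # True: 6 slices, 40 GB
print(fits_on_a100(["7g.40gb", "1g.5gb"]))   # False: needs 8 compute slices
```

The point of the sketch is the isolation model: each tenant gets a fixed share of compute slices and memory, so one instance's workload cannot starve another's, which is what enables the multi-tenant use cases described above.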
