Last night, during the GPU Technology Conference, NVIDIA CEO Jen-Hsun Huang announced the very first GPU based on the Volta architecture, the Tesla V100. According to him, the new chip is the most advanced accelerator ever built, packing 5120 CUDA cores and over 21 billion transistors. The chip also features 16GB of HBM2 memory delivering 900 GB/s of bandwidth.
The Volta GV100 GPU is built on TSMC's 12nm FFN process, contains over 21 billion transistors and is designed for deep learning applications. The full GV100 chip packs 84 SMs (of which 80 are enabled on the Tesla V100), each with 128KB of combined L1 data cache and shared memory that can be configured in different ratios. By comparison, the GP100 featured 60 SMs and a total of 3840 CUDA cores, so the new Volta chip should deliver significantly more compute performance.
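The SM and core counts above can be cross-checked with a short sketch, assuming 64 FP32 CUDA cores per SM (the per-SM core count on both Pascal GP100 and Volta):

```python
# Cross-check of the SM / CUDA-core counts quoted in the article,
# assuming 64 FP32 cores per SM on both architectures.
cores_per_sm = 64

gv100_sms = 84                 # full GV100 chip
assert gv100_sms * cores_per_sm == 5376

v100_enabled_sms = 80          # SMs enabled on the Tesla V100 product
assert v100_enabled_sms * cores_per_sm == 5120

gp100_sms = 60                 # Pascal GP100, as stated
assert gp100_sms * cores_per_sm == 3840
```

This is also why the announced Tesla V100 is quoted at 5120 cores even though the GV100 die carries 84 SMs: four SMs are disabled on the shipping part.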
In order to improve FP32 and FP64 performance, NVIDIA has equipped the GV100 with a new SM design, which the company claims is 50 percent more energy efficient than the one used in the Pascal architecture. In addition, Volta introduces new "Tensor Cores", which should deliver up to 12 times higher TFLOPS in deep learning applications.
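NVIDIA describes each Tensor Core as performing a 4x4 matrix multiply-accumulate, D = A x B + C, with half-precision inputs accumulated at higher precision. A minimal pure-Python sketch of that operation (the function name and matrices here are illustrative, not part of any NVIDIA API):

```python
# Illustrative sketch of the per-operation work of a Volta Tensor Core:
# a 4x4 matrix multiply-accumulate, D = A x B + C. In hardware A and B
# are FP16 while the accumulation happens at higher precision; plain
# Python floats are used here purely to show the structure.

def tensor_core_fma(a, b, c):
    """Return the 4x4 matrix a*b + c (multiply-accumulate)."""
    n = 4
    return [[sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j]
             for j in range(n)] for i in range(n)]

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
m = [[float(i * 4 + j) for j in range(4)] for i in range(4)]

# Multiplying by the identity and adding zeros leaves the input unchanged.
assert tensor_core_fma(m, identity, zeros) == m
```

The deep learning speedup comes from executing many of these small fused matrix operations per clock rather than individual scalar FMAs.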
Like the GP100, the upcoming GV100 chip features four HBM2 memory stacks, each controlled by a pair of memory controllers. There are eight 512-bit memory controllers in total, giving the GPU a 4096-bit memory bus. Each memory controller is attached to 768KB of L2 cache, for a total of 6MB of L2 cache (up from 4MB on Pascal). This combination delivers 1.5 times the memory bandwidth of the GP100.
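The bus-width and cache figures in this paragraph follow directly from the per-controller numbers; a quick sanity check:

```python
# Sanity-check of the memory subsystem figures quoted above.
controllers = 8
controller_width_bits = 512
bus_width = controllers * controller_width_bits
assert bus_width == 4096           # total memory bus width in bits

l2_per_controller_kb = 768
total_l2_mb = controllers * l2_per_controller_kb / 1024
assert total_l2_mb == 6.0          # 6MB of L2 cache in total
```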
Source:
KitGuru