A versatile datacenter GPU for inference and breakthrough multi-precision performance
INFERENCE GPU based on the Turing Architecture
The NVIDIA® T4 is a universal deep learning accelerator for distributed computing environments. Based on the Turing architecture, the T4 provides state-of-the-art multi-precision performance to accelerate DL and ML training, inference and video transcoding, among other applications.
T4 Highlights
- 8.1 TFLOPS of single-precision performance
- 2560 CUDA cores
- 320 Tensor Cores
- 16GB of GDDR6 memory at 300GB/sec
- 70W power consumption
- PCIe Gen3 interface with 32GB/sec bandwidth
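As a rough sanity check, the single-precision and PCIe figures above can be reproduced with simple arithmetic. The sketch below is illustrative only. It assumes a boost clock of about 1.59 GHz, two floating-point operations per CUDA core per cycle (one fused multiply-add), and a 16-lane link counted in both directions. None of these assumptions is stated in this overview, and the function names are hypothetical.

```python
# Back-of-the-envelope check of two headline T4 figures.
# Only CUDA_CORES comes from the highlights list; the other constants
# are assumptions made for this sketch.

CUDA_CORES = 2560        # from the highlights list
BOOST_CLOCK_GHZ = 1.59   # assumption: not stated in this overview
FLOPS_PER_CYCLE = 2      # assumption: one fused multiply-add = 2 FLOPs


def peak_fp32_tflops(cores, clock_ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak single-precision throughput in TFLOPS."""
    # cores * GHz * FLOPs/cycle gives GFLOPS; divide by 1000 for TFLOPS.
    return cores * clock_ghz * flops_per_cycle / 1000.0


def pcie_gen3_bidirectional_gb_per_s(lanes=16):
    """Approximate PCIe Gen3 bandwidth in GB/sec, both directions combined."""
    gigatransfers_per_lane = 8.0     # PCIe Gen3 signalling rate per lane
    encoding_efficiency = 128 / 130  # 128b/130b line encoding
    per_direction = gigatransfers_per_lane * lanes * encoding_efficiency / 8
    return 2 * per_direction


if __name__ == "__main__":
    print(f"Peak FP32: {peak_fp32_tflops(CUDA_CORES, BOOST_CLOCK_GHZ):.2f} TFLOPS")
    print(f"PCIe Gen3 x16: {pcie_gen3_bidirectional_gb_per_s():.1f} GB/sec")
```

Under these assumptions the results land close to the quoted 8.1 TFLOPS and 32GB/sec. Both are theoretical peaks, and sustained application throughput will typically be lower.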
The T4 Value Proposition
- An ideal universal accelerator for a broad range of workloads
- Multi-precision performance for DL/ML training, inference and video transcoding
- Small form factor, 70W design that lends itself to a scale-out architecture
- Powerful RT (ray tracing) cores, combined with RTX technology, for real-time ray-traced rendering