The card is based on NVIDIA's Turing architecture.
It has 2,560 NVIDIA CUDA cores and 320 Turing Tensor Cores. Performance is listed as 8.1 TFLOPS (single-precision), 65 TFLOPS (mixed-precision FP16/FP32), 130 TOPS (INT8) and 260 TOPS (INT4).
The GPU includes 16 GB of GDDR6 memory with 300 GB/s memory bandwidth and supports ECC.
Yes. The product specification indicates ECC (error-correcting code) is supported for GPU memory.
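If you want to confirm ECC is active on an installed card, a minimal sketch using the standard `nvidia-smi` query interface (assumes the NVIDIA driver and `nvidia-smi` are installed) is:

```python
import subprocess

# Query the current and pending ECC mode via nvidia-smi
# (assumes the NVIDIA driver and a GPU are present).
result = subprocess.run(
    ["nvidia-smi", "--query-gpu=ecc.mode.current,ecc.mode.pending",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())  # e.g. "Enabled, Enabled"
```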
It is a low-profile PCIe card using an x16 PCIe Gen3 interface.
The card uses a passive thermal solution, which requires adequate chassis or rack airflow provided by the host system.
The GPU supports CUDA and is compatible with NVIDIA TensorRT and ONNX-based workflows; frameworks that target those runtimes (for example, PyTorch or TensorFlow models exported to ONNX) can use it directly.
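For example, a minimal ONNX Runtime sketch that prefers the TensorRT execution provider and falls back to CUDA might look like this (assumes the onnxruntime-gpu package is installed; "model.onnx" and the input shape are placeholders for your own model):

```python
import numpy as np
import onnxruntime as ort

# Prefer TensorRT, fall back to CUDA, then CPU.
# "model.onnx" is a placeholder path for your exported model.
session = ort.InferenceSession(
    "model.onnx",
    providers=["TensorrtExecutionProvider",
               "CUDAExecutionProvider",
               "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example shape
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```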
This Turing-based T4 is optimized for AI inference and accelerated graphics/edge AI workloads thanks to its Tensor Cores and INT8/INT4 performance. It can be used for some training tasks, but higher-end GPUs are typically preferred for large-scale model training.
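To exercise the reduced-precision paths the Tensor Cores accelerate, the ONNX Runtime TensorRT execution provider exposes precision flags; a hedged sketch follows (option names as documented by ONNX Runtime; INT8 additionally requires calibration data, which is omitted here):

```python
import onnxruntime as ort

# Enable reduced-precision engines in the TensorRT execution provider.
# trt_fp16_enable / trt_int8_enable are ONNX Runtime provider options;
# INT8 also needs a calibration table, not shown here.
trt_options = {
    "trt_fp16_enable": True,
    # "trt_int8_enable": True,  # requires calibration data
}
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=[("TensorrtExecutionProvider", trt_options),
               "CUDAExecutionProvider"],
)
print(session.get_providers())
```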
Yes, the low-profile PCIe form factor makes it suitable for many workstations and servers, but because it is passively cooled you must ensure the host provides sufficient airflow and that the motherboard has an available PCIe Gen3 x16 slot.
The specification lists a 32 GB/sec interconnect bandwidth and a PCIe Gen3 x16 interface. NVLink is not listed in the provided specification; check the vendor or NVIDIA documentation if NVLink or other multi-GPU interconnects are required.
A specific TDP/power draw is not listed in the provided spec; NVIDIA's own datasheet rates the T4 at 70 W. Because the card is passively cooled, ensure your system or server chassis supplies adequate directed airflow and that your power supply meets the overall system power requirements. Refer to the vendor datasheet for exact power specs.
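On a running system you can observe actual draw directly; a minimal sketch (assumes the NVIDIA driver and `nvidia-smi` are installed):

```python
import subprocess

# Report the instantaneous power draw and the board power limit.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,power.draw,power.limit",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(out)  # e.g. "Tesla T4, 27.5 W, 70.00 W"
```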
Yes—since it supports CUDA and TensorRT, it can be used in containerized workflows (with NVIDIA Container Toolkit) and with GPU passthrough. For vGPU or full virtualization support, verify compatibility with NVIDIA virtualization solutions and your platform's vendor documentation.
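As an illustration, launching a CUDA container with GPU access through the NVIDIA Container Toolkit could look like this sketch (assumes Docker and the toolkit are installed; the image tag is illustrative and should match your driver version):

```python
import subprocess

# Launch a CUDA base container with all GPUs exposed and verify
# visibility by running nvidia-smi inside it.
subprocess.run(
    ["docker", "run", "--rm", "--gpus", "all",
     "nvidia/cuda:12.2.0-base-ubuntu22.04",  # illustrative tag
     "nvidia-smi"],
    check=True,
)
```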
The spec lists 32 GB/sec of interconnect bandwidth for GPU-to-system connectivity, which matches the theoretical bidirectional bandwidth of a PCIe Gen3 x16 link (roughly 16 GB/s in each direction). Real-world transfer rates will be somewhat lower due to protocol overhead; consult the detailed datasheet for how this figure is measured.
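For a rough empirical check of effective host-to-device bandwidth, you can time a pinned-memory transfer; a sketch assuming PyTorch built with CUDA:

```python
import time
import torch

# Rough host-to-device bandwidth estimate using a pinned-memory copy
# (assumes PyTorch with CUDA support and an NVIDIA GPU present).
size_bytes = 1 << 28  # 256 MiB
host = torch.empty(size_bytes, dtype=torch.uint8, pin_memory=True)
device = torch.empty(size_bytes, dtype=torch.uint8, device="cuda")

torch.cuda.synchronize()
start = time.perf_counter()
device.copy_(host, non_blocking=True)
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"H2D bandwidth: {size_bytes / elapsed / 1e9:.1f} GB/s")
```

Expect the measured number to fall below the theoretical per-direction figure because of protocol and driver overhead.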
Install NVIDIA's appropriate GPU drivers and the CUDA Toolkit for compute workloads. For inference and model deployment, install TensorRT and an ONNX runtime as needed. Always match driver and toolkit versions to your OS and framework requirements.
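Once everything is installed, a quick sanity check that the stack is visible from Python (assumes PyTorch and onnxruntime-gpu are installed; drop whichever check you don't need):

```python
import torch
import onnxruntime as ort

# Confirm the driver/CUDA stack is usable from the frameworks you installed.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
print("ONNX Runtime providers:", ort.get_available_providers())
```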
Warranty, availability, and purchase terms vary by reseller and region. Check the seller or manufacturer's product page and datasheet for warranty information, supported SKUs, and authorized distributors.