Best GPU for deep learning in 2020-2021: RTX 3090 vs. RTX 3080 vs. TITAN RTX vs. RTX 2080 Ti benchmarks (FP32, FP16)

November 11, 2020

Introduction

In this post, we are comparing the most popular graphics cards for deep learning in 2020–2021: NVIDIA RTX 3090, RTX 3080, RTX 2080 Ti, and Titan RTX.

Methodology

  • We used the standard "tf_cnn_benchmarks.py" benchmark script from the official TensorFlow "benchmarks" GitHub repository.
  • We ran tests on the following networks: ResNet-50, ResNet-152, Inception v3, Inception v4, VGG-16.
  • We compared FP16 to FP32 performance and used standard batch sizes (64 in most cases).
  • We compared GPU scaling (1 GPU vs. 4 GPUs; the RTX 30-series cards were tested in up to 2-GPU configurations).
  • Please note: to get accurate results, it is important to have the same driver and framework versions installed on the workstation.
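A representative invocation looks like the following. The exact flag values here (model, batch size, GPU count) are illustrative; "tf_cnn_benchmarks.py" lives in the scripts/tf_cnn_benchmarks directory of the tensorflow/benchmarks repository:

```shell
# Clone the TensorFlow benchmarks repository (contains tf_cnn_benchmarks.py)
git clone https://github.com/tensorflow/benchmarks.git
cd benchmarks/scripts/tf_cnn_benchmarks

# FP32 run: ResNet-50, batch size 64, single GPU
python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50

# FP16 run on 2 GPUs (e.g., a pair of RTX 3090s)
python tf_cnn_benchmarks.py --num_gpus=2 --batch_size=64 --model=resnet50 --use_fp16
```

The script reports throughput in images/sec, which is the metric used in the charts below.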

Hardware

Test bench:
BIZON G3000 (1-4x GPU deep learning desktop)
More details: https://bizon-tech.com/bizon-g3000.html

Tech specs:
  • CPU: Intel Core i9-10980XE 18-Core 3.00GHz
  • Overclocking: Stage #3 +600 MHz (up to +30% performance)
  • Cooling: Liquid Cooling System (CPU; extra stability and low noise)
  • Memory: 256 GB (6 x 32 GB) DDR4 3000 MHz
  • Operating System: BIZON Z-Stack (Ubuntu 20.04 (Focal) with preinstalled deep learning frameworks)
  • Storage: 1 TB PCIe SSD
  • Network: 10 Gbit

BIZON Z5000 (liquid cooled deep learning and GPU rendering workstation PC)
More details: https://bizon-tech.com/bizon-z5000.html
Tech specs:
  • CPU: Intel Core i9-10980XE 18-Core 3.00GHz
  • Overclocking: Stage #3 +600 MHz (up to +30% performance)
  • Cooling: Custom water-cooling system (CPU + GPUs)
  • Memory: 256 GB (6 x 32 GB) DDR4 3000 MHz
  • Operating System: BIZON Z-Stack (Ubuntu 20.04 (Focal) with preinstalled deep learning frameworks)
  • Storage: 1 TB PCIe SSD
  • Network: 10 Gbit

Software

Deep Learning Models:
  • ResNet-50
  • ResNet-152
  • Inception v3
  • Inception v4
  • VGG-16

Drivers and Batch Size:
  • NVIDIA driver: 455
  • CUDA: 11.1
  • TensorFlow: 1.x
  • Batch size: 64

Benchmarks



Note: the RTX 3090 and RTX 3080 were tested in a 2-GPU configuration.
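
One useful way to read multi-GPU results like these is scaling efficiency: measured throughput divided by ideal linear scaling. A minimal sketch (the images/sec values are hypothetical placeholders, not our measurements):

```python
# Compute multi-GPU scaling efficiency from benchmark throughput numbers.
def scaling_efficiency(single_gpu_ips: float, multi_gpu_ips: float,
                       num_gpus: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 = perfect)."""
    return multi_gpu_ips / (single_gpu_ips * num_gpus)

# Hypothetical example: 500 images/sec on one GPU, 950 images/sec on two.
eff = scaling_efficiency(single_gpu_ips=500.0, multi_gpu_ips=950.0, num_gpus=2)
print(f"Scaling efficiency: {eff:.0%}")  # 950 / (500 * 2) = 95%
```

Efficiency below 100% is normal: gradient synchronization and PCIe/NVLink bandwidth add overhead as GPU count grows.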


Conclusion

Winner: NVIDIA RTX 3090, 24 GB
Price: $1500

Academic discounts are available.
Notes: Water cooling required for 4 x RTX 3090 configurations.
The RTX 3090 has the best of both worlds: excellent performance and price.
When used as a pair with an NVLink bridge, you effectively have 48 GB of memory to train large models.
The main problem with the RTX 3090 is cooling. Since the RTX 3090 does not have blower-style fans, it will quickly reach 90°C and throttle (the GPU enters thermal-throttling mode and performance drops sharply).
We have seen up to a 60% (!) performance drop due to overheating.
Liquid cooling is the best solution: it provides 24/7 stability, low noise, greater hardware longevity, and maximum performance out of the RTX 3090.
In our tests, a water-cooled RTX 3090 stays within a safe range of 50-60°C, versus 90°C when air-cooled (90°C is the red zone where the GPU throttles and may shut down).
Noise is another important point. Four air-cooled GPUs are quite noisy, with their fans running at full speed under load.
Keeping such a workstation in a lab or office is impractical, let alone a server: the noise level is so high that it is almost impossible to hold a conversation while they are running.
Liquid cooling resolves the noise issue in both desktops and servers.
Noise is roughly 20% lower than with air cooling (49 dB for water cooling vs. 62 dB for air cooling at maximum load), so a workstation, or even a server, with massive computing power can sit in an office or lab.
BIZON designed an enterprise-class custom liquid-cooling system for servers and workstations.
Recommended models:
We offer desktops and servers with RTX 3090.

RTX 3080, 10 GB
Price: $700

Academic discounts are available.
The RTX 3080 is an excellent GPU for deep learning and offers the best performance/price ratio.
The main limitation is the VRAM size: training on an RTX 3080 will require small batch sizes, and in some cases you will not be able to train large models at all.
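
To see why 10 GB constrains batch size, here is a rough back-of-envelope memory estimate. All numbers below are illustrative assumptions (parameter count, per-sample activation footprint), not measured values:

```python
# Rough estimate of training memory vs. batch size.
# The parameter count and activation size used below are assumptions
# for illustration only, not measured values.
def training_memory_gb(params_m: float, act_mb_per_sample: float,
                       batch_size: int, bytes_per_value: int = 4) -> float:
    """Estimate GPU memory (GB): weights + gradients + optimizer state + activations."""
    # Weights, gradients, and (e.g., momentum) optimizer state: ~3 copies of the parameters
    param_bytes = params_m * 1e6 * bytes_per_value * 3
    # Activations scale linearly with batch size (normalized to 4-byte FP32)
    act_bytes = act_mb_per_sample * 1e6 * batch_size * (bytes_per_value / 4)
    return (param_bytes + act_bytes) / 1e9

# e.g., a ResNet-152-sized model (~60M params) with an assumed
# ~100 MB of FP32 activations per sample:
for bs in (16, 32, 64):
    print(f"batch {bs:2d}: ~{training_memory_gb(60, 100, bs):.1f} GB")
```

Under these assumptions, batch size 64 already approaches the RTX 3080's 10 GB, while it fits comfortably in the 24 GB of an RTX 3090 or TITAN RTX.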
Recommended models:
We offer desktops and servers with RTX 3080.


You can find more NVIDIA RTX 3080 vs RTX 3090 GPU Deep Learning Benchmarks here.


NVIDIA TITAN RTX, 24 GB
Price: $2500

Academic discounts are available.
Notes: Water cooling is required for 4x TITAN RTX configurations; with air cooling, the TITAN RTX overheats almost immediately and throttles.
The TITAN RTX has the best of both worlds: excellent performance and price. As the benchmarks show, its performance almost matches the Quadro RTX 8000 in most cases, at half the cost.
When used as a pair with an NVLink bridge, you effectively have 48 GB of memory to train large models.
The main problem with the TITAN RTX is cooling. Since the Titans are not equipped with blower-style fans, they will quickly reach 90°C and throttle (the GPU enters thermal-throttling mode and performance drops sharply).
We have seen up to a 60% (!) performance drop due to overheating.
Liquid cooling is the best solution for this issue: it provides 24/7 stability, low noise, longer component life, and maximum performance out of the TITAN RTX. In our tests, a water-cooled TITAN RTX stays within a safe range of 50-60°C, versus 90°C when air-cooled (90°C is the red zone where the GPU throttles and may shut down).
Noise is another important point. Four air-cooled GPUs are quite noisy, with their fans running at full speed under load. Keeping such a workstation in a lab or office is impractical, let alone a server: the noise level is so high that it is almost impossible to hold a conversation while they are running.
Water cooling solves the noise problem for desktops and servers. Noise is roughly 20% lower than with air cooling (49 dB for water cooling vs. 62 dB for air cooling at maximum load). You can easily keep a workstation, or even a server, with such massive computing power in an office or lab.
BIZON designed an enterprise-class custom liquid-cooling system for servers and workstations.
Recommended models:
We offer water-cooled 4x GPU desktops and 10x GPU servers with the TITAN RTX.

RTX 2080 Ti, 11 GB (Blower Model)
Price: $1200

The RTX 2080 Ti is an excellent GPU for deep learning, offering a fantastic performance/price ratio.
The main limitation is the VRAM size. Training on an RTX 2080 Ti will require small batch sizes, and in some cases you will not be able to train large models. Unlike the TITAN or Quadro cards, NVLink will not combine the VRAM of multiple RTX 2080 Ti GPUs.
Recommended models:

Quadro RTX 8000, 48 GB
Price: $5500

Academic discounts are available.
Performance is close to the TITAN RTX. It is recommended for extra-large models, since it comes with 48 GB of VRAM; when used in a pair with NVLink, you get 96 GB of GPU memory. The main disadvantage is the price.
Recommended models:

Overall Recommendations

For most users, the RTX 3090 or the RTX 3080 will provide the best bang for the buck. The only limitation of the RTX 3080 is its 10 GB of VRAM. Working with a large batch size allows models to train faster and more accurately, saving a lot of time; this is only possible with the Quadro GPUs, the TITAN RTX, and the RTX 3090.

Using FP16 allows models to fit in GPUs with insufficient VRAM. In charts #1 and #2, the RTX 3080 cannot fit ResNet-152 and Inception v4 in FP32; once we switch to FP16, they fit perfectly. The 24 GB of VRAM on the RTX 3090 is more than enough for most use cases: you can fit any of these models and use large batch sizes.
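
A quick illustration of why FP16 helps a model fit: half precision halves the bytes per value, for weights and activations alike. The parameter counts below are approximate published figures for each architecture:

```python
# Approximate model-weight memory in FP32 vs. FP16.
# Parameter counts (in millions) are approximate published figures.
PARAMS_M = {"VGG-16": 138, "ResNet-152": 60, "Inception v4": 43}

def weight_memory_mb(params_millions: float, bytes_per_value: int) -> float:
    """Memory needed just to store the weights, in megabytes."""
    return params_millions * 1e6 * bytes_per_value / 1e6

for name, p in PARAMS_M.items():
    fp32 = weight_memory_mb(p, 4)  # float32: 4 bytes per weight
    fp16 = weight_memory_mb(p, 2)  # float16: 2 bytes per weight
    print(f"{name:12s} FP32 ~{fp32:.0f} MB   FP16 ~{fp16:.0f} MB")
```

Weights are only part of the footprint; activations and optimizer state usually dominate during training, but the same halving applies to activations as well, which is why FP16 lets the 10 GB RTX 3080 fit models that overflow it in FP32.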