Best GPU for deep learning in 2021: RTX 3090 vs. RTX 3080 benchmarks (FP32, FP16)

November 11, 2020


The graphics cards in NVIDIA's newest release have become the most popular and sought-after GPUs for deep learning in 2021. The 30-series cards are an enormous upgrade over NVIDIA's 20-series, released in 2018. Using deep learning benchmarks, we compare the performance of NVIDIA's RTX 3090, RTX 3080, and RTX 3070.



  • We used TensorFlow's standard benchmark script from the official GitHub repository.
  • We ran tests on the following networks: ResNet-50, ResNet-152, Inception v3, Inception v4, VGG-16.
  • We compared FP16 to FP32 performance and used standard batch sizes (64, in most cases).
  • We compared GPU scaling on all 30-series GPUs using up to 2x GPUs and on the A6000 using up to 4x GPUs!
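As a rough illustration (not part of the benchmark script itself), multi-GPU scaling results like these can be summarized as a single scaling-efficiency figure, the fraction of ideal linear speedup actually achieved:

```python
# Illustrative sketch: multi-GPU scaling efficiency from measured throughputs.
# The images/sec numbers below are hypothetical, not taken from the article.
def scaling_efficiency(single_gpu_ips: float, multi_gpu_ips: float, num_gpus: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 = perfect scaling)."""
    return multi_gpu_ips / (single_gpu_ips * num_gpus)

# Hypothetical example: 400 images/sec on one GPU, 760 images/sec on two.
eff = scaling_efficiency(single_gpu_ips=400.0, multi_gpu_ips=760.0, num_gpus=2)
print(f"{eff:.0%}")  # 95% of ideal 2x scaling
```

Values below 1.0 reflect the communication and synchronization overhead of multi-GPU training.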
To compare benchmark data from multiple workstations accurately, we kept the environment consistent by installing the same driver and framework versions on each machine. A controlled environment is essential for valid, comparable data.



Test Bench:

BIZON G3000 (1-4x GPU deep learning desktop)
Tech specs:
  • CPU: Intel Core i9-10980XE 18-Core 3.00GHz
  • Overclocking: Stage #3 +600 MHz (up to +30% performance)
  • Cooling: Liquid Cooling System (CPU; extra stability and low noise)
  • Memory: 256 GB (8 x 32 GB) DDR4 3200 MHz
  • Operating System: BIZON Z-Stack (Ubuntu 20.04 (Focal) with preinstalled deep learning frameworks)
  • Network: 10 GBIT
BIZON Z5000 (liquid cooled deep learning and GPU rendering workstation PC)
Tech specs:
  • CPU: Intel Core i9-10980XE 18-Core 3.00GHz
  • Overclocking: Stage #3 +600 MHz (up to + 30% performance)
  • Cooling: Custom water-cooling system (CPU + GPUs)
  • Memory: 256 GB (8 x 32 GB) DDR4 3200 MHz
  • Operating System: BIZON Z-Stack (Ubuntu 20.04 (Focal) with preinstalled deep learning frameworks)
  • Network: 10 GBIT



Deep Learning Models:
  • ResNet-50
  • ResNet-152
  • Inception v3
  • Inception v4
  • VGG-16
Drivers and Batch Size:
  • Nvidia Driver: 455
  • CUDA: 11.1
  • TensorFlow: 1.x
  • Batch size: 64
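To keep results comparable, every workstation must report an identical software stack. A simple check like the following sketch (illustrative only; the version strings are the ones listed above) can verify this before a benchmark run:

```python
# Illustrative sketch: verify every workstation reports the exact driver and
# framework versions used for this benchmark. Version strings match the
# "Drivers and Batch Size" list above.
REQUIRED = {"nvidia_driver": "455", "cuda": "11.1", "tensorflow": "1.x"}

def environments_match(workstations: list) -> bool:
    """True only if every workstation reports exactly the required versions."""
    return all(ws == REQUIRED for ws in workstations)

# Two identical machines pass the check:
print(environments_match([dict(REQUIRED), dict(REQUIRED)]))  # True
```

In practice the per-machine dictionaries would be populated from tools such as `nvidia-smi`, but the comparison logic is the same.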




Note: Due to their 2.5-slot design, RTX 30-series GPUs can only be tested in 2-GPU configurations when air-cooled. Water cooling is required for 4-GPU configurations.


Our Recommendation: NVIDIA RTX 3090, 24 GB
Price: $1500

Academic discounts are available.
Note: Water cooling is required for 4x RTX 3090 configurations.
The RTX 3090 offers the best of both worlds: excellent performance and price. It is the only 30-series model capable of scaling with an NVLink bridge; a bridged pair effectively provides 48 GB of memory for training large models. One problem you may encounter with the RTX 3090 is cooling, mainly in multi-GPU configurations. With its massive 350 W TDP and no blower-style fans, the card quickly hits thermal throttling and then shuts off at 90°C.
We have seen an up to 60% (!) performance drop due to overheating.
Liquid cooling is the best solution, providing 24/7 stability, low noise, and greater hardware longevity, and a water-cooled GPU can sustain its maximum performance. In our tests, a water-cooled RTX 3090 stays within a safe range of 50-60°C, versus 90°C when air-cooled (90°C is the red zone where the GPU throttles and shuts down).
Noise is another important point. Two or four air-cooled GPUs are quite noisy, especially with blower-style fans; keeping such a workstation in a lab or office is impractical, and servers are worse. The noise level makes it nearly impossible to carry on a conversation while they are running, and without hearing protection it may be too high for some to bear. Liquid cooling resolves this issue in both desktops and servers: noise is about 20% lower than air cooling (49 dB for liquid vs. 62 dB for air at maximum load), so a workstation or even a server with this much computing power can sit in an office or lab. BIZON has designed an enterprise-class custom liquid-cooling system for servers and workstations.
The RTX 3080 is an excellent GPU for deep learning and offers the best performance/price ratio. Its main limitation is VRAM size: training on the RTX 3080 requires small batch sizes, so users with larger models may not be able to train them.
The RTX 3070 is a good GPU for deep learning and the best option for smaller budgets. Like the 3080, its main limitation is VRAM size, and training on the RTX 3070 requires even smaller batch sizes.
Recommended models: We offer desktops and servers with the RTX 3070. You can find more NVIDIA RTX 3080 vs. RTX 3090 deep learning benchmarks here.


Overall Recommendations

For most users, the RTX 3090 or the RTX 3080 will provide the best bang for their buck. The only limitation of the 3080 is its 10 GB of VRAM. Working with a large batch size lets models train faster and more accurately, saving a lot of time; in the latest generation, this is only possible with the A6000 or RTX 3090. Using FP16 allows models to fit in GPUs with insufficient VRAM: in charts #3 and #4, the RTX 3080 cannot fit ResNet-152 and Inception v4 using FP32, but switching to FP16 lets the models fit. The 24 GB of VRAM on the RTX 3090 is more than enough for most use cases, leaving room for almost any model and large batch sizes.
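As a back-of-the-envelope illustration (the tensor shape is an assumption, not taken from our benchmarks), FP16 halves a tensor's memory footprint simply because each element shrinks from 4 bytes to 2:

```python
# Illustrative sketch: memory for a batch of input images at different
# precisions. Shape assumed: 64 RGB images at 224x224 (typical for ImageNet
# models like ResNet); the same 2x ratio holds for weights and activations.
def batch_input_bytes(batch: int, channels: int, height: int, width: int,
                      bytes_per_elem: int) -> int:
    """Raw bytes needed to hold one batch of input tensors."""
    return batch * channels * height * width * bytes_per_elem

fp32 = batch_input_bytes(64, 3, 224, 224, 4)  # float32: 4 bytes/element
fp16 = batch_input_bytes(64, 3, 224, 224, 2)  # float16: 2 bytes/element
print(fp32 // fp16)  # 2 -> FP16 halves the footprint
```

Actual VRAM usage also includes weights, activations, and framework overhead, but the 2x element-size saving is why FP16 lets larger models and batches fit on the same card.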