NVIDIA GeForce RTX 4090 “Ada Lovelace” Graphics Card Launched – 16384 Cores, 24 GB GDDR6X, 4X Faster Than RTX 3090 at $1599 US

NVIDIA GeForce RTX 4090 is here as the next-gen BFGPU, offering earth-shattering performance that takes gaming to the next level. The GeForce RTX 4090 isn’t just a GPU, it’s the green team’s flagship, and it offers up to four times the performance of its predecessor.

NVIDIA Takes Gaming To The Next Level With Its Next-Gen GeForce RTX 4090 BFGPU, Infused With The Fastest GPU On The Planet

NVIDIA’s GeForce RTX 4090 has been long-awaited and now it’s finally here. The graphics card is designed for enthusiasts and gamers who want the best visual fidelity, and achieving that takes a powerful GPU considering how demanding next-gen AAA titles have become. Keeping that in mind, NVIDIA didn’t settle for a chip that is a few percent or even 50% faster than the last gen. It is pitching the RTX 4090 as up to 4x faster with DLSS and 2x faster at native resolution, so that its graphics cards are ready not just for upcoming titles but also for demanding features such as ray tracing.

NVIDIA’s AD102 ‘Ada Lovelace’ GPU – The Next-Gen Powerhouse

At the heart of the NVIDIA GeForce RTX 4090 graphics card lies the Ada Lovelace AD102 GPU. The GPU is said to measure around 600 mm² and utilizes the TSMC 4N process node, an optimized version of TSMC’s 5nm (N5) node designed for the green team. The GPU features an insane 76 billion transistors.

The NVIDIA Ada Lovelace AD102 GPU is expected to feature up to 12 GPCs (Graphics Processing Clusters). That is 1.7x the GPC count of the Ampere GA102 GPU, which has 7. Each GPC will consist of 6 TPCs with 2 SMs each, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What’s changed is the FP32 & INT32 core configuration. Each SM will include 128 FP32 units, but the combined FP32+INT32 count will go up to 192. This is because the FP32 units are not shared with the INT32 units: the 128 FP32 cores are separate from the 64 INT32 cores.

So in total, each sub-core will consist of 32 FP32 plus 16 INT32 units for a total of 48 units. Each SM will have a total of 128 FP32 units plus 64 INT32 units for a total of 192 units. And since there are a total of 144 SM units (12 per GPC), we are looking at 18,432 FP32 units and 9,216 INT32 units for a total of 27,648 cores. Each SM will also include two warp schedulers (32 threads/clk) for 64 warps per SM. This is a 50% increase in cores (FP32+INT32) and a 33% increase in warps/threads per SM vs the GA102 GPU.

NVIDIA AD102 ‘Ada Lovelace’ Gaming GPU ‘SM’ Block Diagram (Image Credits: Kopite7kimi):

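| GPU Name | AD102 | vs GA102 | vs TU102 | vs GA100 | vs GH100 |
|---|---|---|---|---|---|
| GPC | 12 (Per GPU) | 1.7x | 2x | 1.5x | 1.5x |
| TPC | 6 (Per GPC) | Same | Same | 0.75x | 0.67x |
| SM | 2 (Per TPC) | Same | Same | Same | Same |
| Sub-Core | 4 (Per SM) | Same | Same | Same | Same |
| FP32 | 128 (Per SM) | Same | 2x | 2x | Same |
| FP32+INT32 | 192 (Per SM) | 1.5x | 1.5x | 1.5x | Same |
| Warps | 64 (Per SM) | 1.33x | 2x | Same | Same |
| Threads | 2048 (Per SM) | 1.33x | 2x | Same | Same |
| L1 Cache | 192 KB (Per SM) | 1.5x | 2x | Same | 0.75x |
| L2 Cache | 96 MB (Per GPU) | 16x | 16x | 2.4x | 1.6x |
| ROPs | 32 (Per GPC) | 2x | 2x | 2x | 2x |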
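The per-unit figures in the AD102 column of the table above can be multiplied out to full-chip totals. The snippet below is only a rough sketch of that arithmetic; the constant and function names are made up for illustration, and the counts are the rumored table values rather than confirmed specifications.

```python
# Per-unit counts taken from the AD102 column of the table above
# (rumored values, not confirmed specifications).
GPCS_PER_GPU = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
FP32_PER_SM = 128
FP32_INT32_PER_SM = 192
ROPS_PER_GPC = 32


def ad102_totals():
    """Multiply the per-unit counts out to full-chip totals."""
    sms = GPCS_PER_GPU * TPCS_PER_GPC * SMS_PER_TPC
    return {
        "sms": sms,
        "fp32": sms * FP32_PER_SM,
        # INT32 units are the difference between the combined and FP32 counts.
        "int32": sms * (FP32_INT32_PER_SM - FP32_PER_SM),
        "fp32_plus_int32": sms * FP32_INT32_PER_SM,
        "rops": GPCS_PER_GPU * ROPS_PER_GPC,
    }


if __name__ == "__main__":
    for name, value in ad102_totals().items():
        print(f"{name}: {value:,}")
```

With the table values this works out to 144 SMs, 18,432 FP32 units and 384 ROPs for a fully enabled AD102.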

Moving over to the cache, this is another segment where NVIDIA has given a big boost over the existing Ampere GPUs. The Ada Lovelace GPUs will pack 192 KB of L1 cache per SM, an increase of 50% over Ampere. That’s a total of 27 MB of L1 cache on the top AD102 GPU (144 SMs x 192 KB). The L2 cache will be increased to 96 MB as mentioned in the leaks. This is a 16x increase over the Ampere GPU that hosts just 6 MB of L2 cache. The cache will be shared across the GPU.

Finally, we have the ROPs which are also increased to 32 per GPC, an increase of 2x over Ampere. You are looking at up to 384 ROPs on the next-gen flagship versus just 112 on the fastest Ampere GPU, the RTX 3090 Ti. There are also going to be the latest 4th Generation Tensor and 3rd Generation RT (Raytracing) cores infused on the Ada Lovelace GPUs which will help boost DLSS & Raytracing performance to the next level. Overall, the Ada Lovelace AD102 GPU will offer:

1.7x GPCs (Versus Ampere)
50% More Cores (Versus Ampere)
50% More L1 Cache (Versus Ampere)
16x More L2 Cache (Versus Ampere)
Double The ROPs (Versus Ampere)
4th Gen Tensor & 3rd Gen RT Cores

NVIDIA GeForce RTX 4090 ‘Official’ Specifications

The NVIDIA GeForce RTX 4090 will use 128 of the 144 SMs for a total of 16,384 CUDA cores. The full GPU comes packed with 96 MB of L2 cache and a total of 384 ROPs, which is simply insane, but since the RTX 4090 is a cut-down design, it may feature slightly lower L2 and ROP counts. The clock speeds are not confirmed yet, but considering that the TSMC 4N process is being used, we are expecting clocks in the 2.0-3.0 GHz range.

As for memory specs, the GeForce RTX 4090 is expected to rock 24 GB of GDDR6X memory clocked at 21 Gbps across a 384-bit bus interface. This will provide up to 1 TB/s of bandwidth, the same as the existing RTX 3090 Ti graphics card. As far as power consumption is concerned, the TBP is said to be rated at 450W, which means that the TGP may end up lower than that. The card will be powered by a single 16-pin connector which delivers up to 600W of power. It is likely that we may get 500W+ custom designs as we saw with the RTX 3090 Ti.
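The 1 TB/s figure follows from the per-pin data rate and the bus width quoted above. As a quick sanity check, here is a minimal sketch of that calculation; the function name is just an illustrative choice.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) times bus width, over 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


if __name__ == "__main__":
    # 21 Gbps GDDR6X on a 384-bit bus, as quoted for the RTX 4090.
    bandwidth = memory_bandwidth_gb_s(21, 384)
    print(f"{bandwidth:.0f} GB/s")
```

For 21 Gbps across 384 bits this comes to 1008 GB/s, which is where the rounded 1 TB/s number comes from.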

NVIDIA GeForce RTX 4090 Graphics Cards Performance

As for the performance of these monster GPUs, we can only use theoretical numbers here since the launch is a bit far away, but based on what we know, the RTX 4090 series cards might be the first gaming cards to hit the 100 TFLOPs compute mark.

Just for comparison’s sake:

NVIDIA GeForce RTX 4090: 90 TFLOPs (FP32) (Assuming 2.8 GHz clock)
NVIDIA GeForce RTX 3090 Ti: 40 TFLOPs (FP32) (1.86 GHz Boost clock)
NVIDIA GeForce RTX 3090: 36 TFLOPs (FP32) (1.69 GHz Boost clock)

Based on a theoretical clock speed of 2.8 GHz, the RTX 4090’s 16,384 cores work out to roughly 92 TFLOPs, while a full 18,432-core AD102 would reach about 103 TFLOPs, and the rumors are suggesting even higher boost clocks. These definitely sound like peak clocks, similar to AMD’s peak frequencies which are higher than the average ‘Game’ clock. A 100+ TFLOPs compute figure means more than double the horsepower of the 3090 Ti flagship. One should keep in mind that compute performance doesn’t necessarily indicate overall gaming performance, but despite that, it will be a huge upgrade for gaming PCs and an 8.5x increase over the current fastest console, the Xbox Series X.
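The TFLOPs figures above come from the usual peak-throughput estimate: core count times two floating-point operations per clock (one fused multiply-add) times clock speed. The sketch below reproduces them under a few assumptions: the 2.8 GHz clock is the article's theoretical value, and the RTX 3090 Ti and RTX 3090 core counts (10,752 and 10,496) are not stated in this article and are filled in from their published specs.

```python
def fp32_tflops(cores: int, clock_ghz: float) -> float:
    """Peak FP32 TFLOPs, counting one fused multiply-add (2 FLOPs) per core per clock."""
    return cores * 2 * clock_ghz / 1000


# (core count, clock in GHz); the 2.8 GHz entries are theoretical clocks.
CARDS = {
    "RTX 4090 (2.8 GHz assumed)": (16384, 2.8),
    "Full AD102 (2.8 GHz assumed)": (18432, 2.8),
    "RTX 3090 Ti": (10752, 1.86),  # core count assumed from published specs
    "RTX 3090": (10496, 1.69),     # core count assumed from published specs
}

if __name__ == "__main__":
    for name, (cores, clock_ghz) in CARDS.items():
        print(f"{name}: {fp32_tflops(cores, clock_ghz):.1f} TFLOPs")
```

At 2.8 GHz the 16,384-core RTX 4090 lands near 92 TFLOPs, while the 103 TFLOPs figure corresponds to a fully enabled 18,432-core AD102 at the same clock.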

FP32 Compute Horsepower Comparisons in TFLOPs (Higher is Better):

RTX 4090: 90
RTX 3090 Ti: 40
RX 6900 XTX: 25
Xbox Series X: 12.1
PlayStation 5: 10.2

This will be a 2x compute performance uplift for each graphics card versus its predecessor, and this is without even factoring in the RT and Tensor core performance, which are expected to get major lifts too. FLOPs aren’t necessarily reflective of graphics or gaming performance, but they do provide a metric that can be used for comparison. A 2-2.5x gain over the RTX 3090 & RTX 3090 Ti would be very disruptive, and it makes sense why NVIDIA is going so hard with higher power limits on its cards.

Gamers should expect 4K gaming to be buttery smooth on these graphics cards and with DLSS, we might even see playable 60 FPS at 8K resolution which is something that NVIDIA has been trying to achieve with its RTX 3090 series BFGPUs for a while now.

NVIDIA GeForce RTX 4090 Graphics Cards Price & Availability

Now coming to the prices, the NVIDIA GeForce RTX 3090 Ti & RTX 3090 graphics cards are without a doubt the most expensive single-chip GPUs to date. The NVIDIA GeForce RTX 4090 is going to come at a price of $1599 US for the Founders Edition variant and will be available on the 12th of October.

The NVIDIA GeForce RTX 40 series graphics cards are rumored for a mid-July launch, and while we have seen cooler shrouds of the RTX 4090 Ti leak out earlier, NVIDIA could still release the non-Ti variant first with the RTX 4090 Ti variant hitting the market much later. This wouldn’t be the first time that NVIDIA releases a high-end SKU at the very start of a new generation. The RTX 2080 Ti flagship was launched with the rest of the lineup, even though its predecessor, the GTX 1080 Ti, appeared months after the launch of the initial lineup. The RTX 3090 launched with the initial line of RTX 30 series cards, but the 3090 Ti came more than a year later. This time, NVIDIA could launch the entire family from the start and go for a mid-cycle refresh later on, but that remains to be seen.

NVIDIA GeForce GPU Segment/Tier Prices

| Graphics Segment | 2014-2016 | 2016-2017 | 2017-2018 | 2018-2019 | 2019-2020 | 2020-2021 | 2021-2022 |
|---|---|---|---|---|---|---|---|
| Titan Tier | Titan X (Maxwell) | Titan X (Pascal) | Titan Xp (Pascal) | Titan V (Volta) | Titan RTX (Turing) | GeForce RTX 3090 | GeForce RTX 3090 Ti / GeForce RTX 3090 |
| Price | $999 US | $1199 US | $1199 US | $2999 US | $2499 US | $1499 US | $1999 US / $1499 US |
| Ultra Enthusiast Tier | GeForce GTX 980 Ti | GeForce GTX 980 Ti | GeForce GTX 1080 Ti | GeForce RTX 2080 Ti | GeForce RTX 2080 Ti | GeForce RTX 3080 Ti | GeForce RTX 3080 Ti |
| Price | $649 US | $649 US | $699 US | $999 US | $999 US | $1199 US | $1199 US |
| Enthusiast Tier | GeForce GTX 980 | GeForce GTX 1080 | GeForce GTX 1080 | GeForce RTX 2080 | GeForce RTX 2080 SUPER | GeForce RTX 3080 10 GB | GeForce RTX 3080 12 GB |
| Price | $549 US | $549 US | $549 US | $699 US | $699 US | $699 US | $999 US |
| High-End Tier | GeForce GTX 970 | GeForce GTX 1070 | GeForce GTX 1070 | GeForce RTX 2070 | GeForce RTX 2070 SUPER | GeForce RTX 3070 Ti / GeForce RTX 3070 | GeForce RTX 3070 Ti 16 GB |
| Price | $329 US | $379 US | $379 US | $499 US | $499 US | $599 / $499 | TBA |
| Mainstream Tier | GeForce GTX 960 | GeForce GTX 1060 | GeForce GTX 1060 | GeForce GTX 1060 | GeForce RTX 2060 SUPER / GeForce RTX 2060 / GeForce GTX 1660 Ti / GeForce GTX 1660 SUPER / GeForce GTX 1660 | GeForce RTX 3060 Ti / GeForce RTX 3060 12 GB | GeForce RTX 3060 Ti / GeForce RTX 3060 12 GB |
| Price | $199 US | $249 US | $249 US | $249 US | $399 US / $349 US / $279 US / $229 US / $219 US | $399 US / $329 US | $399 US / $329 US |
| Entry Tier | GTX 750 Ti / GTX 750 | GTX 950 | GTX 1050 Ti / GTX 1050 | GTX 1050 Ti / GTX 1050 | GTX 1650 SUPER / GTX 1650 | GTX 1650 SUPER / GTX 1650 | RTX 3050 |
| Price | $149 US / $119 US | $149 US | $139 US / $109 US | $139 US / $109 US | $159 US / $149 US | $159 US / $149 US | $249 US |

As of now, the rumors point to a mid-July launch, so we have to wait two more months to see how well that goes!

Which NVIDIA GeForce RTX 40 series graphics card are you looking forward to the most?


The post NVIDIA GeForce RTX 4090 “Ada Lovelace” Graphics Card Launched – 16384 Cores, 24 GB GDDR6X, 4X Faster Than RTX 3090 at $1599 US by Hassan Mujtaba appeared first on Wccftech.