ushers in a new level of performance at the top of the stack, with the RTX 4090 besting the previous generation by 52% on average in our rasterization benchmarks and by 70% in our ray tracing benchmarks, both at 4K, naturally. The 4090 now sits comfortably atop our GPU benchmarks hierarchy and ranks as one of the best graphics cards around, at least if you have deep pockets.
Unfortunately, the step down from the 4090 to the RTX 4080 is rather precipitous, dropping performance by 23% for rasterization and 30% for ray tracing. Dropping down another level to the new RTX 4070 Ti knocks an additional 22% off the performance relative to the 4080. If you’re keeping track (and we definitely like to keep score), that means the third-string Ada card sporting the AD104 GPU is slower than the previous generation 3090 Ti, never mind Nvidia’s claims to the contrary that rely on benchmarks using DLSS 3’s Frame Generation.
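For anyone who wants to check the scorekeeping, here is a minimal Python sketch that chains the percentages quoted above. It assumes the 52% and 70% gains are measured against the 3090 Ti, and that the 22% drop from the 4080 to the 4070 Ti applies equally to rasterization and ray tracing; the function and constant names are ours, purely for illustration.

```python
def relative_performance(*factors):
    """Multiply a chain of relative performance factors together."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result

# Percentages quoted in the text, all at 4K.
GAIN_4090_RASTER = 1.52   # 4090 vs. previous generation (assumed: 3090 Ti)
GAIN_4090_RT = 1.70
STEP_4080_RASTER = 1 - 0.23   # 4080 relative to 4090
STEP_4080_RT = 1 - 0.30
STEP_4070_TI = 1 - 0.22       # 4070 Ti relative to 4080 (assumed same for both)

raster = relative_performance(GAIN_4090_RASTER, STEP_4080_RASTER, STEP_4070_TI)
ray_tracing = relative_performance(GAIN_4090_RT, STEP_4080_RT, STEP_4070_TI)

print(f"4070 Ti vs. previous generation, rasterization: {raster:.2f}x")
print(f"4070 Ti vs. previous generation, ray tracing:   {ray_tracing:.2f}x")
```

Under those assumptions both ratios land a little above 0.9, which is where the "slower than the 3090 Ti" conclusion comes from.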
Perhaps more alarming with the RTX 4070 Ti is that it only has a 192-bit memory interface. It still has 12GB of GDDR6X memory, and the large L2 cache in general means that the narrower bus isn’t a deal killer, but things don’t look so good as we eye future lower-tier RTX 40-series parts like the 4060 and 4050.
Nvidia recently announced the full line of RTX 40-series laptop GPUs, ranging from the RTX 4090 mobile that uses the AD103 GPU (basically a mobile 4080) down to the anemic-sounding RTX 4050. Here’s the full list of specs for the mobile parts.
| Graphics Card | RTX 4090 for Laptops | RTX 4080 for Laptops | RTX 4070 for Laptops | RTX 4060 for Laptops | RTX 4050 for Laptops |
| --- | --- | --- | --- | --- | --- |
| Process Technology | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N |
| Die size (mm^2) | 378.6 | 294.5 | ? | ? | ? |
| Ray Tracing “Cores” | 76 | 58 | 36 | 24 | 20 |
| Boost Clock (MHz) | 1455-2040 | 1350-2280 | 1230-2175 | 1470-2370 | 1605-2370 |
| VRAM Speed (Gbps) | 18? | 18? | 18? | 18? | 18? |
| VRAM Bus Width (bits) | 256 | 192 | 128 | 128 | 96 |
| TFLOPS FP32 (Boost) | 28.3-39.7 | 20.0-33.9 | 11.3-20.0 | 9.0-14.6 | 8.2-12.1 |
| TFLOPS FP16 (FP8) | 226-318 (453-635) | 160-271 (321-542) | 91-160 (181-321) | 72-116 (145-233) | 66-97 (131-194) |
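The FP32 figures in the table can be reproduced from the ray tracing core counts and boost clocks. The sketch below is a rough check, not Nvidia's published method. It assumes one ray tracing core per SM, 128 shaders per SM (the ratio implied by the desktop 4070 Ti's 7680 CUDA cores across 60 SMs), and two floating-point operations per shader per clock. The helper name `fp32_tflops` is made up for this example.

```python
SHADERS_PER_SM = 128          # assumption: 7680 CUDA cores / 60 SMs on the 4070 Ti
FLOPS_PER_SHADER_CLOCK = 2    # one fused multiply-add counts as two operations

def fp32_tflops(sm_count, clock_mhz):
    """Theoretical FP32 throughput in TFLOPS for a given SM count and clock."""
    # shaders * ops per clock * MHz gives MFLOPS; divide by 1e6 for TFLOPS.
    return sm_count * SHADERS_PER_SM * FLOPS_PER_SHADER_CLOCK * clock_mhz / 1_000_000

# (SM count taken from the ray tracing core row, min boost MHz, max boost MHz)
LAPTOP_GPUS = {
    "RTX 4090 for Laptops": (76, 1455, 2040),
    "RTX 4080 for Laptops": (58, 1350, 2280),
    "RTX 4070 for Laptops": (36, 1230, 2175),
    "RTX 4060 for Laptops": (24, 1470, 2370),
    "RTX 4050 for Laptops": (20, 1605, 2370),
}

for name, (sms, low_mhz, high_mhz) in LAPTOP_GPUS.items():
    low = fp32_tflops(sms, low_mhz)
    high = fp32_tflops(sms, high_mhz)
    print(f"{name}: {low:.1f}-{high:.1f} TFLOPS")
```

The wide ranges come entirely from the boost clock spread, which depends on the power limit each laptop maker chooses.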
It’s a reasonably safe bet that the desktop RTX 4070 will use the same AD104 as the RTX 4070 Ti, just with fewer SMs and shaders. Desktop RTX 4060 Ti, assuming we get that anytime soon, may or may not use AD104; the only other option would presumably be the AD106 GPU used in the mobile 4070/4060. And that’s a problem.
The previous generation RTX 3060 Ti came with 8GB of GDDR6 on a 256-bit interface. We weren’t particularly pleased with the lack of VRAM, especially when AMD started shipping RX 6700 XT (and later 6750 XT) with 12GB VRAM. Nvidia basically did a course correction with the RTX 3060 and gave it 12GB VRAM, making it a nice step up from the previous RTX 2060 — and even the 2060 eventually saw 12GB models, though prices made them mostly unattractive.
Now we’re talking about the RTX 4060 most likely going back to 8GB, and that would suck. There are plenty of games now that can exceed 8GB of VRAM use, and that number will only grow in the next two years. But Nvidia doesn’t have many other options, since GDDR6 and GDDR6X memory capacities top out at 2GB per 32-bit channel, which caps a 128-bit interface at four chips and 8GB.
There’s potential to do “clamshell” mode with two memory chips per channel, one on each side of the PCB, but that’s pretty messy and not something we’d expect to see in a mainstream GPU. That could get the 128-bit interface up to 16GB of VRAM, which again would be odd as the higher-tier parts like the 4070 Ti only have 12GB. Still, that sounds better than an RTX 4060 8GB model to me!
And what about the RTX 4050? Maybe Nvidia will stick with the 128-bit interface on the AD106 GPU and just skip using AD107 on a desktop part. That’s basically what happened with GA107, which was almost exclusively used for the laptop RTX 3050. But if it does try to use AD107 in a desktop card, it would only have up to 6GB of VRAM given the 96-bit interface listed for the laptop RTX 4050, again with clamshell VRAM being a potential out.
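The capacity math behind these VRAM concerns is simple enough to sketch in a few lines of Python. This assumes the limits described above: one memory chip per 32-bit channel, 2GB per chip at most, and two chips per channel in clamshell mode. The function name `max_vram_gb` is hypothetical.

```python
CHANNEL_WIDTH_BITS = 32   # each GDDR6/GDDR6X chip sits on a 32-bit channel
MAX_GB_PER_CHIP = 2       # largest capacity available per chip, per the text

def max_vram_gb(bus_width_bits, clamshell=False):
    """Maximum VRAM for a bus width, optionally with two chips per channel."""
    channels = bus_width_bits // CHANNEL_WIDTH_BITS
    chips_per_channel = 2 if clamshell else 1
    return channels * chips_per_channel * MAX_GB_PER_CHIP

for bus_width in (192, 128, 96):
    normal = max_vram_gb(bus_width)
    doubled = max_vram_gb(bus_width, clamshell=True)
    print(f"{bus_width}-bit: {normal}GB, or {doubled}GB in clamshell mode")
```

That gives 12GB for the 192-bit 4070 Ti, 8GB (16GB clamshell) for a 128-bit part, and 6GB (12GB clamshell) for a 96-bit part.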
It’s not just the memory capacities that raise some concern. We said in the RTX 4070 Ti review that performance wasn’t bad, but it also wasn’t amazing. It’s basically a cheaper take on an RTX 3090, with half the VRAM and lower power use. The 4070 Ti gets by with 60 Streaming Multiprocessors (SMs) and 7680 CUDA cores (GPU shaders), slightly more than the outgoing RTX 3070 Ti. But AD106 could top out at just 40 SMs, maybe even 36 SMs, which would put it in similar territory to the RTX 3060 Ti on core counts, leaving only GPU clocks as a performance boost.
Put those two things together — insufficient VRAM and relatively minor increases in GPU shader counts — and we’re likely looking at modest performance improvements compared to the previous Ampere generation GPUs.
Nvidia will then trot out DLSS 3 performance improvements, which only apply to a subset of games and also don’t offer true performance increases, and things start to sound even worse. Part of the benefit of having a GPU that can run games at 120 fps today is that, as games get more demanding, it will still be able to do 60 fps in most games a few years down the road. But what happens when those aren’t real framerates?
Let’s assume a game running at 120 fps courtesy of DLSS 3’s Frame Generation technology, with a base performance of 70 fps. All is well and good for now, but down the road the base performance will drop below 40 fps as games become more demanding, and eventually it will fall below 30 fps. What we’ve experienced is that Frame Generation with a base framerate of less than 30 fps still feels like sub-30 fps, even if the monitor is getting twice as many frame updates per second.
That same logic applies to higher framerates as well, so DLSS 3 at 120 fps with a 70 fps base will still feel like 70 fps, even if it looks a bit smoother to the eye. Most people won’t be able to tell the difference between input rates at 70 samples per second and inputs at 120 samples per second. But when you start to fall below 40, even non-professional gamers will start to feel the difference.
Or to put it more bluntly: DLSS 3 and Frame Generation are no panacea. They can help smooth out the visuals and maybe improve the feel of games a bit, but the benefit isn’t going to be as noticeable as actual fully rendered frames with new user input factored in, particularly as performance drops below 60 fps.
That’s not to say it’s a bad technology — it’s quite clever actually — and we don’t mind that it exists. But Nvidia needs to stop comparing DLSS 3 scores against non-DLSS 3 results and acting like they’re the same thing. Take the base framerate before Frame Generation and add maybe 10–20 percent and that’s what a game feels like, not the 60–100 percent higher fps that benchmarks will show.
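To put numbers on that rule of thumb, here is a small Python sketch using the earlier example of a 70 fps base running at 120 fps with Frame Generation. The 10 to 20 percent bonus is our subjective estimate from the paragraph above, not a measured quantity, and `felt_fps_range` is a name invented for this illustration.

```python
def felt_fps_range(base_fps, low_bonus=0.10, high_bonus=0.20):
    """Rough 'feels like' range: base framerate plus a 10-20 percent bonus."""
    return base_fps * (1 + low_bonus), base_fps * (1 + high_bonus)

base_fps = 70        # performance before Frame Generation
displayed_fps = 120  # what a benchmark would report with Frame Generation

low, high = felt_fps_range(base_fps)
benchmark_uplift = (displayed_fps / base_fps - 1) * 100

print(f"Benchmark shows {displayed_fps} fps ({benchmark_uplift:.0f}% higher)")
print(f"Feels like roughly {low:.0f}-{high:.0f} fps")
```

In this example the benchmark reports about a 71% gain, while the estimated feel is closer to 77 to 84 fps.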
Back to the topic at hand, the future mainstream and budget RTX 40-series GPUs will no doubt beat the existing models in pure performance, and they’ll offer DLSS 3 support as well. Hopefully Nvidia will return to prices closer to the previous generation, though, because if the RTX 4060 costs $499 and the RTX 4050 costs $399, they’re going to end up being minor upgrades compared to the existing cards at those price points.