Nvidia GB300 NVL72 Posts Record Performance Gain - +45% on DeepSeek R1 vs GB200
Nvidia has officially released MLPerf benchmark results for its new rack-scale system, the Blackwell Ultra GB300 NVL72, reporting a 45% increase in DeepSeek R1 inference throughput compared to the previous-generation GB200. The system combines an updated architecture, faster tensor units, and a number of software-level optimizations, which allowed Nvidia to take first place in all key tests, including Llama 3.1 405B, Llama 3.1 8B, and Whisper.
While GB200 systems are still being deployed in data centers worldwide, the new GB300, built on the Blackwell Ultra architecture, goes further by offering significantly higher GPU-to-GPU bandwidth: 130 TB/s of aggregate NVLink bandwidth across the rack, with 1.8 TB/s links connecting each of the 72 GPUs. This allows even the largest language models to scale efficiently while maintaining stable latency under heavy request volumes.
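The 130 TB/s figure follows directly from the per-GPU numbers quoted above; a quick sanity check of the arithmetic:

```python
# Aggregate NVLink bandwidth of the rack, from the figures in the article:
# 72 GPUs, each exposing 1.8 TB/s of NVLink bandwidth.
num_gpus = 72
per_gpu_tbps = 1.8

aggregate_tbps = num_gpus * per_gpu_tbps
print(f"{aggregate_tbps:.1f} TB/s")  # 129.6 TB/s, rounded to ~130 TB/s in Nvidia's marketing
```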
The key drivers of this growth were the updated tensor cores, which deliver 2x faster attention operations and 50% more FLOPS for AI workloads, along with extensive use of the NVFP4 format for weight quantization. This reduces model size without loss of accuracy and speeds up computation, particularly in inference workloads such as DeepSeek R1.
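To illustrate the idea behind this kind of quantization: NVFP4 stores weights as 4-bit floating-point values (an E2M1 grid of magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) with a shared scale per small block of values. The sketch below is a minimal NumPy simulation of blockwise 4-bit quantization, not Nvidia's implementation; the block size of 16 and the max-to-6.0 scaling rule are illustrative assumptions.

```python
import numpy as np

# Representable magnitudes of an FP4 E2M1 value; each 4-bit code
# is a sign bit plus one of these eight magnitudes.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_blockwise(weights, block_size=16):
    """Simulate blockwise FP4 quantization: each block of `block_size`
    values shares one scale, chosen so the block's largest magnitude
    maps to 6.0 (the top of the FP4 grid)."""
    flat = weights.reshape(-1, block_size)
    scales = np.abs(flat).max(axis=1, keepdims=True) / 6.0
    scales[scales == 0] = 1.0                       # avoid divide-by-zero on all-zero blocks
    scaled = flat / scales
    # Snap each scaled value to the nearest FP4 grid point, keeping the sign.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    quant = np.sign(scaled) * FP4_GRID[idx]
    return quant, scales

def dequantize(quant, scales, shape):
    return (quant * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(64, 64)).astype(np.float32)  # toy weight matrix
q, s = quantize_fp4_blockwise(w)
w_hat = dequantize(q, s, w.shape)
max_err = np.abs(w - w_hat).max()
print(f"max abs reconstruction error: {max_err:.5f}")
```

The payoff is storage: each weight shrinks from 16 or 32 bits to 4 bits plus a small per-block scale, while per-block scaling keeps the reconstruction error proportional to each block's own magnitude rather than the whole tensor's.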
According to Nvidia, these improvements could make Blackwell Ultra the core building block of so-called "AI factories," where performance per watt directly affects the profitability of inference. The company claims GB300 delivers up to 5x the throughput of Hopper-generation accelerators, a notable claim given the growing competition from AMD and Huawei in the AI accelerator segment.
With GB300 shipments set to begin this month, the release of record-breaking MLPerf results looks like part of Nvidia's strategic campaign to cement its leadership in enterprise AI.
