
CoreWeave Proves NVIDIA GB300 NVL72 Outpaces H100 With 6X GPU Throughput

by ytools

CoreWeave has put the spotlight on NVIDIA's new GB300 NVL72, a rack-scale Blackwell Ultra system, and the results are staggering. In benchmarks running the DeepSeek R1 reasoning model, the GB300 NVL72 delivered 6X higher throughput per GPU than the previous-generation H100.
Even more striking: a workload that previously required a cluster of 16 H100s can now be handled by just 4 GB300 GPUs.

The secret behind this leap lies in the architecture. By cutting tensor parallelism from 16-way down to just 4-way, the GB300 configuration drastically reduces inter-GPU communication overhead, while its massive memory capacity and high memory bandwidth carry the weight of large AI models. NVLink and NVSwitch interconnects push aggregate GPU-to-GPU bandwidth to 130 TB/s across the rack, and the NVL72 rack-scale system offers 37 TB of memory, scalable up to 40 TB.
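The tensor-parallelism point can be illustrated with a back-of-the-envelope model. The sketch below is an assumption-laden illustration, not CoreWeave's actual benchmark methodology: it uses the standard ring all-reduce cost model, a Megatron-style count of two all-reduces per transformer layer per token, and a hypothetical hidden size of 7168.

```python
# Back-of-the-envelope model of tensor-parallel communication cost.
# Assumptions (not from the article): ring all-reduce, Megatron-style
# sharding with two all-reduces per transformer layer per token, and a
# hypothetical hidden size of 7168.

def allreduce_bytes_per_gpu(hidden_size: int, tp: int, dtype_bytes: int = 2) -> float:
    """Bytes each GPU sends for one ring all-reduce of a hidden-size vector."""
    # A ring all-reduce moves 2*(p-1)/p of the tensor through each participant.
    return 2 * (tp - 1) / tp * hidden_size * dtype_bytes

def ring_steps(tp: int) -> int:
    """Latency-bound step count: (p-1) reduce-scatter + (p-1) all-gather steps."""
    return 2 * (tp - 1)

def per_layer_traffic(hidden_size: int, tp: int, dtype_bytes: int = 2) -> float:
    """Per-token all-reduce bytes per GPU for one layer (two all-reduces)."""
    return 2 * allreduce_bytes_per_gpu(hidden_size, tp, dtype_bytes)

for tp in (16, 4):
    print(f"TP={tp:2d}: {per_layer_traffic(7168, tp):8.0f} B/layer/token, "
          f"{ring_steps(tp)} ring steps per all-reduce")
```

Under this toy model, the byte savings from 16-way to 4-way are modest, but each all-reduce drops from 30 ring steps to 6, which matters for the small, latency-sensitive messages of token-by-token inference.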

For enterprises, this translates into faster token generation, lower latency, and efficiency gains that don't just scale performance but also cut costs. CoreWeave's demo makes it clear: the GB300 isn't just a raw-TFLOPS monster; it's a more elegant, less fragmented way to run complex AI workloads. H100 clusters still hold value, but the Blackwell-powered GB300 marks a generational shift, simplifying training, inference, and scaling in a way that's hard to ignore.


2 comments

Byter September 3, 2025 - 2:02 pm

deepseek already jumping ship to greener pastures 👀

Rooter November 3, 2025 - 2:06 am

Xeon dat dogshit compared to this 🤣
