
Re blog DGX Spark + Mac M3 Ultra: why 10 Gb Ethernet, not USB/TB 40 Gb+? Why DGX Spark, not a low-VRAM, high-compute RTX?

Open ai-bits opened this issue 2 months ago • 1 comments

re DGX Spark + Mac M3 Ultra

  1. Why 10 Gb Ethernet rather than USB/Thunderbolt at 40 Gb/s or more in the first place, for maximum headroom?
  2. Why DGX Spark rather than an RTX card (5090) with less VRAM (not enough for prefill) but higher compute?

The setup looks rather contrived in favor of the DGX Spark. Any other box could provide at least TB4 transfer speed and way more compute at a better price!

Thx G.

ai-bits avatar Dec 20 '25 21:12 ai-bits

  1. No need to go higher than 10GbE for now, since we can already hide all communication perfectly by overlapping computation and communication.
  2. Doing this with an RTX card is possible. It has a lot less VRAM so we’d need to stream the model in while doing prefill. Should work.

For a more detailed analysis see https://blog.exolabs.net/nvidia-dgx-spark

AlexCheema avatar Dec 21 '25 01:12 AlexCheema
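The overlap argument in the reply above can be illustrated with a small timing model. This is only a sketch under assumptions that are not stated in the thread: work is split into equal per-layer steps, each step's output is sent over the link while the next step computes, and the per-step times below are made-up illustration values, not measurements. The function name `pipelined_time` is hypothetical.

```python
def pipelined_time(compute_s, transfer_s):
    """Total time when step i's transfer overlaps with step i+1's compute.

    compute_s[i]  : seconds to compute step i (steps run back to back)
    transfer_s[i] : seconds to send step i's output over the link
    Transfers are serialized on the link and cannot start before
    the step that produces their data has finished.
    """
    compute_done = 0.0   # time at which the current step finishes computing
    link_free = 0.0      # time at which the link finishes its last transfer
    for c, t in zip(compute_s, transfer_s):
        compute_done += c
        start = max(compute_done, link_free)
        link_free = start + t
    return link_free


def sequential_time(compute_s, transfer_s):
    """Total time with no overlap: compute everything, then send everything."""
    return sum(compute_s) + sum(transfer_s)


# Illustration values only (assumed, not measured).
steps = 4
fast_link = pipelined_time([1.0] * steps, [0.5] * steps)  # transfer < compute
slow_link = pipelined_time([1.0] * steps, [2.0] * steps)  # transfer > compute
no_overlap = sequential_time([1.0] * steps, [0.5] * steps)
print(fast_link, slow_link, no_overlap)
```

In this model, as long as each transfer takes no longer than the compute step it overlaps with, the total is the compute time plus one trailing transfer, so a faster link would not shorten the run. Once transfers take longer than compute, the link becomes the bottleneck.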

Just to clarify, the Spark has an advertised memory bandwidth of 273 GB/s (gigabytes, not gigabits), so streaming the weights of a ~100 GB model from VRAM is much faster than moving them over 80 Gb/s (about 10 GB/s) Thunderbolt.

Evanev7 avatar Dec 21 '25 11:12 Evanev7
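For a rough sense of scale, the unit conversion and streaming times in the last comment can be worked out as below. This is back-of-the-envelope arithmetic using the figures quoted in the thread (273 GB/s, 80 Gb/s, 10 Gb/s, ~100 GB model) as peak rates; it assumes no protocol overhead and full link utilization, so real transfers would be slower. The helper names are hypothetical.

```python
def gbit_to_gbyte(gbit_per_s):
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8


def stream_time_s(size_gb, bandwidth_gb_per_s):
    """Seconds to move size_gb gigabytes at the given peak bandwidth."""
    return size_gb / bandwidth_gb_per_s


MODEL_GB = 100          # approximate model size from the thread
SPARK_MEM_GB_S = 273    # advertised DGX Spark memory bandwidth, GB/s
THUNDERBOLT_GBIT = 80   # Thunderbolt figure from the thread, Gb/s
ETHERNET_GBIT = 10      # 10GbE, Gb/s

local = stream_time_s(MODEL_GB, SPARK_MEM_GB_S)
thunderbolt = stream_time_s(MODEL_GB, gbit_to_gbyte(THUNDERBOLT_GBIT))
ethernet = stream_time_s(MODEL_GB, gbit_to_gbyte(ETHERNET_GBIT))

print(f"local memory: {local:.2f} s per full pass over the weights")
print(f"Thunderbolt : {thunderbolt:.1f} s")
print(f"10GbE       : {ethernet:.1f} s")
```

Under these assumptions, reading the weights from local memory is roughly 27 times faster than pushing them across an 80 Gb/s link.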