exo
Re blog DGX Spark + Mac M3 Ultra: why 10 Gb Ethernet, not USB/TB 40 Gb+? Why DGX Spark, not a low-VRAM, high-compute RTX?
- Why 10 Gb Ethernet rather than USB/Thunderbolt at 40 Gb/s or more in the first place, for maximum headroom?
- Why DGX Spark rather than an RTX card (e.g. 5090) with higher compute but less VRAM (not enough for prefill)?
The setup looks rather constructed to favor the DGX Spark. Any other box could provide at least TB4 transfer speed and far more compute at a better price!
Thx G.
- No need to go higher than 10GbE for now, since we can already hide all communication perfectly by overlapping computation and communication.
- Doing this with an RTX card is possible. It has a lot less VRAM so we’d need to stream the model in while doing prefill. Should work.
For a more detailed analysis see https://blog.exolabs.net/nvidia-dgx-spark
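To make the overlap argument a bit more concrete, here is a rough sketch of the condition under which a link stops mattering: if sending one step's data takes no longer than computing the next step, the transfer is fully hidden and a faster link buys nothing. The function name, the per-step payload size, and the per-step compute time are all made-up illustrative values, not measurements from the blog post.

```python
def transfer_is_hidden(bytes_per_step, link_gbit_per_s, compute_s_per_step):
    """Return (hidden, comm_seconds) for one pipelined step.

    Assumes ideal overlap: the transfer for step i runs while step i+1
    is being computed, and the link delivers its nominal rate.
    """
    link_bytes_per_s = link_gbit_per_s * 1e9 / 8  # gigabits/s -> bytes/s
    comm_s = bytes_per_step / link_bytes_per_s
    return comm_s <= compute_s_per_step, comm_s


# Hypothetical numbers, for illustration only:
# 50 MB sent per step, 60 ms of compute per step.
PAYLOAD_BYTES = 50e6
COMPUTE_S = 0.060

for name, gbit in [("10GbE", 10), ("TB4 40 Gb/s", 40)]:
    hidden, comm_s = transfer_is_hidden(PAYLOAD_BYTES, gbit, COMPUTE_S)
    print(f"{name}: transfer {comm_s * 1e3:.0f} ms, hidden={hidden}")
```

With these assumed values both links finish inside the compute window, so the step time is set by compute in either case. If the payload grew or the compute per step shrank, the check would flip and the faster link would start to pay off.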
Just to clarify, the Spark has an advertised memory bandwidth of 273 GB/s (gigabytes, not gigabits), so streaming the weights of a ~100 GB model from VRAM is much faster than streaming them over 80 Gb/s (i.e. 10 GB/s) Thunderbolt.
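The unit mix-up is easy to check with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted in this thread (273 GB/s, 80 Gb/s, 10 Gb/s, ~100 GB model) and treats them as ideal peak rates, ignoring protocol overhead, so real numbers would be somewhat worse.

```python
MODEL_GB = 100  # approximate model size in gigabytes, as quoted above

# Peak rates in gigabytes per second; link rates are converted from gigabits.
rates_gb_per_s = {
    "DGX Spark memory (advertised)": 273.0,
    "Thunderbolt 80 Gb/s": 80 / 8,
    "10GbE": 10 / 8,
}

# Time for one full pass over the weights at each rate.
seconds = {name: MODEL_GB / rate for name, rate in rates_gb_per_s.items()}

for name, s in seconds.items():
    print(f"{name}: {s:.2f} s per pass over the weights")
```

Under these assumptions a pass over the weights from local memory takes well under a second, versus about 10 s over an 80 Gb/s link and about 80 s over 10GbE.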