Timothy Liu

41 comments by Timothy Liu

The PC with the 3090 is using a 5600X with 32GB of DDR4 memory. The 3090 itself has 24GB GDDR6X video memory.

I did use the NGC TF container. What is the image size you used? From your script, it looks like it defaults to 128x128? That could be why it looks...

Will test it soon and update. I am not expecting much of a difference, however.

Update: it doesn't seem to affect the measured results. I have updated the README.

Actually, this IS pretty good, considering the 3090 has Tensor Cores, which in theory give it a much larger performance advantage, while the M1 seems to be using normal shader...

Batch size is kept smaller, in line with the batch size running on M1 Max. This is for an apples-to-apples comparison. I did note on the results that the 3090 would...

I had noted in the README that I have yet to update the batch size for the 3090. Previously I had run both at the same batch size, but after comments from...

I have updated the readme. I can't get his level of performance even at BS=256. I suspect Ross Wightman is using PyTorch and it ends up being more performant for...

Q1: I only have the 32GB version, so I cannot answer with absolute certainty. I observe that about 3GB is required for the OS etc., so in theory you would...

The Cluster HAT images each set a hostname that you can reach over the network (p1.local through p4.local). Are you using the Cluster HAT? Otherwise, how are you hooking everything up?
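
To check whether those hostnames are reachable from the controller, something like the following could help. It is only a minimal sketch: the function names `cluster_hostnames` and `resolve` are made up for illustration, and it assumes the machine running it can resolve `.local` names (for example through an mDNS service such as Avahi or Bonjour).

```python
import socket


def cluster_hostnames(count=4, prefix="p", domain="local"):
    """Build node names such as p1.local ... p4.local (hypothetical helper)."""
    return [f"{prefix}{i}.{domain}" for i in range(1, count + 1)]


def resolve(hostname):
    """Return the first address the system resolver gives, or None on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return infos[0][4][0]


if __name__ == "__main__":
    for name in cluster_hostnames():
        address = resolve(name)
        print(f"{name}: {address or 'not resolvable'}")
```

If a name comes back as not resolvable, the node may be powered off, or the machine running the script may not have `.local` name resolution set up.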