Hugh Perkins
Hmmm, I just tried monkey-patching nn.Sequential a bit. It seems like we only do forward prop, no backprop; is that right? Insert at line 12 of neural_style.lua: ``` function nn.Sequential:updateOutput(input) print('update...
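The idea in that comment (wrap `updateOutput` so you can see when forward prop runs) is a general monkey-patching trick. A minimal sketch in Python rather than Lua/Torch, since Torch isn't assumed here; the `Sequential` class and names below are stand-ins, not the real `nn.Sequential`:

```python
# Stand-in for nn.Sequential: holds layers, applies them in order.
class Sequential:
    def __init__(self, *layers):
        self.layers = layers

    def update_output(self, x):        # forward pass
        for layer in self.layers:
            x = layer(x)
        return x

calls = []

# Monkey-patch: keep a reference to the original method, then replace it
# with a wrapper that logs every forward call before delegating.
_orig_update_output = Sequential.update_output

def _logged_update_output(self, x):
    calls.append("forward")
    return _orig_update_output(self, x)

Sequential.update_output = _logged_update_output

net = Sequential(lambda x: x + 1, lambda x: x * 2)
out = net.update_output(3)   # (3 + 1) * 2 = 8
print(out, calls)            # 8 ['forward']
```

The same pattern in Lua Torch would save `nn.Sequential.updateOutput` to a local, redefine it with a `print`, and call the saved original.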
I managed to lower memory usage on CUDA from ~630MB to ~570MB by shuffling the weights back and forth to main memory. The 'real' time, output by...
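The weight-shuffling idea can be sketched abstractly: keep every layer's weights in host RAM and copy them onto the device only while that layer is actually computing, trading transfer time for a smaller peak footprint. A toy sketch with hypothetical names (real code would do actual CUDA copies, e.g. `:cuda()`/`:float()` in Lua Torch):

```python
# Simulated memories: host RAM holds all weights; "device" holds at most
# one layer's weights at a time.
host_weights = {f"layer{i}": [float(i)] * 4 for i in range(3)}
device = {}

def run_layer(name, x):
    device[name] = host_weights[name]   # host -> device copy
    y = x + sum(device[name])           # stand-in for the layer's real math
    del device[name]                    # free device memory immediately
    return y

x = 0.0
for name in sorted(host_weights):
    x = run_layer(name, x)
    assert len(device) == 0             # peak device usage is one layer's weights

print(x)   # 0 + 0*4 + 1*4 + 2*4 = 12.0
```

The wall-clock cost of the extra transfers is exactly why the comment goes on to talk about the 'real' time.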
Yes, maybe :-) I'm not planning on going further with this particularly. Just throwing the idea out there, in case someone is sufficiently motivated to take it further. Per my...
@zenpoet Nice avatar :-) @jcjohnson : > An even better workaround that I've played with a little bit for generating bigger images that look nice is to use a multistage...
Caveat: per Paul Graham, it's better to go deep and do one thing very well than to blur one's 'focus' across many things. I worry gently that if there are too many benchmarks then:...
> This seems useful. It requires initial model parameters to be dumped in some format and loaded into each framework, but it would help to ensure that all implementations are...
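Dumping initial parameters once and loading the same tensors into every framework sidesteps RNG differences entirely. A minimal sketch of the idea; JSON is used here as a lowest-common-denominator interchange format (an assumption — HDF5 or npy would be more practical for large tensors):

```python
# Generate initial parameters once with a fixed seed...
import json
import random

random.seed(1234)
params = {
    "fc1.weight": [random.gauss(0.0, 0.01) for _ in range(6)],
    "fc1.bias":   [0.0] * 3,
}

# ...serialize to a framework-neutral blob...
blob = json.dumps(params)

# ...and have each framework implementation load the identical values.
restored = json.loads(blob)
assert restored == params
print(len(restored["fc1.weight"]))   # 6
```

Every framework then starts from bit-identical initial weights, so output differences reflect the implementations rather than the initializers.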
> I guess getting the same stream of pseudo-random values in all different frameworks is more difficult than importing a set of tensors into all different frameworks. We wouldn't want...
> It would be interesting to log the power dissipation in each testcase I like this idea. A Titan draws 250 watts peak (I think?). 24 hours a day for a...
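The back-of-envelope arithmetic behind that 250-watt figure: running flat out around the clock comes to 6 kWh per day. The electricity price below is an assumption for illustration:

```python
# Titan at ~250 W peak, 24 hours a day.
watts = 250
hours = 24
kwh_per_day = watts * hours / 1000       # 6.0 kWh per day

usd_per_kwh = 0.12                       # assumed electricity price
cost_per_day = kwh_per_day * usd_per_kwh
print(kwh_per_day, round(cost_per_day, 2))   # 6.0 0.72
```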
> Agreed. Soumith's current benchmarks are useful, but they mainly evaluate "who can make the thinnest wrapper around cuDNN, Neon, or similar?" To be fair, cuDNN and Neon are competing with...
> just use some kind of flops/watt metric Well, the ideal would be joules per batch. But I think this will be tricky to measure. Might need some specialized hardware...
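"Joules per batch" is just average power times wall-clock time per batch; the measurement of average power is the tricky part the comment alludes to. The numbers below are purely illustrative assumptions:

```python
# Joules per batch = average power (W) x seconds per batch (s).
avg_power_watts = 220.0     # assumed average draw while the batch runs
seconds_per_batch = 0.35    # assumed measured wall-clock time per batch
joules_per_batch = avg_power_watts * seconds_per_batch
print(joules_per_batch)     # 77.0
```

Without specialized metering hardware, one practical approximation is polling the GPU's reported board power (e.g. via NVML) during the batch and averaging, though that misses CPU and system draw.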