syhw
Pretty convinced that it's because the code/signature of the DLL injection is similar to this trojan. @tscmoo would know all about it.
@tlikhomanenko is correct. I should add that an epoch running 8 times faster doesn't always mean 8 times faster convergence, but with fewer than 16-32 GPUs it pretty much does.
Why not... I'll give it a night's thought. Originally I only listed results I trusted well, but now I've started to accept and list results from barely published papers. I guess the...
See #32
Yes, see #32 :)
Feel free to do a PR, otherwise I'll eventually get to it in 2018.
It's 16k only for the base (pretrained) Code Llama 70B @michaelroyzen