Kyle Herndon
The device I'm using has approximately 200GB of memory. I updated the filebin with two additional files. I halved the number of attention layers in the model so the model...
Added two more files to the [filebin](https://filebin.net/u4gmgdsh5s6ks6lr) with just one attention layer, and it finally did run. I would think this would put an upper bound on the remaining additional...
Same general error when running with those flags, at least on 405b. @aviator19941 said he would try 70b.