34 comments by Gary Linscott

That's awesome that you've done that! I don't have time to debug it though :(. Sorry it's such a mess; it evolved over time, and initially JavaScript performance was better...

Yes, the original interface only auto-promotes to queen currently - thanks for filing the issue :).

It looks like the OpenCL drivers are not working. What does `clinfo` give?

That's really cool, thanks!! I will review later this week.

@syntithenai, very nice! Thanks for doing that, it exposes a much cleaner interface.

Got results for 7B, ctx=1024: `perplexity: 11.4921 [57/57]`, so that seems promising.

> Very useful work. I think this can be significantly made faster if we have the option for the eval method to return the logits even for the past tokens:...

> @glinscott How is it you're seeing `[x/114]`? With the default context size (512), I'm seeing `[x/649]`. But from the code that should only depend on `tokens.size() / params.n_ctx`, and...
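
To make the arithmetic in the quoted question concrete, here is a minimal sketch of how a chunk count of the form `tokens.size() / params.n_ctx` behaves. The helper name `count_chunks` and the token count used below are hypothetical, chosen only for illustration; this is not the actual code from the project.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original code): number of full
// context-sized chunks in a token stream. Integer division drops any
// trailing partial chunk. A zero context size is treated as "no chunks"
// to avoid dividing by zero.
std::size_t count_chunks(std::size_t n_tokens, std::size_t n_ctx) {
    return n_ctx == 0 ? 0 : n_tokens / n_ctx;
}
```

With an illustrative token count of 58368, `count_chunks(58368, 1024)` gives 57 and `count_chunks(58368, 512)` gives 114. Under this formula, a different total at the same context size would suggest a different number of tokens rather than a different setting, though that is only an inference from the arithmetic.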

One thing to note: I don't think `params.n_batch` has any effect right now. Adding support for it shouldn't be too hard, though. Can someone try adding this debugging...