Maks

4 comments by Maks

Just realized I'm not deepcopying the cached state 😅 Will fix.

Fixed that. Also fixed a corner case where it caches so much that it has no new tokens to generate from (such as when hitting Regenerate).
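To illustrate the two fixes mentioned above, here is a minimal sketch of a prefix state cache. It is not the actual patch: the `PrefixStateCache` class, its `lookup`/`store` methods, and the dict-shaped state are all hypothetical names chosen for illustration, and falling back to a full recompute when the whole prompt is already cached is an assumption about one reasonable way to handle the Regenerate case.

```python
import copy


class PrefixStateCache:
    """Hypothetical cache of a recurrent model state, keyed by the tokens that produced it."""

    def __init__(self):
        self.tokens = []   # tokens already folded into self.state
        self.state = None  # model state after consuming self.tokens

    def store(self, tokens, state):
        # Deep-copy on the way in so later in-place updates by the caller
        # cannot corrupt the cached state.
        self.tokens = list(tokens)
        self.state = copy.deepcopy(state)

    def lookup(self, prompt):
        """Return (state_or_None, tokens_still_to_feed) for a new prompt."""
        n = len(self.tokens)
        hit = self.state is not None and n > 0 and list(prompt[:n]) == self.tokens
        if not hit:
            return None, list(prompt)
        if n == len(prompt):
            # Corner case: the entire prompt is cached (e.g. Regenerate), so
            # there would be no new token to feed the model. Assumption: fall
            # back to recomputing from scratch instead of reusing the cache.
            return None, list(prompt)
        # Deep-copy on the way out so generation does not mutate the cache.
        return copy.deepcopy(self.state), list(prompt[n:])
```

With this shape, a caller would feed only the returned remaining tokens through the model, starting from the returned state (or from a fresh state when it is `None`).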

Thanks! The increase in tokens/s that you see is caused by the fact that ooba measures `number of tokens generated / total time of inference` without taking into account that...

Raven is what I tested it with. There's no reason why you can't call `forward` manually. If you're referring to this message BlinkDL posted on Discord: > 1. Never call...