Reactor Scram

221 comments by Reactor Scram

We can split this off if my issue on CPU is different, but on bbecf3f4 it seems to work with `cache_prompt` set to false? Whatever commit I was using before didn't work,...

Okay. I just happened to find my CPU inference issue by searching the issue tracker for "determinism" so I'm not sure if I should start a new issue or what,...

Okay cool. I don't know exactly how it's supposed to work but I assume "cache" means it should yield the same output but faster, right? I did notice the 2nd...

@ggerganov I tried un-commenting that line, but it doesn't seem to compile because the seed can only be set on the llama_context, which is server-wide, and the requests come to...

Yeah, if we can start DNS control and maybe a tunnel device without connlib, then that would be enough.

I think I was going to add something else to this and then smoke test it before opening

This is ready for early reviews. There are still some `unwrap`s I want to clean up, but I want to make sure I'm headed in the right direction. I can split...

@thomaseizinger Oh yeah should this go after the 1.4.0 Client is cut, so that we have one last Ubuntu 20.04 Client that supports the new protocol?

@ksparakis Yes, I think 2 weeks is too long to wait; we're trying to get this merged soon.

@thomaseizinger oh this is probably failing to merge because I deleted Ubuntu 20.04 from the test matrix. Should we force it and then we can update the required tests later?...