Pierrick Hymbert
@ggerganov, finally, I would prefer not to go this way but to stop the generation at `n_ctx` with a warning, instead of printing a warning each time if `n_predict` is...
@ggerganov @slaren please have a look at this proposal
> Maybe it would be simpler to set `n_predict` to `n_ctx_train` by default if not set in the request. Yeah, that was the first version, but I find it noisy...
I see. I am OK with both solutions, even though setting `n_predict` all the time will be sort of a breaking change. AFAIK not all models hallucinate and...
> This would be simple if context shifting was opt-in, then there would always be a hard limit of `n_ctx` tokens. I am not sure that enabling context shift by...
> @ggerganov up to you, but we need to address this recurring infinite-loop concern somehow. @ggerganov I think with the removal of hard-coded stop tokens, this PR...
> > Maybe it would be simpler to set `n_predict` to `n_ctx_train` by default if not set in the request. > > Yes, let's do that. Context-shift has to be...
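For illustration, the agreed-upon defaulting behavior could be sketched roughly as follows (a minimal sketch with a hypothetical helper name, not the actual server code): an unset or negative `n_predict` falls back to `n_ctx_train`, so a model that never emits EOS cannot generate forever.

```python
def effective_n_predict(requested, n_ctx_train):
    """Return the token budget for a completion request.

    requested   -- n_predict from the request, or None / -1 for "unlimited"
    n_ctx_train -- the model's training context size, used as the default cap
    """
    if requested is None or requested < 0:
        # No explicit limit: cap at the training context to avoid infinite loops.
        return n_ctx_train
    # An explicit n_predict from the request is respected as-is.
    return requested

# Unlimited requests are capped at the training context:
print(effective_n_predict(None, 4096))  # 4096
print(effective_n_predict(-1, 4096))    # 4096
# An explicit n_predict is respected:
print(effective_n_predict(128, 4096))   # 128
```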
Thanks. Would you mind adding a tests.sh as we did in #6655?
Great work. As we discussed previously, server test coverage matters, and adding a new scenario to the test framework is mandatory.
> Are there already any tests that assert correctness for the server? I didn't see any so as part of this implementation I would try to add some. https://github.com/ggerganov/llama.cpp/tree/master/examples/server/tests
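To make the requirement concrete, a new test scenario should assert the property this thread is about: generation terminates at the cap even when the model never emits an EOS token. A toy sketch of that assertion (the `generate` helper here is hypothetical, standing in for a real server request in the test framework):

```python
def generate(n_predict, emits_eos=False):
    """Toy generation loop: stops on EOS or after n_predict tokens."""
    tokens = []
    for _ in range(n_predict):
        tokens.append("tok")
        if emits_eos:
            tokens.append("</s>")
            break
    return tokens

# Without EOS, generation is bounded by the n_predict cap
# instead of looping forever:
assert len(generate(32)) == 32
# With EOS, generation stops early as usual:
assert generate(5, emits_eos=True)[-1] == "</s>"
```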