Pierrick Hymbert

Results: 83 comments by Pierrick Hymbert

No worries at all, I just saw it sitting open on my to-do list for a while, so I preferred to close it. Thanks for your feedback, I understand. Reopened, no hurry.

@RafaAguilar @ngxson As you were part of the test-related discussion, do you feel OK with the proposed approach here? If so, I will continue with asynchronous request and...

> Great idea, thanks for starting this PR. Some suggestions: > > 1. Since the number of test cases is not very big, can we reduce number of files? (so...

@ggerganov @ngxson Any ideas on how to improve the prompt eval time on the GitHub runners? Should we give OpenBLAS a try?

> @phymbert Can you try this model instead? (pay attention to set `n_predict` or `max_tokens`, because the model never outputs EOS token) > > https://huggingface.co/ngxson/dummy-llama/blob/main/llama_xs_q4.bin > > I have no...

> > @Azeirah > Other than that, I think it's fine that the tests are in separate files. It's kinda just how behave is meant to be used, each feature...

@ggerganov I suggest going with this first version and seeing how it behaves on master. Note: sorry, I messed up the GitHub CI workflow and triggered a lot of jobs...

> Wow! Very nice work - this would be very useful and should help to improve `server` significantly > > > multi users with total number of tokens to predict...

> I will review this fully tomorrow, I'm a bit sick but I have energy when I plan it out. @Azeirah No worries, take care, it can wait for tomorrow...

> Btw, one thing that would greatly improve the state of `server` in terms of debugging issues is to add detailed logs. Things like incoming requests, parameters, batch info, etc....