paulcx

Results 51 comments of paulcx

> Sorry I was a bit unclear. If you set it to null, it will trigger the default value 👍 However, in some cases, if I set max_new_tokens to null,...

For me, max_new_tokens = max-total-tokens - actual input tokens is a very easy calculation to understand, right?

I don't understand why the max_new_tokens does work in /generate but the max_tokens does not work in v1/chat/completions. Are they not the same logic?

> Sorry @paulcx I don't follow 100%. > > In which way do you mean works? As in that it's not truncated in `/generate` but is truncated in `v1/chat/completions`? If...

> Sorry @paulcx I don't follow 100%. > > In which way do you mean works? As in that it's not truncated in `/generate` but is truncated in `v1/chat/completions`? If...

> Okay gotcha! Thanks for being elaborate on this 👍 The difference between `v1/chat/completions` and `/generate` is indeed a bit off. > > I'll ping @drbh I think he might...

> Hi, I think [this was a start](https://github.com/huggingface/text-generation-inference/pull/2652) but there seems to be some direction change @drbh? Any fix in docker image 2.4.0?

> 不用,只要删除账号重新登录 好像不行,点了登录页面没有添加成功

[here is my experiment](https://github.com/huggingface/transformers/pull/34191#issuecomment-2423249052) can confirm the fix still needs to be fixed.

> Can confirm it seems fixed after I enable loss_kwags, and the loss is looking great at bs16_ga4 vs before (1.7 vs 1.74) Would you mind summarize the solution for...