paulcx comments

Results 51 comments of


                                            paulcx

Output truncated when max_tokens is None

> Sorry I was a bit unclear. If you set it to null, it will trigger the default value 👍 However, in some cases, if I set max_new_tokens to null,...

Output truncated when max_tokens is None

For me, max_new_tokens = max-total-tokens - actual input tokens is a very easy calculation to understand, right?

Output truncated when max_tokens is None

I don't understand why the max_new_tokens does work in /generate but the max_tokens does not work in v1/chat/completions. Are they not the same logic?

Output truncated when max_tokens is None

> Sorry @paulcx I don't follow 100%. > > In which way do you mean works? As in that it's not truncated in `/generate` but is truncated in `v1/chat/completions`? If...

Output truncated when max_tokens is None

> Sorry @paulcx I don't follow 100%. > > In which way do you mean works? As in that it's not truncated in `/generate` but is truncated in `v1/chat/completions`? If...

Output truncated when max_tokens is None

> Okay gotcha! Thanks for being elaborate on this 👍 The difference between `v1/chat/completions` and `/generate` is indeed a bit off. > > I'll ping @drbh I think he might...

Output truncated when max_tokens is None

> Hi, I think [this was a start](https://github.com/huggingface/text-generation-inference/pull/2652) but there seems to be some direction change @drbh? Any fix in docker image 2.4.0?

2025年04月20日更新修复

> 不用，只要删除账号重新登录好像不行，点了登录页面没有添加成功

New GA fix causes training loss multiple times higher across the board (5x to 10x higher)

[here is my experiment](https://github.com/huggingface/transformers/pull/34191#issuecomment-2423249052) can confirm the fix still needs to be fixed.

New GA fix causes training loss multiple times higher across the board (5x to 10x higher)

> Can confirm it seems fixed after I enable loss_kwags, and the loss is looking great at bs16_ga4 vs before (1.7 vs 1.74) Would you mind summarize the solution for...