
[Question]: With the vLLM backend and max_tokens disabled, the output is still truncated

Open xyk0930 opened this issue 10 months ago • 2 comments

Describe your problem

Scene description:

  1. RAGFlow version v0.16.0
  2. Xinference with vLLM backend
  3. max_tokens disabled
  4. the output will still be truncated


Xinference DEBUG log:

generate config: {'frequency_penalty': 0.7, 'presence_penalty': 0.4, 'temperature': 0.1, 'top_p': 0.3, 'stream': True, 'stop': ['<|end▁of▁sentence|>'], 'stop_token_ids': [151643]}

The generate config contains no max_tokens entry, but the output is still truncated. Is this because max_tokens has a default value, or because ignore_eos defaults to False?
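One way to narrow this down is to check which length-related keys are absent from the logged config and then set max_tokens explicitly, so that no backend default can apply. The sketch below is only an illustration: the helper names `missing_length_keys` and `with_explicit_max_tokens` are hypothetical, the value 4096 is an arbitrary example, and the `stop` string from the log is left out for brevity. As far as the vLLM documentation describes, `SamplingParams` has its own default for `max_tokens` when none is given and `ignore_eos` defaults to False; what Xinference fills in when the key is missing is not visible in the log above.

```python
from typing import Any, Dict, List

# Generate config as captured in the Xinference debug log
# (the 'stop' string is omitted here; it does not affect the length check).
logged_config: Dict[str, Any] = {
    "frequency_penalty": 0.7,
    "presence_penalty": 0.4,
    "temperature": 0.1,
    "top_p": 0.3,
    "stream": True,
    "stop_token_ids": [151643],
}

# Keys that influence when generation stops on length or EOS.
LENGTH_KEYS = ("max_tokens", "ignore_eos")


def missing_length_keys(config: Dict[str, Any]) -> List[str]:
    """Return the length-related keys that the config does not set."""
    return [key for key in LENGTH_KEYS if key not in config]


def with_explicit_max_tokens(config: Dict[str, Any], max_tokens: int) -> Dict[str, Any]:
    """Return a copy of config with max_tokens set if it was absent.

    Hypothetical helper: it never overwrites a value the caller already set,
    and it leaves the original dict untouched.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    merged = dict(config)
    merged.setdefault("max_tokens", max_tokens)
    return merged


if __name__ == "__main__":
    print("unset keys:", missing_length_keys(logged_config))
    print("patched config:", with_explicit_max_tokens(logged_config, 4096))
```

If the truncation disappears once an explicit max_tokens reaches the backend, a default limit is the likely cause; if it persists, the stop settings are the next thing to check.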

xyk0930 · Mar 04 '25 08:03