
issue on activating thinking mode with lmdeploy

yijunCai opened this issue on Sep 04 '25 · 2 comments

Hi, thanks for sharing the InternVL3.5 series! The thinking mode can be activated by setting the system prompt when running inference with transformers, but how should it be done when running offline inference with lmdeploy (using lmdeploy's pipeline API)? Specifically, how should I change the following code to activate thinking mode?

```python
from lmdeploy import pipeline, PytorchEngineConfig
from lmdeploy.vl import load_image

pipe = pipeline("OpenGVLab/InternVL3_5-30B-A3B",
                backend_config=PytorchEngineConfig(session_len=max_length, tp=2))
input = [(prompt, load_image(image_url))]
responses = pipe(input)
```

— yijunCai, Sep 04 '25

```python
from lmdeploy import GenerationConfig

# R1_SYSTEM_PROMPT is the thinking-mode system prompt from the InternVL3.5 model card.
content = [{'type': 'text', 'text': question}]
messages = [
    dict(role='system', content=R1_SYSTEM_PROMPT),
    dict(role='user', content=content),
]
out = pipe(messages, gen_config=GenerationConfig(do_sample=True, temperature=0.6))
print(out.text)
```
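For the original multimodal case, the same system-prompt pattern should extend to a user turn that carries both text and an image. Below is a minimal sketch, assuming lmdeploy accepts OpenAI-style `image_url` content entries (check the lmdeploy VLM docs for the exact schema); `max_length`, `prompt`, `image_url`, and `R1_SYSTEM_PROMPT` are as defined above:

```python
from lmdeploy import pipeline, GenerationConfig, PytorchEngineConfig

# Same pipeline construction as in the question.
pipe = pipeline("OpenGVLab/InternVL3_5-30B-A3B",
                backend_config=PytorchEngineConfig(session_len=max_length, tp=2))

messages = [
    # Thinking-mode system prompt, as in the reply above.
    dict(role='system', content=R1_SYSTEM_PROMPT),
    # Text plus image in a single user turn, OpenAI-style.
    dict(role='user', content=[
        {'type': 'text', 'text': prompt},
        {'type': 'image_url', 'image_url': {'url': image_url}},
    ]),
]
out = pipe(messages, gen_config=GenerationConfig(do_sample=True, temperature=0.6))
print(out.text)
```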

— JoshonSmith, Sep 05 '25


OK, this works, thanks! I also have a follow-up question: I want to get the logits of the final output (after "</think>"). Following the lmdeploy docs, I tried setting gen_config=GenerationConfig(output_logits='generation') (together with the recommended do_sample=True, temperature=0.6 for thinking mode), but response.logits in the pipe() output seems to be None. How can I get the logits of the final output properly?
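Concretely, the attempt looks like the sketch below. The slicing step at the end is hypothetical: it assumes `response.token_ids` and `pipe.tokenizer` are available in this lmdeploy version and that `</think>` encodes to a single token, neither of which is guaranteed for the VL pipeline:

```python
from lmdeploy import GenerationConfig

gen_config = GenerationConfig(do_sample=True, temperature=0.6,
                              output_logits='generation')
out = pipe(messages, gen_config=gen_config)

if out.logits is None:
    # Observed behavior: logits come back as None.
    print('no logits returned')
else:
    # One row of vocabulary logits per generated token. Keep only the
    # rows after the </think> marker, i.e. the final answer.
    end_id = pipe.tokenizer.encode('</think>', add_bos=False)[-1]  # hypothetical lookup
    cut = out.token_ids.index(end_id) + 1
    answer_logits = out.logits[cut:]
```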

— yijunCai, Sep 11 '25