Getting Error in new_retrieved_facts: Expecting value: line 1 column 1 (char 0)
mem0ai 1.0.0b0
My config is as follows:

config = {
    "llm": {
        "provider": "vllm",
        "config": {
            "model": "qwen-tool",
            "vllm_base_url": "http://192.168.150.2:8006/v1",
            "temperature": 0.1,
            "max_tokens": 256,
            "api_key": "test"
        }
    },
    "embedder": {
        "provider": "langchain",
        "config": {
            "model": embeddings,
        }
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "test",
            "host": "localhost",
            "port": 6333,
            "embedding_model_dims": 1024  # must match the embedding model's dimension
        }
    },
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://localhost:7687",
            "username": "neo4j",
            "password": "20250923",
        },
    },
    # version 1.0 supports reranking
    "reranker": {
        "provider": "huggingface",
        "config": {
            "model": "/home/zy0065-lr/code/models/bge-reranker-base",
            "local_files_only": True
        }
    },
    # enable tool-calling support
    "tools": {
        "enabled": True,
        "function_call_format": "json"  # match vLLM's function-call format
    },
    "history_db_path": "history.db"
}
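For context, this config is passed to mem0 roughly like the minimal sketch below (the text and user_id are placeholders); the error above shows up during the add call, where mem0 parses the LLM's JSON output for new_retrieved_facts:

from mem0 import Memory

# build the Memory instance from the config dict above
m = Memory.from_config(config)

# the "Expecting value: line 1 column 1 (char 0)" error surfaces here,
# when mem0 tries to parse the LLM response as JSON for new_retrieved_facts
m.add("I prefer vegetarian food and live in Shanghai.", user_id="test_user")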
Hey @ymshuang thanks for reaching out and really sorry for the trouble. Can you share the specs of the setup where vLLM is running?
Thanks for the reply, I have solved the problem. It was caused by using Qwen3-8B: vLLM returns the generation together with the thinking output. Once I disabled thinking mode, the problem was solved.
I also suggest adding a config parameter to choose whether to use thinking mode or not, because many models are thinking models.
Hey @ymshuang thanks for pointing it out, will definitely incorporate this in upcoming releases.
Hey @parshvadaftari, I would love to work on this issue; could you please assign it to me?
Feel free to work on it @Vedant817
Hey @parshvadaftari, I have created a PR #3626. Please review it and let me know if any further changes are required.
I'm running into the same problem. Can I ask how you stopped the thinking mode of Qwen3-8B? This appears to be an error caused by mem0 calling the model internally, so it isn't possible to disable thinking during the response generation phase.
Just stop thinking mode when you use the LLM. You can add the parameter to the original mem0 code as follows:

response = self.client.chat.completions.create(
    model=self.config.model,
    messages=messages,
    temperature=self.config.temperature,
    max_tokens=self.config.max_tokens,
    response_format=response_format,
    extra_body={"stop": ["<think>", "</think>"]}
)
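As an alternative, if you are serving Qwen3 with a reasonably recent vLLM (assuming your version forwards chat_template_kwargs), you can disable thinking through the chat template instead of relying on stop tokens:

response = self.client.chat.completions.create(
    model=self.config.model,
    messages=messages,
    temperature=self.config.temperature,
    max_tokens=self.config.max_tokens,
    response_format=response_format,
    # asks the Qwen3 chat template to skip the <think> block entirely
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)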
Hey @ymshuang thanks for the suggestion, feel free to work on it. This issue is still up for grabs. We want something that can be set from the config when using vllm, ollama, or lmstudio as the provider.
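For reference, a config-level toggle along these lines might look like the sketch below; the enable_thinking key is hypothetical and not an existing mem0 option, and each provider would have to map it to its own mechanism (e.g. vLLM's chat_template_kwargs, or the equivalent for ollama / lmstudio):

config = {
    "llm": {
        "provider": "vllm",
        "config": {
            "model": "qwen-tool",
            "vllm_base_url": "http://192.168.150.2:8006/v1",
            "enable_thinking": False,  # hypothetical key: provider translates it into its own "no thinking" flag
        }
    }
}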
Hey @parshvadaftari, I have created a PR https://github.com/mem0ai/mem0/pull/3643. Please review it and let me know if any further changes are required.
Hey @ymshuang can you please incorporate the requested changes? I've reviewed it and it looks good overall.