
Error when using long-term memory with a local embedding model

Open xtanitfy opened this issue 2 months ago • 5 comments

Code as follows:

```python
from agentscope.model import OpenAIChatModel
from agentscope.tool import Toolkit
from agentscope.embedding import OpenAITextEmbedding
from agentscope.memory import Mem0LongTermMemory
from agentscope.message import Msg
import asyncio

from mem0.vector_stores.configs import VectorStoreConfig

vector_store_config = VectorStoreConfig(
    provider="qdrant",
    config={
        # "path": "./test_qdrant_db",
        "on_disk": False,
        "embedding_model_dims": 4096,
    }
)

# Basic usage example
async def basic_usage():
    """Basic usage example."""
    # Record a memory

    g_model = OpenAIChatModel(
        model_name="yd_qwen3-14B",  # Can be arbitrary; the actual model is decided by the vLLM server
        api_key="EMPTY",  # vLLM does not need an API key, but a value must be provided
        stream=False,
        client_args={
            "base_url": "http://192.168.11.111:5432/v1"  # Set base_url via client_args
        },
        generate_kwargs={
            "temperature": 0.7,
            "max_tokens": 30000,
        }
    )

    long_term_memory = Mem0LongTermMemory(
        agent_name="Friday",
        user_name="user_123",
        model=g_model,
        embedding_model=OpenAITextEmbedding(
            model_name="Qwen3-Embedding-8B",
            api_key="",
            base_url="http://192.168.11.111:5432/v1"
        ),
        on_disk=False,
        vector_store_config=vector_store_config,
    )

    # long_term_memory = Mem0LongTermMemory(
    #     agent_name="Friday",
    #     user_name="user_123",
    #     model=g_model,
    #     embedding_model=OpenAITextEmbedding(
    #         model_name="bge-m3",
    #         api_key="",
    #         base_url="http://192.168.11.111:5432/v1"
    #     ),
    #     on_disk=False,
    #     # vector_store_config=vector_store_config,
    # )

    await long_term_memory.record(
        [Msg("user", "我喜欢住民宿", "user")]  # "I like staying in homestays"
    )

    # Retrieve memories
    results = await long_term_memory.retrieve(
        [Msg("user", "我的住宿偏好", "user")],  # "my accommodation preferences"
    )
    print(f"检索结果: {results}")  # "Retrieval results: ..."

asyncio.run(basic_usage())
```

The error log is as follows:

```
Error in new_retrieved_facts: Expecting value: line 1 column 1 (char 0)
------------------: {'input': ['"我的住宿偏好"'], 'model': 'Qwen3-Embedding-8B', 'dimensions': 1024, 'encoding_format': 'float'}
Traceback (most recent call last):
  File "/home/shawn/samba/Workspace/RAG_agent/agentscope-main/src/agentscope/memory/_mem0_utils.py", line 202, in embed
    response = asyncio.run(_async_call())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/shawn/samba/Workspace/RAG_agent/agentscope-main/src/agentscope/memory/_mem0_utils.py", line 199, in _async_call
    response = await self.agentscope_model(text_list)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/samba/Workspace/RAG_agent/agentscope-main/src/agentscope/embedding/_openai_embedding.py", line 95, in __call__
    response = await self.client.embeddings.create(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/site-packages/openai/resources/embeddings.py", line 251, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/site-packages/openai/_base_client.py", line 1594, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': 'Model "Qwen3-Embedding-8B" does not support matryoshka representation, changing output dimensions will lead to poor results.', 'type': 'BadRequestError', 'param': None, 'code': 400}}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/shawn/samba/Workspace/RAG_agent/AgentsProject/TESTS/test_agentscope/test_memory1.py", line 73, in <module>
    asyncio.run(basic_usage())
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/shawn/samba/Workspace/RAG_agent/AgentsProject/TESTS/test_agentscope/test_memory1.py", line 68, in basic_usage
    results = await long_term_memory.retrieve(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/samba/Workspace/RAG_agent/agentscope-main/src/agentscope/memory/_mem0_long_term_memory.py", line 564, in retrieve
    result = await self.long_term_working_memory.search(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/site-packages/mem0/memory/main.py", line 1534, in search
    original_memories = await vector_store_task
                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/site-packages/mem0/memory/main.py", line 1553, in _search_vector_store
    embeddings = await asyncio.to_thread(self.embedding_model.embed, query, "search")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/miniforge3/envs/agent/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shawn/samba/Workspace/RAG_agent/agentscope-main/src/agentscope/memory/_mem0_utils.py", line 215, in embed
    raise RuntimeError(
RuntimeError: Error generating embedding using agentscope model: Error code: 400 - {'error': {'message': 'Model "Qwen3-Embedding-8B" does not support matryoshka representation, changing output dimensions will lead to poor results.', 'type': 'BadRequestError', 'param': None, 'code': 400}}
```

xtanitfy avatar Oct 29 '25 09:10 xtanitfy

The default `dimensions` value in `OpenAITextEmbedding` is 1024. Try changing it to 4096. This issue might be related to vLLM itself; you can reference this issue for more context.
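To make the mismatch concrete, here is a minimal sketch in plain Python (no agentscope required) of the request payload shown in the `------------------:` debug line of the log. The field names follow the OpenAI embeddings API; `build_embedding_request` is a hypothetical helper for illustration, not agentscope's actual internals:

```python
def build_embedding_request(text, model, dimensions=1024):
    """Mimic the embedding request kwargs shown in the error log."""
    return {
        "input": [text],
        "model": model,
        "dimensions": dimensions,  # agentscope's default is 1024
        "encoding_format": "float",
    }

# Default: asks the server to truncate Qwen3-Embedding-8B's native
# 4096-dim output to 1024 dims -> vLLM rejects it, since the model
# does not support matryoshka (truncatable) representations.
bad = build_embedding_request("我的住宿偏好", "Qwen3-Embedding-8B")

# Suggested fix: match the model's native dimensionality, and make sure
# the vector store's embedding_model_dims agrees (4096 in the repro).
good = build_embedding_request("我的住宿偏好", "Qwen3-Embedding-8B", dimensions=4096)

print(bad["dimensions"], good["dimensions"])  # 1024 4096
```

Note that `embedding_model_dims` in the `VectorStoreConfig` must match whatever `dimensions` the embedding endpoint actually returns, or Qdrant inserts will fail.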

qbc2016 avatar Oct 29 '25 10:10 qbc2016

vLLM command:

```
CUDA_VISIBLE_DEVICES=0 python -m vllm.entrypoints.openai.api_server \
    --model="/diskb/Bigmodels/modescope/Qwen3-Embedding-8B" \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.4 \
    --dtype half \
    --served-model-name "Qwen3-Embedding-8B" \
    --host 0.0.0.0 \
    --port 5321 \
    --max-model-len 1024 \
    --hf_overrides '{"matryoshka_dimensions":[1024]}'
```

Stuck here and it's not moving forward:

```
(agent) shawn@rg4208v4:~/samba/Workspace/RAG_agent/AgentsProject/TESTS/test_agentscope$ python test_memory1.py
Error in new_retrieved_facts: Expecting value: line 1 column 1 (char 0)
------------------: {'input': ['"我的住宿偏好"'], 'model': 'Qwen3-Embedding-8B', 'dimensions': 1024, 'encoding_format': 'float'}
检索结果:
```

xtanitfy avatar Oct 30 '25 02:10 xtanitfy

I'm considering wrapping a Mem0LongTermMemory myself and implementing it with OpenMemory (https://github.com/caviraoss/openmemory.git); I'm still thinking it over. I've already successfully tested its MCP server as a tool in AgentScope. However, that still doesn't allow multiple agents to share a single memory store.

xtanitfy avatar Oct 30 '25 03:10 xtanitfy

This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.

github-actions[bot] avatar Nov 20 '25 09:11 github-actions[bot]

The root cause is that agentscope includes the `dimensions` parameter in the request by default when calling the embedding model, but bge-m3 deployed with vLLM does not accept a `dimensions` parameter in the request. Could the default `dimensions` parameter in agentscope be removed? (see here)

[screenshot] @qbc2016
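As a hedged workaround sketch until the default changes: drop the `dimensions` field from the request kwargs before they are sent, since the OpenAI embeddings API treats it as optional and servers like vLLM-hosted bge-m3 reject it. The `strip_dimensions` helper below is hypothetical, not an agentscope or mem0 API; the idea is to apply it wherever the kwargs are built (e.g. in a small subclass of the embedding wrapper):

```python
def strip_dimensions(kwargs: dict) -> dict:
    """Return a copy of the embedding request kwargs without `dimensions`,
    so servers that reject the parameter (e.g. bge-m3 on vLLM) accept it."""
    return {k: v for k, v in kwargs.items() if k != "dimensions"}

# Request shape reconstructed from the debug line in the log above.
kwargs = {
    "input": ["我的住宿偏好"],
    "model": "bge-m3",
    "dimensions": 1024,
    "encoding_format": "float",
}

cleaned = strip_dimensions(kwargs)
print("dimensions" in cleaned)  # False
```

With the parameter removed, the server returns embeddings at the model's native dimensionality, so `embedding_model_dims` in the vector store config must be set to that native size.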

Tendo33 avatar Nov 24 '25 08:11 Tendo33

This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.

github-actions[bot] avatar Dec 15 '25 09:12 github-actions[bot]