Multi-user isolation doesn't exist? Library is functionally broken and eats TPM quota.
Hi Memori team,
I opened an earlier issue, #68, asking about per-user memories, and you reassured me that per-user memories do exist, as further supported by your multi-user scenario example. That issue was mainly to confirm the support and to inquire about the term used to identify individual users -- "namespaces" -- which are said to be isolated per user.
I'm opening a new issue because the repository currently advertises per-user isolation, but the shipping code can't deliver it.
How to reproduce (verbatim example)
- From repo root:
uv sync --extra dev
uv pip install python-dotenv fastapi uvicorn
uv run python examples/multiple-users/fastapi_multiuser_app.py
- In another terminal:
curl -s -X POST http://127.0.0.1:8000/chat -H 'Content-Type: application/json' -d '{"user_id":"alice","message":"alice: please remember that I like pizza"}'
curl -s -X POST http://127.0.0.1:8000/chat -H 'Content-Type: application/json' -d '{"user_id":"bob","message":"bob: remind me to call alice"}'
- Inspect the database:
sqlite3 fastapi_multiuser_memory.db "SELECT namespace, user_input FROM chat_history ORDER BY timestamp;"
Observed
- Both messages appear in both namespaces (fastapi_user_alice and fastapi_user_bob). There is no isolation.
- The DB explodes in size after just those two curls (e.g., ~8MB), filled with repeated "CONVERSATION CONTEXT" JSON blocks. -- This is likely fixed in #102
Analysis
- Each Memori.enable() registers a LiteLLM success callback into a shared global list. Every completion then triggers every registered callback, and each one calls record_conversation() using its own namespace. With two enabled instances, each chat is written twice, once into each namespace (see the minimal sketch after this list).
- record_conversation() schedules long-term processing, which itself prompts the LLM for summarization -- THIS IS A RECURSION. Because both instances record the same chat, both launch their own processing prompts, multiplying requests and writes. This is why the TPM quota and DB size blow up. Again, this is likely fixed in #102, but I am keeping it here in case this aspect of the issue was missed.
- The FastAPI example never routes a request to a specific Memori instance; it globally calls litellm.completion() and relies on side effects, which is exactly what causes the fan-out.
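To make the fan-out concrete, here is a minimal sketch of what I believe happens, without FastAPI in the picture. The `database_connect`/`namespace` constructor arguments are copied from the multi-user example and may not match the current API exactly; a configured OpenAI key is assumed.

```python
from memori import Memori
import litellm

# Two "isolated" users, one namespace each, as in the example app.
# The database_connect/namespace arguments mirror the example and are assumptions.
alice = Memori(database_connect="sqlite:///repro.db", namespace="fastapi_user_alice")
bob = Memori(database_connect="sqlite:///repro.db", namespace="fastapi_user_bob")

# enable() registers a success callback in LiteLLM's global callback list,
# so after these two calls that list contains one callback per instance.
alice.enable()
bob.enable()

# A single completion intended only for alice...
litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "alice: please remember that I like pizza"}],
)

# ...fires BOTH callbacks: each instance calls record_conversation() in its own
# namespace, so the same chat lands under fastapi_user_alice AND fastapi_user_bob,
# and each write then schedules its own long-term processing prompt.
```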
Impact
- Data leakage across "user" namespaces (privacy / tenancy risk).
- Corrupted memory due to duplicated short-/long-term rows.
- The README markets "Simple Multi-User" and "FastAPI Multi-User App" as working isolation examples. In practice, namespaces are just a column; no request-scoped routing or callback selection ensures isolation (see the sketch below for what that could look like). The examples are misleading.
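For comparison, a request-scoped pattern that would give real isolation might look roughly like the sketch below. This is only an illustration of the routing idea, not a patch: the Memori constructor arguments and the record_conversation() keyword arguments are guesses on my part.

```python
from fastapi import FastAPI
from pydantic import BaseModel
import litellm
from memori import Memori

app = FastAPI()
# One Memori instance per user, never enable()d, so no global callbacks fire.
memori_by_user: dict[str, Memori] = {}

class ChatRequest(BaseModel):
    user_id: str
    message: str

def get_memori(user_id: str) -> Memori:
    # Hypothetical per-request lookup keyed by user_id.
    if user_id not in memori_by_user:
        memori_by_user[user_id] = Memori(
            database_connect="sqlite:///fastapi_multiuser_memory.db",
            namespace=f"fastapi_user_{user_id}",
        )
    return memori_by_user[user_id]

@app.post("/chat")
def chat(req: ChatRequest):
    memori = get_memori(req.user_id)
    response = litellm.completion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": req.message}],
    )
    answer = response.choices[0].message.content
    # Record only into this request's namespace, instead of relying on globally
    # registered callbacks that fan out to every enabled instance.
    memori.record_conversation(user_input=req.message, ai_output=answer)
    return {"response": answer}
```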
Can the maintainers please confirm the issue?
Hello, I also encountered the same problem. When deploying as a Flask or FastAPI project, I ran into internal circular calls to the API: it ran for approximately 10 minutes and consumed over 10 million tokens, and the logs repeatedly showed "request timeout".
Also, the namespace has no effect: I did not modify the code at all, and the isolation still fails.
@slobodaapl @1002358072 Thanks for reporting this! We will look into this asap.
@Boburmirzo I have a reliable isolation solution architected for you, I'll contribute as soon as I'm able
@slobodaapl We are already working on multi-user and multi-assistant solution for memori, but we would also like to explore your solution, you can connect with us on discord !
Critical bug! The namespace isolation issue is a serious production blocker. Based on reproduction steps, the memory storage layer isn't properly scoping queries by namespace.
Root cause: Database queries lack WHERE clause filtering by user/namespace
Impact: Cross-user data leakage + TPM quota exhaustion from duplicate context
Fix needed:
- Add namespace column to all tables
- Update SELECT/INSERT to filter by namespace
- Database migration for existing data
- Integration tests for multi-tenant isolation
Glad @slobodaapl is working on a solution! Happy to review/test once ready.