# feat: Add Group Chat Memory Feature support to Python SDK enhancing mem0

Open chaithanyak42 opened this issue 7 months ago • 1 comments

Description

This pull request introduces foundational support for group chat scenarios within the Mem0 Python SDK. Previously, Mem0 primarily associated memories with a single user_id, agent_id, or run_id. This change extends the core memory engine (vector store interactions and history tracking) to handle multiple participants within a shared conversation context.

Motivation: To broaden Mem0's applicability beyond 1-to-1 interactions and enable its use in multi-user environments like group chats or collaborative sessions.

Key Changes:

  1. New Identifiers: Introduced optional conversation_id and participant_id parameters to the primary API methods (add, search, get_all) in both Memory and AsyncMemory classes.
    • conversation_id: Defines the shared context for group memories.
    • participant_id: Identifies the specific contributor within a conversation_id.
  2. Dual-Mode Operation: Mem0 now operates in two modes based on the presence of conversation_id:
    • Group Mode: If conversation_id is provided, it becomes the primary scope. participant_id can optionally specify the contributor.
    • Classic Mode: If conversation_id is not provided, Mem0 retains its original behavior, scoping memories by user_id, agent_id, or run_id. This ensures backward compatibility.
  3. Centralized Scoping Logic: Implemented a helper function _build_filters_and_metadata in main.py to consistently manage how these identifiers are used for storing metadata in the vector store payload and constructing filters for querying in both modes. This promotes clarity and reduces redundancy.
  4. Vector Store Integration: Memory payloads now store conversation_id and participant_id when provided, allowing the vector store's filtering capabilities to correctly partition and retrieve memories based on conversation or participant scope.
  5. History Tracking (storage.py):
    • Schema Evolution: Updated SQLiteManager to robustly add conversation_id and participant_id columns to the history table using ALTER TABLE ADD COLUMN, ensuring backward compatibility with existing databases. The previous brittle schema migration logic was replaced.
    • Contextual History: Modified add_history to store these new identifiers. Updated get_history to retrieve them.
    • New Query Capability: Added get_history_by_conversation to allow fetching history records scoped to a specific conversation and optionally a participant, enhancing observability for group chats.
    • Refactoring: Improved database interaction logic via _execute_query helper and added explicit connection management (close, __del__).
  6. API Output Formatting: Refined the structure of memory items returned by get, get_all, and search to promote standard identifiers (user_id, conversation_id, etc.) to the top level and nest other custom fields under metadata for improved clarity.
  7. Async Bug Fix: Corrected the execution pattern in AsyncMemory.search to use asyncio.gather appropriately for concurrent async operations, resolving potential coroutine errors.
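
The scoping rules in items 2 to 4 can be sketched as a single pure function. This is a minimal illustration, not the PR's actual `_build_filters_and_metadata`: the function name, signature, and error message below are hypothetical, and the sketch assumes that classic identifiers are ignored once `conversation_id` is given.

```python
from typing import Any, Dict, Optional, Tuple


def build_filters_and_metadata(
    *,
    user_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    run_id: Optional[str] = None,
    conversation_id: Optional[str] = None,
    participant_id: Optional[str] = None,
    extra_metadata: Optional[Dict[str, Any]] = None,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (metadata_to_store, query_filters) for one call.

    Hypothetical stand-in for the helper described in the PR text.
    """
    metadata: Dict[str, Any] = dict(extra_metadata or {})
    filters: Dict[str, Any] = {}

    if conversation_id is not None:
        # Group mode: the conversation is the primary scope.
        metadata["conversation_id"] = conversation_id
        filters["conversation_id"] = conversation_id
        if participant_id is not None:
            # Optional narrowing to a single contributor.
            metadata["participant_id"] = participant_id
            filters["participant_id"] = participant_id
        # Assumption of this sketch: classic identifiers are ignored here.
        return metadata, filters

    # Classic mode: scope by whichever of the original identifiers were given.
    classic = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    provided = {key: value for key, value in classic.items() if value is not None}
    if not provided:
        raise ValueError(
            "Provide conversation_id or one of user_id, agent_id, run_id."
        )
    metadata.update(provided)
    filters.update(provided)
    return metadata, filters
```

Keeping the decision in one place means the same dictionaries feed both the stored payload and the query filter, so the two cannot drift apart between `add`, `search`, and `get_all`.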

Note: Support for group chat context within the Graph Memory component (graph_memory.py) is out of scope for this PR and will be addressed in future work. Calls involving graph operations (_add_to_graph, self.graph.search, etc.) within main.py remain unchanged in this PR and will operate based on the existing logic (primarily user_id based) if the graph is enabled.
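
For the history-tracking changes in item 5, the following is a rough sketch of an idempotent `ALTER TABLE ADD COLUMN` migration and a conversation-scoped query using the standard `sqlite3` module. The two-column legacy schema, the function bodies, and the returned column list are assumptions made for illustration; the real `SQLiteManager` in `storage.py` is not shown in this PR description.

```python
import sqlite3
from typing import List, Optional

# Columns the PR description says are added to the history table.
NEW_COLUMNS = ("conversation_id", "participant_id")


def ensure_history_columns(conn: sqlite3.Connection) -> None:
    """Add the new columns to `history` if missing; safe to call repeatedly."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(history)")}
    for column in NEW_COLUMNS:
        if column not in existing:
            # Column names come from the fixed tuple above, never from user input.
            conn.execute(f"ALTER TABLE history ADD COLUMN {column} TEXT")
    conn.commit()


def get_history_by_conversation(
    conn: sqlite3.Connection,
    conversation_id: str,
    participant_id: Optional[str] = None,
) -> List[tuple]:
    """Fetch history rows for one conversation, optionally for one participant."""
    query = (
        "SELECT memory_id, event, conversation_id, participant_id "
        "FROM history WHERE conversation_id = ?"
    )
    params = [conversation_id]
    if participant_id is not None:
        query += " AND participant_id = ?"
        params.append(participant_id)
    query += " ORDER BY rowid"
    return conn.execute(query, params).fetchall()


# Demo against a simplified, hypothetical legacy schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (memory_id TEXT, event TEXT)")
ensure_history_columns(conn)
ensure_history_columns(conn)  # second call finds both columns and does nothing
conn.execute("INSERT INTO history VALUES (?, ?, ?, ?)", ("m1", "ADD", "c1", "alice"))
rows = get_history_by_conversation(conn, "c1", participant_id="alice")
conn.close()
```

Checking `PRAGMA table_info` before altering is what lets the migration run against both old and already-migrated databases without raising a duplicate-column error.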

Fixes # (issue_number if applicable, otherwise remove this line)

Type of change

  • [x] New feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue - specifically the AsyncMemory.search concurrency)
  • [x] Refactor (does not change functionality, e.g. code style improvements, linting - specifically in storage.py)
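
The `AsyncMemory.search` fix ticked above is described only as using `asyncio.gather` for concurrent async operations; the corrected code itself is not shown here. As a generic illustration of that pattern, with hypothetical stand-in coroutines rather than mem0's real vector or graph calls:

```python
import asyncio
from typing import Any, Dict, List


async def search_vector_store(query: str) -> List[Dict[str, Any]]:
    # Hypothetical stand-in for an async vector-store lookup.
    await asyncio.sleep(0)
    return [{"memory": f"vector hit for {query!r}"}]


async def search_graph(query: str) -> List[Dict[str, Any]]:
    # Hypothetical stand-in for an async graph lookup.
    await asyncio.sleep(0)
    return [{"relation": f"graph hit for {query!r}"}]


async def search(query: str) -> Dict[str, Any]:
    # gather() runs both coroutines concurrently and returns their results
    # in the same order as the arguments, so unpacking is safe.
    results, relations = await asyncio.gather(
        search_vector_store(query),
        search_graph(query),
    )
    return {"results": results, "relations": relations}


output = asyncio.run(search("task"))
```

Every coroutine passed to `gather` is awaited, which avoids the "coroutine was never awaited" class of warnings that appears when a coroutine object is created but not driven to completion.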

How Has This Been Tested?

  • [x] Test Script (please provide)
import pytest
from mem0 import Memory

@pytest.fixture(autouse=True)
def clean_memory():
    config = {"vector_store": {"provider": "qdrant", "config": {"path": "/tmp/mem0_test"}}}
    m = Memory.from_config(config)
    run = "actor-demo"
    m.delete_all(run_id=run)
    return m, run

def test_add_and_retrieve_actor_id(clean_memory):
    m, run = clean_memory

    # Raw storage with infer=False must record actor_id
    m.add(
        {"role": "user", "name": "alice", "content": "Finish the report."},
        run_id=run,
        infer=False,
    )

    rows = m.get_all(run_id=run)["results"]
    # exactly one memory, and actor_id preserved
    assert len(rows) == 1
    rec = rows[0]
    assert rec["memory"] == "Finish the report."
    assert rec["actor_id"] == "alice"

def test_search_with_actor_filter(clean_memory):
    m, run = clean_memory

    # write two turns by different speakers
    m.add({"role": "user", "name": "alice", "content": "Alpha task"}, run_id=run, infer=False)
    m.add({"role": "user", "name": "bob", "content": "Beta task"}, run_id=run, infer=False)

    # search without filter returns both
    all_hits = m.search("task", run_id=run)["results"]
    assert {h["actor_id"] for h in all_hits} == {"alice", "bob"}

    # search *with* actor_id filter returns only alice
    alice_hits = m.search("task", run_id=run, filters={"actor_id": "alice"})["results"]
    assert len(alice_hits) == 1
    assert alice_hits[0]["actor_id"] == "alice"
    assert "Alpha task" in alice_hits[0]["memory"]

def test_add_without_name_defaults_none(clean_memory):
    m, run = clean_memory

    # messages without a "name" field → actor_id must be None or not present
    m.add({"role": "user", "content": "Orphan message"}, run_id=run, infer=False)

    rows = m.get_all(run_id=run)["results"]
    assert len(rows) == 1
    assert rows[0]["memory"] == "Orphan message"
    
    # Check that either actor_id is None or the key doesn't exist
    assert "actor_id" not in rows[0] or rows[0]["actor_id"] is None
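
The tests above read `actor_id` from the top level of each result, which matches the output formatting described in item 6 of Key Changes. A minimal sketch of that promotion step follows. The set of promoted keys, the `data` payload key, and the function name are assumptions for illustration; the PR text names only `user_id` and `conversation_id` explicitly.

```python
from typing import Any, Dict

# Assumed set of "standard" identifiers to promote to the top level.
PROMOTED_KEYS = (
    "user_id",
    "agent_id",
    "run_id",
    "conversation_id",
    "participant_id",
    "actor_id",
)


def format_memory_item(memory_id: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Promote standard identifiers and nest all other custom fields under metadata."""
    # Assumption: the stored memory text lives under the "data" key of the payload.
    item: Dict[str, Any] = {"id": memory_id, "memory": payload.get("data")}
    for key in PROMOTED_KEYS:
        if key in payload:
            item[key] = payload[key]
    extra = {
        key: value
        for key, value in payload.items()
        if key not in PROMOTED_KEYS and key != "data"
    }
    if extra:
        item["metadata"] = extra
    return item


example = format_memory_item(
    "m1",
    {"data": "Finish the report.", "run_id": "actor-demo", "actor_id": "alice", "topic": "work"},
)
```

With this shape, a caller can filter or group results by identifier without digging through `metadata`.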

Checklist:

  • [x] My code follows the style guidelines of this project
  • [x] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas (e.g., _build_filters_and_metadata, schema migration)
  • [x] I have made corresponding changes to the documentation (README examples, API docstrings)
  • [x] My changes generate no new warnings (relative to the original state)
  • [x] I have added tests (demo script) that prove my fix is effective or that my feature works
  • [x] New and existing unit tests pass locally with my changes (Mark x if applicable)
  • [x] Any dependent changes have been merged and published in downstream modules (N/A for this change)
  • [x] I have checked my code and corrected any misspellings

Maintainer Checklist

  • [ ] closes #xxxx (Replace xxxx with the GitHub issue number)
  • [ ] Made sure Checks passed

chaithanyak42 avatar May 12 '25 14:05 chaithanyak42

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar May 12 '25 14:05 CLAassistant

Hi @chaithanyak42, I noticed that the documentation says mem0 applies special handling to messages that include a name field, for example:

Formats messages as "Alice (user): content" for processing

However, based on this PR and the latest code, I don't see any such processing implemented. In particular, when infer=True, the name is not processed at all.

Have I missed something?

Iceber avatar Oct 10 '25 07:10 Iceber