Feature Request: Model-Agnostic Memory Layer (Global & Per-Project)
Problem
When using OpenCode across multiple LLM providers (Claude, Gemini, Kimi, OpenRouter, etc.), there's no persistent memory between sessions or across models. Each conversation starts fresh, losing valuable context about:
- User preferences and coding style
- Project-specific decisions and architecture
- Previously solved problems and their solutions
- Domain knowledge accumulated over time
Proposed Solution
Add a model-agnostic memory layer to OpenCode that works transparently regardless of which LLM provider is being used.
Key Features
- Two memory scopes:
  - Global memory (`~/.opencode/memory/`) - personal preferences, cross-project knowledge
  - Project memory (`.opencode/memory/`) - codebase-specific context, architecture decisions
Provider independence:
- Use local embedding model (e.g.,
Qwen3-Embedding-0.6B) - no API dependency - Local vector storage (LanceDB, SQLite, or similar)
-
- Optional: configurable "memory LLM" separate from main model (could be a cheap/fast model)
- Use local embedding model (e.g.,
- Transparent operation:
  - Automatically extract relevant facts from conversations
  - Inject relevant context before each LLM call
  - Works the same whether using Claude, Gemini, Kimi, or any other provider
Example Architecture
```
User Query
           │
           ▼
┌──────────────────────┐
│  Memory Retrieval    │ ◄── Local embeddings + vector search
│  (relevant context)  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Context Injection   │ ◄── Prepend relevant memories to prompt
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│   LLM Call (any)     │ ◄── Claude, Gemini, Kimi, etc.
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Memory Extraction   │ ◄── Extract facts from response (async)
│  (background task)   │
└──────────────────────┘
```
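To make the flow above concrete, here is a minimal TypeScript sketch of the four stages. Everything in it (`MemoryStore`, `answerWithMemory`, the function types) is hypothetical and not an existing OpenCode API; it only assumes some locally served embedding function plus whatever provider client is already configured.

```ts
// All names below are hypothetical -- a sketch, not existing OpenCode APIs.
interface MemoryEntry {
  id: string;
  scope: "global" | "project";
  text: string;         // the remembered fact, e.g. "prefers functional style"
  embedding: number[];  // produced by the local embedding model
  createdAt: string;
}

interface MemoryStore {
  search(queryEmbedding: number[], limit: number): Promise<MemoryEntry[]>;
  add(entry: Omit<MemoryEntry, "id" | "createdAt">): Promise<void>;
}

// Provider-agnostic call signature: the concrete client can be Claude,
// Gemini, Kimi, or any other provider OpenCode already supports.
type LlmCall = (prompt: string) => Promise<string>;

// Local embedding function (e.g. Qwen3-Embedding-0.6B served locally).
type Embed = (text: string) => Promise<number[]>;

async function answerWithMemory(
  userQuery: string,
  store: MemoryStore,
  embed: Embed,
  callLlm: LlmCall,
  extractFacts: (response: string) => Promise<string[]>, // cheap "memory LLM"
): Promise<string> {
  // 1. Memory retrieval: local embeddings + vector search.
  const queryEmbedding = await embed(userQuery);
  const memories = await store.search(queryEmbedding, 5);

  // 2. Context injection: prepend relevant memories to the prompt.
  const context = memories.map((m) => `- ${m.text}`).join("\n");
  const prompt = context
    ? `Relevant memories:\n${context}\n\nUser query:\n${userQuery}`
    : userQuery;

  // 3. LLM call: any provider; the memory layer does not care which.
  const response = await callLlm(prompt);

  // 4. Memory extraction: runs in the background so the reply is not delayed.
  void (async () => {
    for (const fact of await extractFacts(response)) {
      await store.add({ scope: "project", text: fact, embedding: await embed(fact) });
    }
  })();

  return response;
}
```

Because extraction runs as a background task, even a slow or very cheap extraction model adds no latency to the interactive response, and the provider used for the main call is irrelevant to the memory layer.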
Use Cases
- Preference persistence: "I prefer functional style" remembered across all projects and models
- Project context: Architecture decisions stay available even when switching models mid-project
- Cross-session continuity: Resume work without re-explaining context
- Knowledge accumulation: Build up domain expertise over time
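As a rough illustration of how the two scopes could back these use cases, here is a small path-resolution sketch. The helper and the example facts are made up; only the directory layout comes from the proposal above.

```ts
import * as os from "node:os";
import * as path from "node:path";

type MemoryScope = "global" | "project";

// Resolve where an entry would be persisted under the two proposed scopes.
function memoryDir(scope: MemoryScope, projectRoot: string): string {
  return scope === "global"
    ? path.join(os.homedir(), ".opencode", "memory") // preferences, cross-project knowledge
    : path.join(projectRoot, ".opencode", "memory"); // codebase-specific context, decisions
}

// "I prefer functional style" -> global scope: travels across projects and models
console.log(memoryDir("global", "/work/my-app"));  // e.g. /home/me/.opencode/memory
// "This service uses hexagonal architecture" -> project scope: stays with the repo
console.log(memoryDir("project", "/work/my-app")); // /work/my-app/.opencode/memory
```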
Configuration (example)
```toml
# opencode.toml
[memory]
enabled = true
global_enabled = true
project_enabled = true

# Optional: dedicated model for memory operations (cheap/fast)
extraction_model = "openai/gpt-4.1-mini"

# Local embedding model (no API needed)
embedding_model = "Qwen/Qwen3-Embedding-0.6B"
```
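A hypothetical typed counterpart of that table, to show how the options might surface in code (the interface and defaults are illustrative only, not an existing OpenCode type):

```ts
// Mirrors the [memory] table above; not an existing OpenCode type.
interface MemoryConfig {
  enabled: boolean;
  globalEnabled: boolean;
  projectEnabled: boolean;
  /** Optional dedicated model for memory extraction (cheap/fast). */
  extractionModel?: string; // e.g. "openai/gpt-4.1-mini"
  /** Local embedding model; no API key required. */
  embeddingModel: string;   // e.g. "Qwen/Qwen3-Embedding-0.6B"
}

// Possible defaults: memory stays opt-in, both scopes on once enabled.
const defaultMemoryConfig: MemoryConfig = {
  enabled: false,
  globalEnabled: true,
  projectEnabled: true,
  embeddingModel: "Qwen/Qwen3-Embedding-0.6B",
};
```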
Prior Art / Inspiration
- SimpleMem (https://github.com/aiming-lab/SimpleMem) - Efficient lifelong memory for LLM agents using semantic compression
- Mem0 (https://github.com/mem0ai/mem0) - Memory layer for AI applications
Additional Context
This would be especially valuable for users who:
- Switch between providers based on task (coding vs reasoning vs speed)
- Work on multiple projects with different contexts
- Want continuity without provider lock-in