Feature Request: Model-Agnostic Memory Layer (Global & Per-Project)
Problem
When using OpenCode across multiple LLM providers (Claude, Gemini, Kimi, OpenRouter, etc.), there's no persistent memory between sessions or across models. Each conversation starts fresh, losing valuable context about:
- User preferences and coding style
- Project-specific decisions and architecture
- Previously solved problems and their solutions
- Domain knowledge accumulated over time
Proposed Solution
Add a model-agnostic memory layer to OpenCode that works transparently regardless of which LLM provider is being used.
Key Features
- Two memory scopes:
  - Global memory (`~/.opencode/memory/`) - personal preferences, cross-project knowledge
  - Project memory (`.opencode/memory/`) - codebase-specific context, architecture decisions
Provider independence:
- Use local embedding model (e.g.,
Qwen3-Embedding-0.6B) - no API dependency - Local vector storage (LanceDB, SQLite, or similar)
-
- Optional: configurable "memory LLM" separate from main model (could be a cheap/fast model)
- Use local embedding model (e.g.,
- Transparent operation:
  - Automatically extract relevant facts from conversations
  - Inject relevant context before each LLM call
  - Works the same whether using Claude, Gemini, Kimi, or any other provider
Example Architecture
```
User Query
           │
           ▼
┌──────────────────────┐
│  Memory Retrieval    │ ◄── Local embeddings + vector search
│  (relevant context)  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Context Injection   │ ◄── Prepend relevant memories to prompt
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│   LLM Call (any)     │ ◄── Claude, Gemini, Kimi, etc.
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│  Memory Extraction   │ ◄── Extract facts from response (async)
│  (background task)   │
└──────────────────────┘
```
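To make the flow above concrete, here is a minimal TypeScript sketch of the four stages. Everything in it (`MemoryStore`, `answerWithMemory`, the function types) is hypothetical and not an existing OpenCode API; it only assumes some locally served embedding function plus whatever provider client is already configured.

```ts
// All names below are hypothetical -- a sketch, not existing OpenCode APIs.
interface MemoryEntry {
  id: string;
  scope: "global" | "project";
  text: string;         // the remembered fact, e.g. "prefers functional style"
  embedding: number[];  // produced by the local embedding model
  createdAt: string;
}

interface MemoryStore {
  search(queryEmbedding: number[], limit: number): Promise<MemoryEntry[]>;
  add(entry: Omit<MemoryEntry, "id" | "createdAt">): Promise<void>;
}

// Provider-agnostic call signature: the concrete client can be Claude,
// Gemini, Kimi, or any other provider OpenCode already supports.
type LlmCall = (prompt: string) => Promise<string>;

// Local embedding function (e.g. Qwen3-Embedding-0.6B served locally).
type Embed = (text: string) => Promise<number[]>;

async function answerWithMemory(
  userQuery: string,
  store: MemoryStore,
  embed: Embed,
  callLlm: LlmCall,
  extractFacts: (response: string) => Promise<string[]>, // cheap "memory LLM"
): Promise<string> {
  // 1. Memory retrieval: local embeddings + vector search.
  const queryEmbedding = await embed(userQuery);
  const memories = await store.search(queryEmbedding, 5);

  // 2. Context injection: prepend relevant memories to the prompt.
  const context = memories.map((m) => `- ${m.text}`).join("\n");
  const prompt = context
    ? `Relevant memories:\n${context}\n\nUser query:\n${userQuery}`
    : userQuery;

  // 3. LLM call: any provider; the memory layer does not care which.
  const response = await callLlm(prompt);

  // 4. Memory extraction: runs in the background so the reply is not delayed.
  void (async () => {
    for (const fact of await extractFacts(response)) {
      await store.add({ scope: "project", text: fact, embedding: await embed(fact) });
    }
  })();

  return response;
}
```

Because extraction runs as a background task, even a slow or very cheap extraction model adds no latency to the interactive response, and the provider used for the main call is irrelevant to the memory layer.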
Use Cases
- Preference persistence: "I prefer functional style" remembered across all projects and models
- Project context: Architecture decisions stay available even when switching models mid-project
- Cross-session continuity: Resume work without re-explaining context
- Knowledge accumulation: Build up domain expertise over time
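As a rough illustration of how the two scopes could back these use cases, here is a small path-resolution sketch. The helper and the example facts are made up; only the directory layout comes from the proposal above.

```ts
import * as os from "node:os";
import * as path from "node:path";

type MemoryScope = "global" | "project";

// Resolve where an entry would be persisted under the two proposed scopes.
function memoryDir(scope: MemoryScope, projectRoot: string): string {
  return scope === "global"
    ? path.join(os.homedir(), ".opencode", "memory") // preferences, cross-project knowledge
    : path.join(projectRoot, ".opencode", "memory"); // codebase-specific context, decisions
}

// "I prefer functional style" -> global scope: travels across projects and models
console.log(memoryDir("global", "/work/my-app"));  // e.g. /home/me/.opencode/memory
// "This service uses hexagonal architecture" -> project scope: stays with the repo
console.log(memoryDir("project", "/work/my-app")); // /work/my-app/.opencode/memory
```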
Configuration (example)
```toml
# opencode.toml
[memory]
enabled = true
global_enabled = true
project_enabled = true

# Optional: dedicated model for memory operations (cheap/fast)
extraction_model = "openai/gpt-4.1-mini"

# Local embedding model (no API needed)
embedding_model = "Qwen/Qwen3-Embedding-0.6B"
```
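A hypothetical typed counterpart of that table, to show how the options might surface in code (the interface and defaults are illustrative only, not an existing OpenCode type):

```ts
// Mirrors the [memory] table above; not an existing OpenCode type.
interface MemoryConfig {
  enabled: boolean;
  globalEnabled: boolean;
  projectEnabled: boolean;
  /** Optional dedicated model for memory extraction (cheap/fast). */
  extractionModel?: string; // e.g. "openai/gpt-4.1-mini"
  /** Local embedding model; no API key required. */
  embeddingModel: string;   // e.g. "Qwen/Qwen3-Embedding-0.6B"
}

// Possible defaults: memory stays opt-in, both scopes on once enabled.
const defaultMemoryConfig: MemoryConfig = {
  enabled: false,
  globalEnabled: true,
  projectEnabled: true,
  embeddingModel: "Qwen/Qwen3-Embedding-0.6B",
};
```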
Prior Art / Inspiration
- SimpleMem (https://github.com/aiming-lab/SimpleMem) - Efficient lifelong memory for LLM agents using semantic compression
- Mem0 (https://github.com/mem0ai/mem0) - Memory layer for AI applications
Additional Context
This would be especially valuable for users who:
- Switch between providers based on task (coding vs reasoning vs speed)
- Work on multiple projects with different contexts
- Want continuity without provider lock-in