
Feature Request: Persistent Shared Knowledge Base Across Planning and Sub-Agents

Open diegonix opened this issue 2 months ago • 1 comments

Version code_puppy v0.235

Description The session is saved after every interaction, but the planning agent does not use it to resume a plan. After restarting Code Puppy and reloading the session, the planning agent rebuilds the entire plan from scratch instead of continuing from the previous context. It neither records progress nor reuses session history.

The same issue affects the sub-agents it invokes. They do not maintain historical memory, so if a sub-agent was halfway through a complex task (e.g., 70% complete) and its API credits expired or it crashed, all progress is lost. When the planning agent re-invokes that same sub-agent, it behaves as if it were starting from zero.

On long-running workflows, this wastes tokens, repeats work that was already done, and discards partial progress.

Observed Problems

  • Session reload (/autosave_load) restores messages but not internal agent state.
  • The planning agent loses awareness of ongoing or completed tasks.
  • Sub-agents do not share or persist their memory across invocations.
  • Work already performed by sub-agents cannot be reused or correlated later.

Example Scenario

  1. A planning agent creates a multi-step plan and assigns tasks to multiple sub-agents.
  2. A sub-agent (e.g., code generator or tester) completes 70% of its task before failing due to API limits.
  3. The user presses CTRL+C or restarts Code Puppy, then restores the session.
  4. The planner loses the task context and restarts the entire workflow, re-spawning sub-agents from scratch.

Expected Behavior

  • Both the planning agent and all sub-agents should have persistent, synchronized memory across the entire session lifecycle.

  • When reloading with /autosave_load, the entire orchestration context should be restored:

    • Planner’s state (current goal, task queue, progress)
    • Sub-agents’ task histories and partial outputs
  • Sub-agents should be able to continue where they left off, rather than restarting.

Actual Behavior

  • Only chat history is restored — not agent reasoning state or context.
  • Agents lose continuity and duplicate prior work.
  • Token usage and cost increase significantly.

Why It’s Important This is more than a quality-of-life improvement: long, multi-step, or resource-intensive workflows depend on it. A shared persistence mechanism would:

  • Prevent wasting API credits and time.
  • Enable true resumable sessions.
  • Allow complex projects to continue seamlessly even after interruptions.

Proposed Solution (High-Level)

  • Introduce a central session knowledge base (e.g., JSON or SQLite-backed database) shared by the planner and all sub-agents.

  • On save or shutdown:

    • Store each agent’s context, current goal, and partial results.
  • On reload:

    • Restore not only messages but also internal state trees.
  • Optionally add:

    • A /memory sync command to manually persist the current state of all active agents.
    • A /memory status command to list which agents have persistent contexts.

Example Concept

Session Memory Schema
├── planning_agent/
│   ├── current_plan.json
│   ├── active_tasks/
│   └── completed_tasks/
├── sub_agents/
│   ├── code_writer/
│   │   ├── memory.json
│   │   └── output_cache/
│   ├── tester/
│   │   └── memory.json
│   └── documenter/
│       └── memory.json
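
As a rough illustration of the layout above, the following sketch creates the directory skeleton and round-trips one agent's memory.json. All function names here (init_session_store, save_agent_memory, load_agent_memory) are hypothetical and do not exist in Code Puppy; the agent names are taken from the tree.

```python
import json
from pathlib import Path

# Agent names copied from the example tree; everything else is an assumption.
SUB_AGENTS = ("code_writer", "tester", "documenter")


def init_session_store(root: Path) -> None:
    """Create the directory skeleton for one session (idempotent)."""
    planner = root / "planning_agent"
    (planner / "active_tasks").mkdir(parents=True, exist_ok=True)
    (planner / "completed_tasks").mkdir(parents=True, exist_ok=True)
    for name in SUB_AGENTS:
        (root / "sub_agents" / name).mkdir(parents=True, exist_ok=True)
    (root / "sub_agents" / "code_writer" / "output_cache").mkdir(exist_ok=True)


def save_agent_memory(root: Path, agent: str, memory: dict) -> None:
    """Write memory.json via a temp file and rename, so an interrupted
    write does not leave a truncated file behind."""
    target = root / "sub_agents" / agent / "memory.json"
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(json.dumps(memory, indent=2), encoding="utf-8")
    tmp.replace(target)


def load_agent_memory(root: Path, agent: str) -> dict:
    """Return the saved memory, or an empty dict if nothing was saved yet."""
    target = root / "sub_agents" / agent / "memory.json"
    if not target.exists():
        return {}
    return json.loads(target.read_text(encoding="utf-8"))
```

A sub-agent that is re-invoked after a crash could call load_agent_memory first and skip whatever the saved dict marks as done.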

Addendum: Shared Session Memory & Agent Orchestration Requirements

To support true persistence and resumability, the following architectural enhancements should be considered:

  1. Central Session Knowledge Base

    • Maintain a central store (e.g., SQLite or Postgres, optionally paired with a vector store) capable of storing:

      • Session metadata (session_id, user_id, created_at, last_active)
      • Planning agent state (current_plan_id, step_cursor, plan_graph_snapshot)
      • Task records for sub-agents (task_id, parent_plan_id, assigned_agent, status, resume_token, artifact_refs)
      • Agent memory snapshots & caches (agent_id, memory_blob_ref, last_updated)
    • The knowledge base acts as the single source of truth across the entire set of agents (planner + sub-agents) and must be referenced for restoration and continuation.
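
One possible shape for such a store, sketched with Python's built-in sqlite3 module. The table names, column types, and the helper functions are assumptions made for illustration; only the column names come from the field lists above.

```python
import sqlite3

# Hypothetical schema: column names follow the lists above, while the table
# names, types, keys, and defaults are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id  TEXT PRIMARY KEY,
    user_id     TEXT,
    created_at  TEXT NOT NULL,
    last_active TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS planner_state (
    session_id          TEXT PRIMARY KEY REFERENCES sessions(session_id),
    current_plan_id     TEXT NOT NULL,
    step_cursor         INTEGER NOT NULL DEFAULT 0,
    plan_graph_snapshot TEXT NOT NULL   -- JSON-encoded plan graph
);
CREATE TABLE IF NOT EXISTS tasks (
    task_id        TEXT PRIMARY KEY,
    parent_plan_id TEXT NOT NULL,
    assigned_agent TEXT NOT NULL,
    status         TEXT NOT NULL DEFAULT 'pending',
    resume_token   TEXT,
    artifact_refs  TEXT                 -- JSON-encoded list of references
);
CREATE TABLE IF NOT EXISTS agent_memory (
    agent_id        TEXT PRIMARY KEY,
    memory_blob_ref TEXT NOT NULL,
    last_updated    TEXT NOT NULL
);
"""


def open_knowledge_base(path: str) -> sqlite3.Connection:
    """Open (or create) the store and make sure all tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def unfinished_tasks(conn: sqlite3.Connection, plan_id: str) -> list:
    """Return the task_ids of a plan that still need work after a reload."""
    rows = conn.execute(
        "SELECT task_id FROM tasks "
        "WHERE parent_plan_id = ? AND status != 'completed' "
        "ORDER BY task_id",
        (plan_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With this in place, a reloaded planner would query unfinished_tasks instead of regenerating the plan.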

  2. Handoff Protocol Between Agents

    • Define a structured payload whenever the planning agent invokes a sub-agent:

      {
        "session_id": "…",
        "plan_id": "…",
        "task_id": "…",
        "assigned_agent": "code_writer",
        "inputs_ref": ["artifact://…"],
        "memory_delta_ref": "artifact://…",
        "expected_outputs": ["artifact://…"]
      }
      
    • The sub-agent must update its memory snapshot and reference it in the knowledge base before reporting back completion or error.

    • If a sub-agent fails (API error, credit exhaustion, timeout), the failure and its state must be logged and visible to the planning agent, allowing the plan to resume or adjust rather than restart.
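
A minimal sketch of this protocol, assuming the payload fields shown above. The Handoff and TaskResult classes, run_handoff, and the worker and save_snapshot callbacks are all hypothetical names; the point is only that the snapshot is persisted before the result is reported, whether the worker succeeds or fails.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, List, Optional

REQUIRED = {"session_id", "plan_id", "task_id", "assigned_agent"}


@dataclass
class Handoff:
    """Payload the planner sends to a sub-agent (fields from the example above)."""
    session_id: str
    plan_id: str
    task_id: str
    assigned_agent: str
    inputs_ref: List[str] = field(default_factory=list)
    memory_delta_ref: Optional[str] = None
    expected_outputs: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "Handoff":
        data = json.loads(raw)
        missing = REQUIRED - data.keys()
        if missing:
            raise ValueError(f"handoff is missing fields: {sorted(missing)}")
        return cls(**data)


@dataclass
class TaskResult:
    """What the sub-agent reports back; this shape is an assumption."""
    task_id: str
    status: str                      # "completed" or "failed"
    memory_snapshot_ref: str
    error: Optional[str] = None


def run_handoff(
    handoff: Handoff,
    worker: Callable[[Handoff, dict], None],
    save_snapshot: Callable[[str, dict], str],
) -> TaskResult:
    """Run the worker, then persist its memory before reporting anything."""
    memory: dict = {}
    try:
        worker(handoff, memory)      # worker records progress in `memory`
        status, error = "completed", None
    except Exception as exc:         # API error, credit exhaustion, timeout...
        status, error = "failed", f"{type(exc).__name__}: {exc}"
    ref = save_snapshot(handoff.task_id, memory)
    return TaskResult(handoff.task_id, status, ref, error)
```

Because the failure is returned as data instead of being raised, the planner can decide whether to retry the task from the saved snapshot or adjust the plan.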

  3. Session Restore & Autosave Mechanics

    • On /autosave_load, the system should:

      • Rehydrate the planner state (graph + queue + cursor) from the knowledge base.
      • Load each sub-agent’s last known memory snapshot and output cache.
      • Resume execution from the exact step where it left off (rather than constructing a new plan).
    • Autosave should trigger at meaningful checkpoints (e.g., after each task completion or major step transition) so that an interruption costs at most one step of repeated work.
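
The resume behaviour described in this item can be sketched as a cursor that is checkpointed after every completed step. PlannerState, checkpoint, restore, and run_plan are hypothetical names, and a plain dict stands in for the knowledge base; a linear list of steps is assumed instead of a full plan graph.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class PlannerState:
    plan_id: str
    steps: List[str]
    step_cursor: int = 0                      # index of the next step to run
    results: Dict[str, str] = field(default_factory=dict)


def checkpoint(state: PlannerState, store: dict) -> None:
    """Persist the whole planner state; `store` stands in for the knowledge base."""
    store[state.plan_id] = json.dumps(asdict(state))


def restore(plan_id: str, store: dict) -> PlannerState:
    """Rebuild the planner state exactly as it was at the last checkpoint."""
    return PlannerState(**json.loads(store[plan_id]))


def run_plan(state: PlannerState, execute: Callable[[str], str], store: dict) -> None:
    """Run the remaining steps, checkpointing after each completed one."""
    while state.step_cursor < len(state.steps):
        step = state.steps[state.step_cursor]
        state.results[step] = execute(step)   # may raise; cursor is not advanced
        state.step_cursor += 1
        checkpoint(state, store)
```

If execute raises on the third of four steps, the stored cursor still points at that step, so a restored planner reruns only steps three and four.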

  4. Memory Layers & Token Efficiency

    • Maintain two memory levels:

      • Short-term memory: immediate context (recent messages, ongoing task).
      • Long-term memory: distilled facts, decisions, artifacts, outcomes (indexed in a vector DB for quick retrieval).
    • When the token window grows large, summarise older context, store the summary in the long-term store, and reference it by pointer instead of resending the full history on every call.
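
A simplified sketch of the two layers. It makes several assumptions: token counts are approximated by whitespace-separated words, the summariser is a caller-supplied function, and the long-term layer is a plain list where a real system might use a vector DB. LayeredMemory and rough_tokens are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def rough_tokens(text: str) -> int:
    """Crude token estimate (word count); a real tokenizer would differ."""
    return len(text.split())


@dataclass
class LayeredMemory:
    budget: int                                   # max tokens kept short-term
    summarize: Callable[[List[str]], str]         # caller-supplied summariser
    short_term: List[str] = field(default_factory=list)
    long_term: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.short_term.append(message)
        self._compact()

    def _compact(self) -> None:
        """Evict the oldest messages until the budget fits, always keeping
        the newest one, and store a single summary of what was evicted."""
        evicted: List[str] = []
        while (len(self.short_term) > 1
               and sum(map(rough_tokens, self.short_term)) > self.budget):
            evicted.append(self.short_term.pop(0))
        if evicted:
            self.long_term.append(self.summarize(evicted))

    def prompt_context(self) -> List[str]:
        """Pointers to stored summaries, followed by the recent messages."""
        pointers = [f"[summary #{i}]" for i in range(len(self.long_term))]
        return pointers + self.short_term
```

Each call then sends only short pointers plus recent messages; a summary is fetched from the long-term list only when an agent actually needs it.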

  5. Progress & Cost Visibility

    • Track metrics per agent/task: token usage, elapsed time, retries, errors.
    • Expose CLI commands like /memory status or /agent report to monitor “who did what”, “where did we stop”, “what remains”.
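
The per-task metrics could be accumulated along these lines. MetricsTracker, TaskMetrics, and the report format are hypothetical; how a command such as /agent report would actually be wired into Code Puppy is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TaskMetrics:
    tokens: int = 0
    elapsed_s: float = 0.0
    retries: int = 0
    errors: int = 0


class MetricsTracker:
    """Accumulates metrics per (agent, task_id) pair."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], TaskMetrics] = defaultdict(TaskMetrics)

    def record(self, agent: str, task_id: str, *, tokens: int = 0,
               elapsed_s: float = 0.0, retried: bool = False,
               failed: bool = False) -> None:
        """Add one attempt's usage to the running totals for this task."""
        m = self._by_key[(agent, task_id)]
        m.tokens += tokens
        m.elapsed_s += elapsed_s
        m.retries += int(retried)
        m.errors += int(failed)

    def report(self) -> List[str]:
        """One line per task, sorted by agent and then task id."""
        lines = []
        for agent, task_id in sorted(self._by_key):
            m = self._by_key[(agent, task_id)]
            lines.append(
                f"{agent}/{task_id}: tokens={m.tokens} "
                f"elapsed={m.elapsed_s:.1f}s retries={m.retries} errors={m.errors}"
            )
        return lines
```

Persisting these totals alongside the task records would let the report survive a restart as well.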

diegonix, Oct 28 '25 14:10

Acknowledged. Will plan this feature.

mpfaffenberger, Nov 03 '25 11:11