
Context / Knowledge Management

Open jpelletier1 opened this issue 5 months ago • 10 comments

As an engineering leader, I see information/context stored in many different places across my organization that could be used to give AI coding agents additional context to perform tasks.

Example: I may have an internal wiki page that outlines a system architecture; I may have an internal Jira ticket that details a specific user story; I may have a Slack conversation that debates a technical implementation.

While microagents help me store some of this context, I want to figure out better/more efficient ways to have AI agents use context (securely, too) from these various sources without having to copy/paste from multiple places.

jpelletier1 avatar Jul 22 '25 19:07 jpelletier1

I'm not sure why this is a Cloud issue. Surely everyone needs to use knowledge scattered across wikis, Jira, or Slack.

OpenHands already has a “custom secrets” feature, which allows the user to enter credentials for the agent to use. I suppose this feature would reuse those credentials if necessary, or use public information if public.

I’m not sure what is enterprise-y about using credentials to access private information. It’s not?

I suppose Cloud users may have brought this up. Sure, everyone does, but that doesn't make it a feature specific to the Cloud; on the contrary, we have considered it in OpenHands over time. It's also possible that some particular enterprise wants something more custom than general knowledge management, but that still seems to mean that knowledge management belongs in OpenHands.

Custom solutions per enterprise are … probably not even within the SaaS scope; idk, that seems more like on-premise customization.

enyst avatar Jul 24 '25 18:07 enyst

@enyst That's a fair point, and I think this is actually a decent candidate to move to the Open Source project, especially since we need to narrow on the problem we want to solve here. In general, I think the problem is knowledge about an app or project can live in many places, and users want easy ways to onboard that knowledge/context and keep it updated as things change.

jpelletier1 avatar Jul 28 '25 15:07 jpelletier1

Would this not be provided to OpenHands through an MCP server?

bartlettroscoe avatar Jul 28 '25 23:07 bartlettroscoe

And if things are overloaded with MCP functions, check this out https://github.com/RooCodeInc/Roo-Code/discussions/6289#discussioncomment-13928740

Context/knowledge management is likely an MCP-level operation, but I am not even sure OpenHands is ready to allow tools usually reserved for Cursor or Claude Code to be used here. Related notes have been written in MCP repos rather than here: https://github.com/basicmachines-co/basic-memory/discussions/115 https://github.com/eyaltoledano/claude-task-master/discussions/487/

There is so much knowledge to capture:

  • Offline base context documents
  • Online documentation and references for up-to-date work
  • Planning/architecture notes that are dynamically updated
  • Source code docstrings and related works
  • Current issues and debug failure reports
  • Socratic thinking and creative thought notes

BradKML avatar Jul 30 '25 02:07 BradKML

@bartlettroscoe no, the agent manager itself has to perform this function or you run into silent failure modes like those observed in Manus when the KB is full.

erkinalp avatar Aug 18 '25 04:08 erkinalp

Having the agent do the context retrieval with find/grep and lots of LLM calls eats a ton of tokens. I understand why Claude Code does this (because they want you to pay for lots of Claude inference which they make money on), but other agents could make a different choice.

I just hope the AI coding agent community can get a hold on the context problem for huge code bases in a way that does not cost $$$ in LLM tokens just to build a basic context to pass to the LLM to do the actual thing.
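For concreteness, here is a rough sketch of the kind of cheap, LLM-free lexical pre-retrieval I mean: score files by query-term frequency (a grep-like proxy) and hand only the top hits to the model. Everything here is hypothetical and stdlib-only, not an existing agent's implementation:

```python
import re
from collections import Counter
from pathlib import Path


def score_text(text: str, query_terms: set[str]) -> int:
    """Count query-term occurrences; zero LLM calls, zero token cost."""
    words = Counter(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))
    return sum(words[t] for t in query_terms)


def cheap_context(root: Path, query: str, top_k: int = 3) -> list[Path]:
    """Rank files lexically; only the winners go into the LLM's context window."""
    terms = set(query.lower().split())
    scored = [(score_text(p.read_text(errors="ignore"), terms), p)
              for p in root.rglob("*.py")]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for s, p in scored[:top_k] if s > 0]
```

Real systems would add stemming, BM25 weighting, or an index, but even this crude ranking shows the shape of the trade: spend cheap CPU on narrowing, spend expensive tokens only on the survivors.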

bartlettroscoe avatar Aug 18 '25 12:08 bartlettroscoe

@bartlettroscoe they are working on spreading the load between using big models for code + small models for easier tasks. Vector DB + "grep" like tools are a weird game of tradeoffs.

BradKML avatar Aug 18 '25 15:08 BradKML

@bartlettroscoe Manus and Devin employ a dedicated vector database for RAG, constantly monitoring the output stream and inserting the pieces of "knowledge" records into the context when necessary.
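A toy illustration of that pattern (not Manus's or Devin's actual internals, and using bag-of-words cosine similarity in place of a real vector database): watch the output stream and, whenever a chunk is similar enough to a stored record, surface that record for insertion into the context.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words vector; a real system would use learned embeddings."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class KnowledgeWatcher:
    """Monitors the agent's output stream for chunks matching stored knowledge."""

    def __init__(self, records: dict[str, str], threshold: float = 0.2):
        self.records = records
        self.vectors = {k: bow(v) for k, v in records.items()}
        self.threshold = threshold

    def on_output(self, chunk: str) -> list[str]:
        """Return stored records similar enough to this output chunk."""
        q = bow(chunk)
        return [self.records[k] for k, v in self.vectors.items()
                if cosine(q, v) >= self.threshold]
```

The key design point the comment describes is that retrieval is triggered by the agent's own output rather than by an explicit tool call, so the agent never has to know the knowledge base exists.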

erkinalp avatar Aug 18 '25 16:08 erkinalp

I'm using SOTA Graphiti MCP server for OH's memory: https://github.com/getzep/graphiti

kripper avatar Sep 02 '25 14:09 kripper

I ran across OpenMemory/mem0 today: https://github.com/mem0ai/mem0

jpelletier1 avatar Sep 23 '25 19:09 jpelletier1