
[FEATURE]: Support for a local Relevance Index

Open Judahmeek opened this issue 1 month ago • 8 comments

Feature hasn't been suggested before.

  • [x] I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request

Problem Summary (from AI): LLM architecture results in context window data loss because it is bandwidth-limited by the Softmax function. It forces the model to ignore the vast majority of the context to focus on a few spikes. When the context is too long, the "noise" of irrelevant tokens drowns out the "signal" of relevant ones, and the mechanism defaults to looking at the beginning (anchors) and end (immediate context) while losing the middle.

Best Possible Solution that I could think of: We need a data structure that lets AI tell at a glance (i.e. with a single script command) which entities are relevant to modifications of any specific entity.

AI is able to figure this out fairly well for any individual entity, but it's currently restricted to storing that information in the context window, which is seriously lossy due to the attention mechanism of the transformer architecture.

Thus, optimal LLM-assisted programming will require a new data system that supplements source code with relevance weights for the entities that the source code represents. The relevance index wouldn't even have to be exhaustive to make operations that involve a fuzzy search (one of the main things I want AI to be able to do) more comprehensive.

Relevance weights could also remind Agents that a particular entity (some configuration of a transpiler or linter, perhaps) is a Chesterton's Fence, and that changing it will likely require changes to lines A & B in File X, and line C in File Z. Or even something as basic as "This comment above this function has particular relevance to these two lines of logic nested about 25 lines down".
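To make the idea above a bit more concrete, here is a minimal sketch of what such an index might look like. Everything in it is hypothetical: the `Link` and `RelevanceIndex` names, the entity-ID strings, and the 0.0 to 1.0 weight scale are assumptions for illustration, not an existing OpenCode feature.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    target: str       # hypothetical entity ID, e.g. "file_x.py:line_a"
    weight: float     # assumed scale: 0.0 (barely related) to 1.0 (changes together)
    note: str = ""    # free-form reminder for the agent


class RelevanceIndex:
    """Maps an entity to the other entities that matter when it changes."""

    def __init__(self):
        self._links = defaultdict(list)

    def add(self, source, target, weight, note=""):
        self._links[source].append(Link(target, weight, note))

    def related(self, source, min_weight=0.0):
        # .get avoids creating empty entries for unknown entities.
        links = [l for l in self._links.get(source, []) if l.weight >= min_weight]
        return sorted(links, key=lambda l: l.weight, reverse=True)


index = RelevanceIndex()
index.add("linter.config:rule", "file_x.py:line_a", 0.9, "Chesterton's Fence")
index.add("linter.config:rule", "file_x.py:line_b", 0.8)
index.add("linter.config:rule", "file_z.py:line_c", 0.4)

for link in index.related("linter.config:rule", min_weight=0.5):
    print(link.target, link.weight, link.note)
```

A single lookup like `index.related(...)` is the "at a glance" part: the agent asks about one entity and gets back a short, ranked list instead of re-deriving the relationships from the context window.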

Finally, the relevance index needs to be updated somehow upon test failure, because in my experience, test failures are mainly caused by things slipping through the context window's cracks (or by a failure to test changes, either manually or in CI).
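One possible way to do that update, sketched under some assumptions: the index is a plain nested dict (`source -> target -> weight`), and something else has already worked out which entities were changed and which ones the failing tests point at. The function name, the `step` size, and the cap at 1.0 are all made up for illustration.

```python
def record_test_failure(index, changed, failing, step=0.25):
    """Strengthen the link from each changed entity to each failing entity.

    index:   dict mapping source entity -> {target entity: weight}
    changed: entities modified in the change under test (assumed known)
    failing: entities implicated by the failing tests (assumed known)
    """
    for source in changed:
        row = index.setdefault(source, {})
        for target in failing:
            if target == source:
                continue  # an entity is trivially relevant to itself
            # Bump the weight, capped at 1.0.
            row[target] = min(1.0, row.get(target, 0.0) + step)
    return index


index = {}
record_test_failure(index, ["config.py:TIMEOUT"], ["client.py:retry"])
print(index)
```

The point is that each failure leaves a trace outside the context window, so the next agent that touches `config.py:TIMEOUT` can be told about `client.py:retry` up front.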

This data structure would be stored locally and committed to version control as it could possibly grow to match the size of the code base itself.
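As a rough sketch of the "stored locally and committed" part, assuming the index is a plain nested dict and lives in a JSON file at the repository root (the `.relevance-index.json` filename is hypothetical): writing it with sorted keys and fixed indentation keeps version-control diffs small and stable.

```python
import json
from pathlib import Path


def save_index(index, path):
    # sort_keys + indent give deterministic output, so diffs only show real changes.
    text = json.dumps(index, indent=2, sort_keys=True)
    Path(path).write_text(text + "\n", encoding="utf-8")


def load_index(path):
    p = Path(path)
    if not p.exists():
        return {}  # no index yet: start empty rather than fail
    return json.loads(p.read_text(encoding="utf-8"))
```

Usage would be something like `save_index(index, ".relevance-index.json")` after an update and `load_index(".relevance-index.json")` at the start of a session. If the file really does grow toward the size of the code base, splitting it per directory would be a natural next step, but that is beyond this sketch.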

Why OpenCode?: I'd love to see open-source get some bragging rights in the AI war. Also, I'm pretty sure that all the Corps keep their eyes on us.

A little marketing: This isn't the only crazy idea I've had lately. Check out https://github.com/Judahmeek/Significance-Hypothesis-Based-ARC-AGI-2-puzzle-solver for another one.

Judahmeek avatar Dec 02 '25 15:12 Judahmeek

Searching for potential duplicates...

github-actions[bot] avatar Dec 02 '25 15:12 github-actions[bot]

This issue might be a duplicate of existing issues. Please check:

Your "relevance index" proposal could complement these existing discussions, particularly #2108's RAG/repository map concept and #1990's context management controls.

Feel free to ignore if your feature request addresses a specific gap these issues don't cover.

github-actions[bot] avatar Dec 02 '25 15:12 github-actions[bot]

I'd note that https://github.com/sst/opencode/issues/2108 injects information about the codebase into the context window. A relevance index could be accessed through a tool call, almost entirely bypassing the context window, thus risking far less information being lost.

Judahmeek avatar Dec 02 '25 17:12 Judahmeek

Have you considered making a custom-tool that can do this?

It'd be interesting to see how it performs. Right now we are focused on some lower-level things:

  • overall stability (bug fixes)
  • improvements to baseline tooling: better prompts, better grep (looking at things like ast-grep and mgrep)
  • adding highly in demand features

Typically, other people will have a similar idea and comment on this issue here or on Discord, and then it moves higher up in our priority list.

We have some very important things we need to address first, but I like your idea.

rekram1-node avatar Dec 04 '25 16:12 rekram1-node

One reviewer made the suggestion of indexing the abstract syntax tree for relevance, which may simplify things.
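As one possible reading of that suggestion (the reviewer's actual proposal isn't spelled out here), a first pass could walk the AST and record which functions call which. The sketch below uses Python's standard `ast` module on Python source; the `call_edges` name is made up, and a real index would need weights and more than just direct calls.

```python
import ast


def call_edges(source):
    """Map each function name to the sorted names of functions it calls directly.

    Only plain-name calls like `foo()` are recorded; method calls such as
    `obj.foo()` are ignored, and calls inside nested functions are also
    attributed to the enclosing function.
    """
    tree = ast.parse(source)
    edges = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            callees = set()
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    callees.add(child.func.id)
            edges[node.name] = sorted(callees)
    return edges


example = """
def a():
    return b() + c()

def b():
    return 1

def c():
    return b()
"""

print(call_edges(example))
```

Edges like these could seed the relevance index for free, leaving the AI-derived weights for relationships the syntax tree can't see (config options, comments, and so on).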

Slightly off-topic: In the context of agentic execution, the same reviewer suggested making every level of abstraction handled by a different sub-agent, but that suggestion runs into the problem of making levels of abstraction discrete enough to distinguish between.

Judahmeek avatar Dec 07 '25 01:12 Judahmeek