llmap Idea: Introducing Confidence-Based Refinement Filtering

In my current workflow, the final refinement call to DeepSeek is overly aggressive in dropping files, which leads to the exclusion of files that are actually important for my context.

For example, consider this pipeline:

git ls-files \
| grep -E "\.(cs|razor)$" \
| llmap --no-revise "Is this file a Blazor Page? If yes, I want to collect all used Blazor Components AND all injected C# services for this page."

Here, one of the files identified during the analysis is:

FilePages/RevokedPermission.razor

The analysis of this file yielded the following output:

Overall Summary
Yes, the provided file is a Blazor Page. It is identified by the @page directive at the top, which specifies the route for the page (/permission/revokedPermission/{PermissionRequestId:guid}). The page uses Blazor Components and likely relies on injected C# services (though the services are not explicitly visible in the provided code snippet).

...

Conclusion
The file is a Blazor Page that uses several Blazor components and likely relies on injected C# services for localization and data handling. To fully identify all injected services, you would need to inspect the @code block or the associated C# class file (if it exists) for dependency injection (@inject directives).

However, despite its importance, this file is dropped during the refinement step, which is undesirable.

Proposed Solution

To address this, I suggest introducing a confidence voting mechanism (e.g. a score from 1 to 10) in the full source code analysis step. Here's how it could work:

Confidence Scoring: When the LLM analyzes a file, it assigns a confidence score indicating how relevant or important the file is to the context.
- For example, files that are clearly identified as Blazor Pages with components and injected services would receive a higher confidence score.
Threshold for Refinement: Use the confidence score as an indicator during the refinement step:
- Files with a score above a certain threshold (which could be configurable via a CLI parameter) are retained automatically.
- Files below the threshold can either be dropped automatically or sent to the refinement step for further validation.
Direct Filtering: By leveraging this confidence score, we could optionally bypass the refinement LLM entirely for lower-confidence files, streamlining the process and reducing unnecessary refinement calls.
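
A minimal sketch of the routing described above, not llmap's actual implementation. It assumes the analysis prompt is changed so that the model ends each per-file answer with a line such as "Confidence: 8". That line format, the Analysis type, and the names parse_confidence, route, keep_threshold and drop_low are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical output convention: the per-file analysis ends with "Confidence: N".
CONFIDENCE_RE = re.compile(
    r"^\s*Confidence:\s*(\d{1,2})\s*$", re.IGNORECASE | re.MULTILINE
)


@dataclass
class Analysis:
    path: str  # file that was analyzed
    text: str  # raw LLM answer for that file


def parse_confidence(text: str) -> Optional[int]:
    """Return the last confidence score in the text, clamped to 1..10, or None."""
    matches = CONFIDENCE_RE.findall(text)
    if not matches:
        return None
    return min(max(int(matches[-1]), 1), 10)


def route(
    analyses: List[Analysis], keep_threshold: int = 7, drop_low: bool = False
) -> Tuple[List[str], List[str], List[str]]:
    """Split files into (kept, to_refine, dropped) based on their score.

    Scores at or above keep_threshold are kept without refinement.
    Lower scores are dropped if drop_low is set, otherwise sent to refinement.
    Answers without a parseable score always go to refinement.
    """
    kept, to_refine, dropped = [], [], []
    for a in analyses:
        score = parse_confidence(a.text)
        if score is None:
            to_refine.append(a.path)
        elif score >= keep_threshold:
            kept.append(a.path)
        elif drop_low:
            dropped.append(a.path)
        else:
            to_refine.append(a.path)
    return kept, to_refine, dropped
```

In this sketch, keep_threshold stands in for the configurable CLI parameter, and drop_low=True corresponds to the "Direct Filtering" option where low-confidence files never reach the refinement LLM. Whether the model's scores are reliable enough for this to work is a separate question that the sketch does not answer.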

Benefits of the Proposal

Improved Precision: Files that are critical for the context are less likely to be dropped accidentally.
Configurable Behavior: A CLI parameter for the confidence threshold allows fine-tuning based on the use case.
Efficiency Gains: By pre-filtering low-confidence files, we can reduce the workload on the refinement LLM, speeding up the pipeline.

What are your thoughts on this approach? It would ensure that files like RevokedPermission.razor—which clearly contain relevant information—are retained for later stages of the analysis.

Jan 25 '25 22:01 lutzleonhardt

Refining is a bit of a weak link for sure (hence the --no-refine option as an escape hatch)

I'm pretty skeptical that we'd get reliable enough scoring for this particular solution to work, but by all means prove me wrong :)

I have medium confidence that we could get it to "good enough" just with better prompting in the refine stage

Jan 26 '25 01:01 jbellis

Thinking about it a bit more, for your use case I would not expect the refine step to add value since you're trying to ask a per-file question. Refine is designed to boil down per-file answers to a per-repo summary. So just use --no-refine if you don't need the "reduce" stage.

Jan 28 '25 01:01 jbellis

Yes, you are right, but without the refinement I have a lot of noisy files. With the refinement I lose some important files...

Jan 28 '25 01:01 lutzleonhardt