Add Support for Entire Codebase Context Without Indexing
Validations
- [ ] I believe this is a way to improve. I'll try to join the Continue Discord for questions
- [ ] I'm not able to find an open issue that requests the same enhancement
Problem
Current LLMs such as Gemini Pro offer a 2M-token context window, which is enough to provide the entire codebase as context for small codebases.
Solution
No response
Hmmm this is an interesting idea. What are the scenarios where you'd prefer to just provide the entire codebase as context rather than using built-in indexing?
Wanting the LLM to understand the whole project at once, rather than snippets here and there.
I developed it. Here is how to use it:
Open `~/.continue/config.ts`
Paste the following:
```typescript
import {
  Config,
  ContextItem,
  ContextProviderExtras,
  CustomContextProvider,
} from "@continuedev/core";
import * as fs from 'fs/promises';
import * as path from 'path';

const MAX_TOTAL_SIZE_MB = 5; // Set a maximum total size for all files

async function listFiles(
  dir: string,
  excludedPaths: string[] = ["/node_modules/", "/.git/", "/.continue/", "/build/", "/dist/"]
): Promise<string[]> {
  let files: string[] = [];
  try {
    const entries = await fs.readdir(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isSymbolicLink()) {
        continue;
      }
      if (excludedPaths.some(excluded => fullPath.includes(excluded))) {
        continue;
      }
      if (entry.isFile()) {
        files.push(fullPath);
      } else if (entry.isDirectory()) {
        files = files.concat(await listFiles(fullPath, excludedPaths));
      }
    }
  } catch (error) {
    console.error(`Error listing files in ${dir}:`, error);
    return [];
  }
  return files;
}

const AllFilesContextProvider: CustomContextProvider = {
  title: "allfiles",
  displayTitle: "All Files (Auto)", // Indicate automatic inclusion
  description: "Automatically includes all files in the workspace (without indexing, BE CAREFUL)",
  type: "normal", // Use "normal" type
  async getContextItems(
    query: string,
    extras: ContextProviderExtras
  ): Promise<ContextItem[]> {
    const { ide } = extras;
    const workspaceDirs = await ide.getWorkspaceDirs();
    let allFiles: string[] = [];
    let totalSize = 0;
    const contextItems: ContextItem[] = [];

    for (const dir of workspaceDirs) {
      try {
        if (await fs.stat(dir)) {
          const files = await listFiles(dir);
          allFiles = allFiles.concat(files);
        }
      } catch (error) {
        console.error(`Error accessing workspace directory ${dir}:`, error);
      }
    }

    for (const file of allFiles) {
      try {
        const stats = await fs.stat(file);
        totalSize += stats.size;
        if (totalSize / (1024 * 1024) > MAX_TOTAL_SIZE_MB) {
          console.warn(`Max size of ${MAX_TOTAL_SIZE_MB}MB reached, stopping file inclusion.`);
          break; // Stop if we exceed the size limit
        }
        const content = await extras.ide.readFile(file);
        contextItems.push({
          name: path.basename(file),
          description: file,
          content,
        });
      } catch(error) {
        console.error(`Error reading file ${file}:`, error);
        contextItems.push({
          name: path.basename(file),
          description: `Error reading file: ${file}`,
          content: "",
        });
      }
    }
    return contextItems;
  },
};

export function modifyConfig(config: Config): Config {
  if (!config.contextProviders) {
    config.contextProviders = [];
  }
  config.contextProviders.push(AllFilesContextProvider);
  // You don't strictly need @tree anymore, but it can still be helpful
  if (!config.contextProviders.find((p) => p.title === "tree")) {
    config.contextProviders.push({ title: "tree" });
  }
  return config;
}
```
Then go to your chat and you can use `@allfiles` just like other context providers.
However, please note that if you have a large codebase, it will be slow and use an insanely high number of input tokens.
Use it at your own risk.
I and the Continue team ARE NOT in any way responsible for any large bill you end up with.
This is just a small script that I wrote to help you.
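To get a feel for the token cost before using `@allfiles`, here is a rough sketch that turns a byte count into a token estimate. The 4-characters-per-token ratio is a common rule of thumb, not a property of any particular model's tokenizer, and `estimateTokens` and `fitsContextWindow` are hypothetical helper names that are not part of Continue.

```typescript
// Assumption: roughly 4 characters (bytes of mostly-ASCII source) per token.
// Real tokenizers vary by model and by programming language, so treat the
// result as an order-of-magnitude guess only.
const CHARS_PER_TOKEN = 4;

// Estimate how many input tokens a given number of bytes would use.
function estimateTokens(totalBytes: number): number {
  return Math.ceil(totalBytes / CHARS_PER_TOKEN);
}

// Check whether the estimate fits inside a model's context window.
function fitsContextWindow(totalBytes: number, contextWindowTokens: number): boolean {
  return estimateTokens(totalBytes) <= contextWindowTokens;
}

const fiveMb = 5 * 1024 * 1024; // same cap as MAX_TOTAL_SIZE_MB in the script
console.log(estimateTokens(fiveMb));             // 1310720
console.log(fitsContextWindow(fiveMb, 2000000)); // true
console.log(fitsContextWindow(fiveMb, 128000));  // false
```

Under this assumption, the script's 5 MB cap is about 1.3M tokens: within a 2M-token window, but far beyond a 128k one, and every request that includes `@allfiles` pays for those tokens again.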
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.
I'm still interested
can you write this PR?
If not this, it may be better to allow force-adding a folder recursively. That would cover what the author is asking for, since selecting the 'java' folder or the project folder would add everything. Sometimes RAG doesn't know a file is relevant and you do. I honestly don't understand why this functionality isn't part of every plugin that supports local LLMs. If you use hosted providers I get it: you pay for the input tokens. But with local LLMs you can stop compromising, at least on this front.
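The folder-scoped variant could be sketched as a filter over the file list that the script's `listFiles` already produces. `filterByFolder` and `normalizePath` are hypothetical helpers, not Continue APIs, and the sketch assumes plain file paths (backslashes are normalized to forward slashes).

```typescript
// Hypothetical helper: normalize separators and drop trailing slashes
// so that prefix comparison behaves the same on Windows-style paths.
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/\/+$/, "");
}

// Keep only the files that live under `folder`, at any depth.
// The trailing "/" on the prefix stops "src/java" from matching "src/javascript".
function filterByFolder(files: string[], folder: string): string[] {
  const prefix = normalizePath(folder) + "/";
  return files.filter((f) => normalizePath(f).startsWith(prefix));
}

const exampleFiles = [
  "/repo/src/java/App.java",
  "/repo/src/java/util/Strings.java",
  "/repo/src/javascript/index.js",
  "/repo/README.md",
];

// Keeps the two files under /repo/src/java and drops the rest.
console.log(filterByFolder(exampleFiles, "/repo/src/java"));
```

In the script above, a filter like this could be applied to `allFiles` before the read loop. How the folder name would reach the provider (for example through the `query` argument) is left open here.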
Have any of you considered MCP servers like Repomix to do this? When I'm working on small repos I usually just use the Repomix CLI directly to get the codebase as a single file and then provide that file as context.
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.
This issue was closed because it wasn't updated for 10 days after being marked stale. If it's still important, please reopen + comment and we'll gladly take another look!