@Codebase doesn't include relevant files
Before submitting your bug report
- [x] I've tried using the "Ask AI" feature on the Continue docs site to see if the docs have an answer
- [x] I believe this is a bug. I'll try to join the Continue Discord for questions
- [ ] I'm not able to find an open issue that reports the same bug
- [x] I've seen the troubleshooting guide on the Continue Docs
Relevant environment info
- OS: macOS 15.1 and Arch Linux
- Continue version: latest + pre-release
- IDE version: VS Code 1.103
- Model: Qwen3-32b-A3B GGUF
- config:

```yaml
name: Local Assistant
version: 1.0.0
schema: v1
models:
  - name: Qwen3 32b
    provider: llama.cpp
    model: qwen3-coder-32b
    apiBase: http://192.168.2.61:5000
    roles:
      - chat
      - edit
      - apply
      - embed
codebase:
  rootDir: ${workspaceFolder}
  ignoreGlobs:
    - "**/node_modules/**"
    - "**/.git/**"
    - "**/dist/**"
    - "**/__pycache__/**"
context:
  - provider: open
  - provider: code
  - provider: docs
  - provider: diff
  - provider: file
  - provider: terminal
  - provider: problems
  - provider: folder
  - provider: codebase
  - provider: currentFile
```
Description
I am having a problem with @Codebase, both on macOS and on the Linux machine that runs the LLM. Every time I add @Codebase to the prompt, it only fetches some of the files. Even when I specifically name a file, it pulls the same files. In my scenario I have three files: index.html, styles.css, and scripts.js. When @Codebase is used in the prompt, only the HTML and JS files are included as context, never the CSS file.
I've checked the index.sqlite file with DB Browser for SQLite, and the tables clearly show that all the files are in the codebase index. I've tried re-indexing at every step, different versions of Continue, deleting the index file and rebuilding it from scratch, making a rule to automatically include all files, and reinstalling the extension/VS Code/OS, but none of it seems to work. For every prompt I have to manually insert each file that is relevant to my problem, but sometimes I don't know what is causing the problem, so I don't know which files to insert.
I am starting to think it is caused by the API of my llama.cpp server, or perhaps by my model's lack of embeddings. I just find it strange that it is able to pull the HTML and JS files but not the CSS file. Is it named wrong? Should I use .CSS instead of .css? Are there any quirks I am overlooking?
I've even checked whether VS Code's own indexing was excluding .css, but it does include it.
I've found multiple closed issues here that describe the same problem. I've tried their solutions too, but to no avail.
To reproduce
No response
Log output
I've found out some new things.
If I rename styles.css to layout, it gets indexed and sent along with the prompt.
If I rename it from layout to layout.style, it gets indexed and sent along with the prompt.
If I rename it back from layout.style to layout.css, it gets left out again...
It really doesn't seem to like .css.
The logs don't show anything useful that could be causing the issue either. Any recommendations?
Have you tried an actual embedding model, rather than using your chat/edit/apply model as the embedder?
The embedding model is what builds the codebase index, and that index is what gets queried when you use the @Codebase context.
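For reference, a rough sketch of what that split could look like in config.yaml: keep the chat model, but move the embed role to a dedicated embedding model. The embedding model name and the second apiBase below are placeholders, and I'm not certain the llama.cpp provider accepts the embed role, so adjust to whatever your setup actually serves:

```yaml
models:
  - name: Qwen3 32b
    provider: llama.cpp
    model: qwen3-coder-32b
    apiBase: http://192.168.2.61:5000
    roles:                    # chat model no longer handles embeddings
      - chat
      - edit
      - apply
  - name: Embedder (placeholder)
    provider: llama.cpp            # assumption: this provider supports the embed role
    model: nomic-embed-text        # placeholder embedding model
    apiBase: http://192.168.2.61:5001   # hypothetical second server instance for embeddings
    roles:
      - embed
```

You'd likely need to re-index the codebase afterwards so the new embedder is actually used.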
Yes, they have the same problem. I switched to StarCoder2, the Qwen3 8B embedding and reranking models, Qwen2.5 Coder 30B, and even tried the GLM-4 model, but that one is just way too heavy for my RTX 3070.
However, today, after switching back to my Qwen3 model, it seems to work with @Codebase sometimes (about 2 out of 5 times). I am puzzled about what is causing this, as I really think it is the model, but who knows what kind of black-box magic is going wrong in there. Maybe it can make the connection after I spam @styles.css enough times.
I guess it just needs a few tries. If you chat with it enough and don't delete the chat history, I think it can figure it out by itself.
Please keep this issue open, as I am still investigating the llama.cpp side for anything that stands out that could cause this. The verbose logs from Continue look fine to me, so it is either the model, llama.cpp, or the OpenAI-compatible communication with Continue. The behavior is unpredictable, yet the logs look the same either way. Perhaps the reasoning is going astray?
I will be testing different llama.cpp flags/parameters on different models to compare behavior, and I will report back here in a couple of days.
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.