Error throw when using @codebase in the prompt: Cannot read properties of undefined (reading 'sort')
Before submitting your bug report
- [X] I believe this is a bug. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that reports the same bug
- [X] I've seen the troubleshooting guide on the Continue Docs
Relevant environment info
- OS: MacOS 12.6
- Continue: v0.8.43
- IDE: VSCode 1.91.1 (Universal)
- Model: Any
- config.json:
{
  "models": [...],
  ...,
  "embeddingsProvider": {
    ... // Any provider I tried: transformers.js, ollama, and openai (voyage)
  },
  "reranker": {
    "name": "voyage",
    "params": {
      "model": "rerank-1",
      "contextLength": 8000,
      "apiKey": "abcd"
    }
  },
  "contextProviders": [
    ...,
    {
      "name": "codebase",
      "params": {
        "nRetrieve": 25,
        "nFinal": 5,
        "useReranking": true
      }
    }
  ]
}
Description
I'm evaluating different embeddings models and configs. I had no embeddings issues before, but since yesterday I've been getting this error. I tried changing settings, switching providers, and reverting to the old configuration, but it no longer works.
I haven't included a specific embeddings config above because I tried every provider and none of them work anymore, even though they all worked before. Let me know if you still need one of my examples.
To reproduce
- Update the embeddings config.
- In the Continue chat, click the green dot beside the model dropdown to force a reindex.
- Clear the Continue log in the VSCode OUTPUT tab.
- Start a Continue chat with "@codebase" - the VSCode notification and DevTools Console produce the error above.
- Observe the Continue log: the relevant context is never added to the LLM prompt.
- The LLM gives a general answer, as it doesn't know about the code.
- Restart VSCode and repeat, trying different embeddings models, providers, and configs. Same result.
Log output
==========================================================================
==========================================================================
Settings:
contextLength: 4096
model: meta-llama/llama-3.1-8b-instruct:free
maxTokens: 1024
log: undefined
############################################
<user>
where is the relevant code and files that set the theming for this website?
==========================================================================
==========================================================================
Completion:
Unfortunately, I don't have the capability to browse the internet or access any specific website's code or files. However, I can give you some general information about where to look for theming-related code on a website.
Most modern websites use a combination of HTML, CSS, and JavaScript to display their content and apply their theme. Here are some places you might look to find the relevant code and files:
...
Keep in mind that the location and naming conventions may vary depending on the website's architecture and technology stack. To find the relevant code and files, you can use a combination of search engines, developer tools, and DNS reconnaissance techniques.
EDIT: Before this happened, I know my embeddings were working fine; the log showed relevant context from my codebase being produced.
This is the notification.
I tried removing ~/.continue/index in case it got corrupted, but even with codebase retrieval running on a fresh index cache, the error still appears.
I tried a few projects (web and Python) in case it was project-related; still not working. However, it works fine in a new project with a single file.
So maybe there's a per-project cache somewhere, but I'm not aware of anywhere other than ~/.continue/index/lancedb/{project path}..., and I already cleared that for a fresh index.
Hopefully someone can tell me where else I can clear the cache.
What happens if you remove the re-ranker definition from your config.json file?
You are right. I removed it and it works fine in my project.
Is there any way to get the logs or investigate further? So far the embeddings results contain some irrelevant info, and some LLMs can't answer correctly.
See here for logs etc -> https://docs.continue.dev/troubleshooting#llm-prompt-logs
That log doesn't include anything reranker-related. Continue only logs the i/o for the LLM; it doesn't log the i/o for the embeddings and reranker models before the prompt is sent to the LLM, as in my log above.
I mean I can't use the reranker for my projects and am not sure how to investigate further without any logs.
... but wait, I get it now: the reranker returned an unexpected result. I'll wait a while in case I hit the rate limit.
It turns out I hit the TPM rate limit, roughly the embeddings chunk size times the codebase nRetrieve param. I can avoid the error by lowering nRetrieve; I'm not sure if the chunk size can be adjusted.
So maybe an improvement could be made to catch and log unexpected responses. I never expected this was due to an invalid or rate-limited response.
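To illustrate the failure mode, here is a minimal TypeScript sketch of how an unguarded reranker response produces this exact error. `callReranker`, its response shape, and the `Chunk` type are assumptions for illustration, not Continue's actual code:

```typescript
interface Chunk {
  content: string;
  score?: number;
}

// Stand-in for the real HTTP call to the reranker. A rate-limited or
// error response may carry no scores at all, modeled here as `undefined`.
async function callReranker(
  query: string,
  chunks: Chunk[],
): Promise<number[] | undefined> {
  return undefined; // simulate a rate-limited / invalid response
}

async function rerank(query: string, chunks: Chunk[]): Promise<Chunk[]> {
  const scores = await callReranker(query, chunks);
  if (!Array.isArray(scores)) {
    // Without this guard, calling `.sort` on the missing scores throws
    // "Cannot read properties of undefined (reading 'sort')".
    console.warn(
      "Reranker returned no scores (rate limit or API error); skipping rerank.",
    );
    return chunks;
  }
  return chunks
    .map((c, i) => ({ ...c, score: scores[i] }))
    .sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}
```

With a guard like this, a rate-limited reranker degrades gracefully to the unranked results and logs the cause instead of crashing.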
I'm running into this same issue. It's because core/context/retrieval/pipelines/RerankerRetrievalPipeline.ts is sending too much data to the reranker in one batch:
"Request to model 'rerank-1' failed. The max allowed tokens per submitted batch is 100000. Your batch has 145736 tokens after truncation. Please lower the number of tokens in the batch."
This is with Voyage's rerank-1 model, but not with rerank-lite-1, since it has a higher token limit:
"What is the total number of tokens for the rerankers? We define the total number of tokens as the “(number of query tokens × the number of documents) + sum of the number of tokens in all documents". This cannot exceed 300K for rerank-lite-1 and 100K for rerank-1. However, if you are latency-sensitive, we recommend no more than 200K total tokens per request for rerank-lite-1."
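The quoted formula can be checked client-side before submitting a batch. A rough sketch (the token counts below are illustrative estimates, not real tokenizer output):

```typescript
// Voyage's documented batch total:
// (query tokens × number of documents) + sum of tokens in all documents.
function rerankBatchTokens(queryTokens: number, docTokens: number[]): number {
  const sumDocs = docTokens.reduce((a, b) => a + b, 0);
  return queryTokens * docTokens.length + sumDocs;
}

// e.g. nRetrieve = 25 chunks of ~5,000 tokens each plus a 40-token query:
const total = rerankBatchTokens(40, Array(25).fill(5000));
console.log(total); // 126000 - over rerank-1's 100K limit, under rerank-lite-1's 300K
```

This shows why the same config can fail on rerank-1 while still working on rerank-lite-1.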
RerankerRetrievalPipeline pushes quite a bit of data:
retrievalResults.push(
...recentlyEditedFilesChunks,
...ftsChunks,
...embeddingsChunks,
...repoMapChunks,
);
So it's not even the embeddings themselves that fill up the batch. I guess we need some way to limit the search results we send to the reranker.
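One possible mitigation, sketched here as an assumption rather than an actual patch to RerankerRetrievalPipeline.ts: cap the combined retrieval results to a token budget before the rerank call.

```typescript
interface Chunk {
  content: string;
}

// Very rough token estimate (~4 chars per token); a real implementation
// would use the reranker's own tokenizer.
const estimateTokens = (c: Chunk): number => Math.ceil(c.content.length / 4);

// Keep chunks in their existing priority order until adding the next
// one would exceed the budget, then stop.
function capToTokenBudget(chunks: Chunk[], budget: number): Chunk[] {
  const kept: Chunk[] = [];
  let used = 0;
  for (const c of chunks) {
    const t = estimateTokens(c);
    if (used + t > budget) break;
    kept.push(c);
    used += t;
  }
  return kept;
}
```

Applied just before the rerank call, e.g. `capToTokenBudget(retrievalResults, 90_000)`, this would leave headroom under rerank-1's 100K limit regardless of how many chunks the four retrieval sources contribute.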
I am getting the same issue - do I understand correctly that the solution is removing the reranker from the config.json file? This is what I currently have:
"reranker": {
"name": "voyage",
"params": {
"apiKey": "pa-xxx"
}
},
Hi all, thanks for all the details here. Sharing another example of a user running into this issue: https://discord.com/channels/1108621136150929458/1108621136830398496/1296138969288937522
Will circle back to this thread when we address the issue.
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.
SAME ISSUE...
I also have this issue, but with a slightly different message:
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.
This issue was closed because it wasn't updated for 10 days after being marked stale. If it's still important, please reopen + comment and we'll gladly take another look!