claude-3.5-sonnet returns 500 on some special unicode characters
When I set model = "claude-3.5-sonnet" and either select my whole buffer or set context = "buffers", then ask Copilot something with Claude as the selected model, this error appears:
Failed to parse response: "Expected value but found invalid token at character 1"
Internal Server Error
Debug logs:
[CopilotChat.nvim] [DEBUG 00:41:04]
~/.local/share/nvim/lazy/CopilotChat.nvim/lua/CopilotChat/context.lua:239: Got 1 embeddings
[CopilotChat.nvim] [DEBUG 00:41:05]
~/.local/share/nvim/lazy/CopilotChat.nvim/lua/CopilotChat/copilot.lua:499: Temperature: 0.1
[CopilotChat.nvim] [DEBUG 00:41:08]
~/.local/share/nvim/lazy/CopilotChat.nvim/lua/CopilotChat/copilot.lua:513: Tokenizer: o200k_base
[CopilotChat.nvim] [INFO 00:41:08]
~/.local/share/nvim/lazy/CopilotChat.nvim/lua/CopilotChat/copilot.lua:468: Claude enabled
[CopilotChat.nvim]
Internal Server Error
Here is my current CopilotChat config:
{
  "CopilotC-Nvim/CopilotChat.nvim",
  event = "BufReadPre",
  branch = "canary",
  dependencies = {
    { "github/copilot.vim" },
    { "nvim-lua/plenary.nvim" },
  },
  build = "make tiktoken",
  opts = {
    debug = true,
    window = {
      layout = "float",
      border = "rounded",
      title = "Copilot Chat",
    },
    model = "claude-3.5-sonnet",
    question_header = " User ",
    answer_header = " Copilot ",
    error_header = " Error ",
    auto_insert_mode = true,
    insert_at_end = true,
    context = "buffers",
    highlight_selection = false,
  },
  config = function(_, opts)
    local chat = require("CopilotChat")
    chat.setup(opts)
    local wk = require("which-key")
    wk.add({
      {
        group = "Copilot",
        "<leader>g",
        { "<leader>cc", chat.toggle, mode = { "n", "v" }, desc = "Toggle Copilot Chat", icon = { icon = "", color = "white" } },
        { "<leader>ca", ask_copilot, mode = { "n", "v" }, desc = "Ask Copilot", icon = { icon = "", color = "white" } },
        icon = { icon = "", color = "white" },
      },
    })
  end,
},
https://github.com/settings/copilot
Can you check whether you have access to Claude here? Also check whether it's listed in :CopilotChatModels.
Yes, I have access to Claude, and it is listed when I run :CopilotChatModels.
Hey, I actually know what is going on here! I was about to open an issue last week, but after several hours I figured out the cause. Claude does not seem to handle some Unicode characters properly. For me it was a couple of specific Nerd Font icons in the active selection that gets sent with the prompt. It only happens with Claude; the OpenAI models seem to handle them fine.
To confirm, @quentin-lpr: when you had the problem sending a @buffer question, did you have any icons / emojis in the buffer? The following are the ones I have identified as problems so far. I ended up writing a client-side "scrubber" in my config to replace these specific icons before sending to Claude.
This was from the default noice.nvim config...
input = { view = 'cmdline_input', icon = ' ' }, -- Used by input()
This was from an icon I was using for mini.indentscope...
symbol = '',
Here is what the actual icons look like in neovim...
@deathbeam I am not sure what the solution here is, other than letting the CopilotChat config take an option that provides a table of find / replace strings that run on the API body before sending, so that you can add problematic characters as you find them. Removing / replacing ALL emojis / icons programmatically does not seem ideal, but it is the only other solution I can think of 🤷♂️
For now this is all I am doing on my side: scrubbing the visual selection (or @buffer) before it goes into the prompt/body sent to the API:
---Remove bad characters like nerd font emojis / icons
---@param input_str string
local replace_bad_characters = function(input_str)
  local char_mappings = {
    -- Default replacement character
    default_char = '�',
    ---@alias CharToReplace string Character(s) to replace
    ---@alias ReplacementChar string Replacement character (nil will use default_char)
    ---@type table<CharToReplace, ReplacementChar|nil>
    replacement_map = {
      [''] = 'default_char', -- From Noice.nvim cmdline input icon
      [''] = 'default_char', -- From mini.indentscope symbol
    },
  }
  local scrubbed_str = input_str
  for bad_char, replacement in pairs(char_mappings.replacement_map) do
    local replacement_char = replacement == 'default_char' and char_mappings.default_char or replacement
    scrubbed_str = string.gsub(scrubbed_str, bad_char, replacement_char)
  end
  return scrubbed_str
end
@deathbeam another option that fits a little better with the current config is something similar to the selection function you can already provide (below):
-- default selection (visual or line)
selection = function(source)
  return select.visual(source) or select.line(source)
end,
What if you provided a replace(content) or body(content) option (I am not sure what to name it) that runs on ALL content that makes up the body? The tricky part is that you would either have to run this function generically against each part (the system prompt, the system message the active selection gets sent as, and the user message), OR provide a table like the following where you can define each one. I probably prefer the table, just to keep it as flexible as possible; if we are already adding something, we may as well do it "right".
replace = {
  prompt = function(content): string,
  selection = function(content): string,
  message = function(content): string,
}
Another thing to remember is that the current selection function you can provide to the config would still be completely different from this, because it is only used for the default selection when one is not provided.
Thoughts?
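To make the table idea a bit more concrete, here is a minimal sketch of how such per-part hooks could be applied before a request is built. Everything in it is hypothetical: `apply_replace`, the `replace` option, and the `prompt` / `selection` / `message` part names only mirror the proposal above and are not part of CopilotChat.nvim.

```lua
-- Hypothetical helper: run an optional scrub function over each named part
-- of the request. Parts without a hook are passed through unchanged.
local function apply_replace(replace, parts)
  local out = {}
  for name, text in pairs(parts) do
    local fn = replace and replace[name]
    -- Fall back to the original text if there is no hook or it returns nil.
    out[name] = fn and fn(text) or text
  end
  return out
end

-- Example: only the selection is scrubbed; prompt and message are untouched.
local scrubbed = apply_replace({
  selection = function(content)
    return (content:gsub("x", "y"))
  end,
}, {
  prompt = "system prompt",
  selection = "s x",
  message = "user message",
})
```

Keeping one function per part means a user could scrub only the selection (where the icons usually come from) and leave the prompt alone.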
Hmm, that's interesting. I wonder if there is more stuff like this, because I noticed that sometimes Claude just does not respond at all and I'm stuck waiting. I'm not sure we need to provide custom replacement functions; I think simply stripping all non-standard Unicode characters sounds fine. I don't think they help the chat context anyway, so losing them automatically would not matter.
I opened #467, which sanitizes the request when sending, so hopefully it will help.
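For anyone who wants to try blanket stripping in their own config, here is a rough sketch. It is not what #467 does. It assumes the troublesome glyphs are Nerd Font icons, which live in the Unicode Private Use Areas; the thread has only confirmed two specific icons, so that is a guess. The names `codepoint`, `is_private_use`, and `strip_private_use` are made up for this example, and the decoding is done by hand so it does not depend on the `utf8` library.

```lua
-- Decode one UTF-8 sequence (1-4 bytes) to its code point.
-- Returns nil for malformed sequences so callers can leave them alone.
local function codepoint(ch)
  local n = #ch
  local b1, b2, b3, b4 = ch:byte(1, -1)
  if n == 1 and b1 < 0x80 then
    return b1
  elseif n == 2 and b1 >= 0xC0 and b1 < 0xE0 then
    return (b1 - 0xC0) * 0x40 + (b2 - 0x80)
  elseif n == 3 and b1 >= 0xE0 and b1 < 0xF0 then
    return (b1 - 0xE0) * 0x1000 + (b2 - 0x80) * 0x40 + (b3 - 0x80)
  elseif n == 4 and b1 >= 0xF0 and b1 < 0xF8 then
    return (b1 - 0xF0) * 0x40000 + (b2 - 0x80) * 0x1000
      + (b3 - 0x80) * 0x40 + (b4 - 0x80)
  end
  return nil
end

-- True for the BMP Private Use Area and the two supplementary ones.
local function is_private_use(cp)
  return (cp >= 0xE000 and cp <= 0xF8FF)
    or (cp >= 0xF0000 and cp <= 0xFFFFD)
    or (cp >= 0x100000 and cp <= 0x10FFFD)
end

-- Replace every private-use character in `text` with `replacement`
-- (default: remove it). Everything else, including emoji, is kept.
local function strip_private_use(text, replacement)
  replacement = replacement or ""
  -- A lead byte followed by any continuation bytes is one character.
  return (text:gsub("[^\128-\191][\128-\191]*", function(ch)
    local cp = codepoint(ch)
    if cp and is_private_use(cp) then
      return replacement
    end
    return nil -- nil keeps the original match
  end))
end
```

This deliberately leaves ordinary non-ASCII text and emoji alone, since stripping those is more likely to have side effects on real content.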
No, the missing responses are a different issue, which I also experience with Claude! I have tested extensively and the two are independent of each other. I had the same hunch as you until I dug in and "solved" the icon / emoji issue (which is where the Internal Server Error comes from). I tried for a while to figure out the random Claude API "drops", but they seem to be just random: the API / curl call sometimes gets lost in the abyss somehow 🤷 (frustrating though).
Yeah, it's super annoying. I tried to figure out how to catch that issue as well so it doesn't just get "stuck", but no idea. I think the GitHub API just decides not to respond at all sometimes, maybe because it's still in beta, idk. I did not experience that at all with other models.
EDIT: I wonder if it's reproducible in VS Code? Or maybe we are missing some header?
One thing that would greatly reduce the frustration would be a command / mechanism to just re-send the last message (including prompt, history etc.), more or less a "retry" option. The annoying part is that when I do a custom prompt, or write something out via vim.ui.input with a specific selection, and it hangs / doesn't respond, I more or less have to do it all over again. Often I can just copy from the displayed user prompt in the chat window, but with my workflow that is sometimes a little annoying. A simple "retry" command that always keeps the last API request (curl / chat:ask or whatever) "cached" and tries it again would basically eliminate the annoyance for me.
Thoughts?
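A user-side version of that "retry" idea could be sketched roughly like this. It is only an illustration: `with_retry` is a made-up helper that wraps whatever function you use to send a question and remembers the last arguments. It does not touch the plugin's internals and does not cancel a request that is still hanging.

```lua
local unpack = table.unpack or unpack -- Lua 5.1 / LuaJIT vs. 5.2+

-- Hypothetical wrapper: remembers the arguments of the last call to `ask`
-- so the same request can be sent again.
local function with_retry(ask)
  local last
  return {
    ask = function(...)
      last = { n = select("#", ...), ... }
      return ask(...)
    end,
    retry = function()
      if not last then
        return nil, "nothing to retry"
      end
      return ask(unpack(last, 1, last.n))
    end,
  }
end
```

In a config this could wrap the chat ask function, for example `local c = with_retry(require("CopilotChat").ask)`, then call `c.ask(prompt, opts)` and later `c.retry()`.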
Well, you can just do :CopilotChatStop and send again, and that should work fine for that purpose (or that's what I do at least). The history is not erased on stop, so it will just work with that as well.
"can you see if you have any icons / emojis in the buffer?"
Well, I just checked and you are right: this was the cause of the Internal Server Error, because I was using the same code as provided above for the context (I was using Nerd Font icons for the which-key keymaps). Thanks! :)
I also tried implementing your suggested solution, but I couldn't get it to work, so I'll try to troubleshoot it again tomorrow.
EDIT: or I might just wait for the fix
Same error here.
EDIT: it works now with the same content, so it looks like it was caused by an Internal Server Error on Claude's side.
Reverted the change fixing this, as it was causing side effects. Maybe it's better to just mark this as an upstream issue for now; hopefully GitHub fixes it before Claude goes out of beta. Or someone can check what VS Code is doing, or whether this issue also occurs there.
VS Code is also failing on these special chars, so it's probably best to wait for a solution upstream.
@deathbeam It is an old problem that happened to me today. Do you know how I can solve it?
The only real way is to not give the context any files that contain Unicode characters that trip up Claude.
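To find out which files are candidates, a small check like the one below can help. It is a sketch, not a definitive test: `find_non_ascii` is a made-up name, and it flags every line containing a non-ASCII byte, which is a much wider net than the characters actually known to cause the error.

```lua
-- Return the line number and byte column of the first non-ASCII byte
-- on each line that has one.
local function find_non_ascii(lines)
  local hits = {}
  for lnum, line in ipairs(lines) do
    local col = line:find("[\128-\255]")
    if col then
      hits[#hits + 1] = { lnum = lnum, col = col }
    end
  end
  return hits
end
```

Inside Neovim the current buffer can be checked with `find_non_ascii(vim.api.nvim_buf_get_lines(0, 0, -1, false))`.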
I will just close this. If newer Claude models still have this issue, oh well, but I don't think there is much we can do about it here without accidentally breaking other functionality (like what happened before, when it broke for some languages).