Files with UTF-8 BOM are not handled correctly
Check for existing issues
- [X] Completed
Describe the bug / provide steps to reproduce it
Files that start with the UTF-8 byte order mark (EFBBBF) are not interpreted correctly. This can have different effects depending on the type of file. In the case of a JSON file, Zed shows an error on the first position stating "Expected a JSON object, array or literal". In the case of a C# file, syntax highlighting is messed up on the first line. In all cases, editing the line causes strange behavior where the characters inserted are in a different position than the cursor.
I don't remember noticing anything like this before updating Zed this morning. I believe my prior version was 142.4.
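For anyone reproducing this outside the editor, here is a minimal Python sketch (not Zed's code) of what the report describes: the bytes EF BB BF decode to the single character U+FEFF, and a JSON parser that meets that character at position 0 rejects the input. The helper name `decode_and_note_bom` is made up for illustration.

```python
import codecs
import json
from typing import Tuple


def decode_and_note_bom(data: bytes) -> Tuple[str, bool]:
    """Decode UTF-8 bytes, dropping a leading BOM and reporting whether one was present."""
    had_bom = data.startswith(codecs.BOM_UTF8)  # b"\xef\xbb\xbf"
    # "utf-8-sig" skips a leading BOM if there is one; otherwise it behaves like "utf-8".
    return data.decode("utf-8-sig"), had_bom


raw = codecs.BOM_UTF8 + b'{"name": "demo"}'

# Naive decoding keeps the BOM as an invisible first character (U+FEFF).
naive = raw.decode("utf-8")

try:
    json.loads(naive)
    naive_parses = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    naive_parses = False

# BOM-aware decoding hands the parser clean text and remembers the BOM for later.
text, had_bom = decode_and_note_bom(raw)
parsed = json.loads(text)
```

Keeping the `had_bom` flag separate from the text is one way an editor could hide the marker from the parser and the cursor logic while still knowing what was on disk.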
Environment
Zed: v0.143.7 (Zed) OS: macOS 14.5.0 Memory: 32 GiB Architecture: x86_64
If applicable, add mockups / screenshots to help explain present your vision of the feature
First line of C# file that starts with BOM:
When BOM removed:
If applicable, attach your Zed.log file to this issue.
No response
Hi there! 👋 We're working to clean up our issue tracker by closing older issues that might not be relevant anymore. If you are able to reproduce this issue in the latest version of Zed, please let us know by commenting on this issue, and we will keep it open. If you can't reproduce it, feel free to close the issue yourself. Otherwise, we'll close it in 7 days. Thanks for your help!
In 0.176.3 (with the csharp extension) the situation seems better - it doesn't throw off tokenization, so highlighting is normal and LSP features work fine. The only remaining quirk I notice is that the BOM is shown as a little half-width marker at the beginning of the file:
I hope my comment will not add noise.
Comparing with VS Code on Linux: the editor produces UTF-8 BOM files only when used with the C# extension. Perhaps this is a requirement for compatibility with Visual Studio (the MS Windows realm) 🤔 Outside of C# projects, VS Code reads BOM files like a charm (the user doesn't notice the BOM), but always writes new files as UTF-8 without a BOM and keeps the BOM marker on existing files.
I wish the BOM didn't exist, but it does, so I think it should be handled more gracefully (VS Code's behaviour does seem sensible).
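The "keep the BOM on existing files, never add one to new files" behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how VS Code or Zed implement it; `encode_for_save` and its `had_bom` parameter are hypothetical names.

```python
import codecs


def encode_for_save(text: str, had_bom: bool) -> bytes:
    """Encode text as UTF-8, re-adding the BOM only if the file originally had one."""
    # Guard against a BOM that leaked into the buffer, so it is never written twice.
    if text.startswith("\ufeff"):
        text = text[1:]
    body = text.encode("utf-8")
    return codecs.BOM_UTF8 + body if had_bom else body
```

With this shape, a file opened with a BOM round-trips byte for byte, and a file opened without one never gains one, which avoids spurious diffs in either direction.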
I have my .editorconfig to use utf-8-bom encoding, and when I save a file it deletes the BOM automatically, using Linux (Pop OS! 22.04) Kernel 6.12.10-76061203-generic
[*.cs]
charset = utf-8-bom
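As a rough illustration of what honouring that setting means on save, the sketch below maps the `charset` values defined by EditorConfig to Python codec names. The table and the function `encode_with_charset` are assumptions made for this example; they say nothing about how Zed actually resolves the setting.

```python
import codecs

# EditorConfig `charset` values mapped to Python codec names (illustrative table).
CHARSET_TO_CODEC = {
    "latin1": "latin-1",
    "utf-8": "utf-8",
    "utf-8-bom": "utf-8-sig",  # Python's "utf-8-sig" prepends EF BB BF when encoding
    "utf-16be": "utf-16-be",
    "utf-16le": "utf-16-le",
}


def encode_with_charset(text: str, charset: str) -> bytes:
    """Encode text according to an EditorConfig charset value."""
    return text.encode(CHARSET_TO_CODEC[charset.lower()])
```

For example, `encode_with_charset("class A {}", "utf-8-bom")` yields bytes that start with the BOM, while the same call with `"utf-8"` does not.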
Same issue on Ubuntu 25.04. Opening F# files previously edited in VS Code adds a leading Unicode character FEFF that creates super annoying git changes. It makes Zed impossible to use when other members of the team are using VS Code.
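To see which files in a checkout currently carry a BOM (and so which ones are at risk of noisy diffs when editors disagree), a small Python scan like the following may help. It is a sketch only; the function name `files_with_bom` and the suffix list are placeholders.

```python
import codecs
from pathlib import Path
from typing import Iterable, List


def files_with_bom(root: Path, suffixes: Iterable[str]) -> List[Path]:
    """Return files under root with one of the given suffixes that start with a UTF-8 BOM."""
    wanted = set(suffixes)
    found = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in wanted:
            # Only the first three bytes matter, so avoid reading whole files.
            with path.open("rb") as handle:
                if handle.read(3) == codecs.BOM_UTF8:
                    found.append(path)
    return found
```

Usage would look like `files_with_bom(Path("."), [".cs", ".fs"])`, run before and after saving from each editor to see whether the set of BOM-prefixed files changed.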
@snovak7 Thanks for sharing this .editorconfig workaround, it also works for F# files 🎉
How do I get this .editorconfig to work? Should I place it in the .zed dir? I'm trying to apply it to every file in the project (macOS 15.5).
In the project's root dir.
It's not working for some reason:
[*]
charset = utf-8
Does it require some tweaks in settings.json, like enabling .editorconfig or something?
You probably also want this at the top of the file:
root = true
As of today, it no longer deletes the BOM on save.
In 0.176.3 (with the csharp extension) the situation seems better - it doesn't throw off tokenization, so highlighting is normal and LSP features work fine. The only remaining quirk I notice is that the BOM is shown as a little half-width marker at the beginning of the file:
I see the same character, but it isn't deleted anymore
Same here, adding charset to the .editorconfig doesn't seem to work anymore.