
Opening a non-utf8 encoded text file fails without error.

Open imron opened this issue 3 years ago • 4 comments

Describe the bug

Opening a source file that contains text encoded in something other than utf-8 fails without an error.

To Reproduce

  1. Open the file sample.cpp.txt, a C++ source file (renamed to .txt so it could be attached to the GitHub issue) whose comments contain some non-UTF-8 encoded text. In this case the encoding is GB2312 (used for Chinese). Files like this are not uncommon in real life, at least for C++, which unlike Rust does not restrict source code encoding to UTF-8.
  2. Note that the contents of the file are not displayed, and no error is shown.

Expected behavior

The file should be loaded and displayed normally, with the Unicode replacement character shown for any bytes that are not valid UTF-8.

Alternatively, an error message could be shown saying that the file contains non-utf8 text and couldn't be opened.
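
The two options above can be sketched with Rust's standard library alone. This is only an illustration, not Zed's actual code: the function names `decode_lossy` and `decode_strict` are made up for this example, and the error wording is a placeholder.

```rust
use std::borrow::Cow;

/// Option 1 (hypothetical helper): decode as UTF-8 and substitute
/// U+FFFD (the Unicode replacement character) for every invalid sequence.
fn decode_lossy(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

/// Option 2 (hypothetical helper): refuse to decode, but return a message
/// that could be shown to the user instead of failing silently.
fn decode_strict(bytes: &[u8]) -> Result<&str, String> {
    std::str::from_utf8(bytes).map_err(|err| {
        format!(
            "file contains non-UTF-8 text (first invalid byte at offset {})",
            err.valid_up_to()
        )
    })
}
```

For a comment containing the GB2312 bytes `D6 D0 CE C4`, `decode_lossy` keeps the surrounding ASCII source intact and shows replacement characters in place of the Chinese text, while `decode_strict` returns an `Err` carrying the message.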

Environment:

  • Architecture: x86_64
  • macOS Version: ProductName: macOS ProductVersion: 12.3.1 BuildVersion: 21E258
  • Zed Version: Zed 0.39.0 – /Applications/Zed.app

imron avatar Jun 23 '22 00:06 imron

I am able to confirm this is still an issue in Zed 0.141.1.

If you try to open a file that is not UTF-8 by clicking on it in the project sidebar, you will get an error like this: [image]

But if you open a file directly via File->Open, or from the command line with zed filename-non-utf8.txt, nothing appears in the log and no error is displayed.

notpeter avatar Jun 22 '24 21:06 notpeter

Same behavior with Zed 0.149.5 on Mac.

midnightcodr avatar Aug 23 '24 20:08 midnightcodr

This should be fixed in the next version now that the following PR is merged:

  • https://github.com/zed-industries/zed/pull/15613

apricotbucket28 avatar Aug 23 '24 22:08 apricotbucket28

What vscode does for this is display a screen which allows you to try to open it as text, or use the hex editor.

The file is not displayed in the text editor because it is either binary or uses an unsupported text encoding. (and an Open Anyway button)

If you force it to open as text, they do a lossy conversion and replace invalid bytes with .

In the future we could maybe do something similar.
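
A rough sketch of that kind of flow, using only the Rust standard library. It is not taken from VS Code or Zed: the `OpenAttempt` type and the `try_open` and `open_anyway` functions are hypothetical names for illustration.

```rust
/// Hypothetical result of a first attempt to open a buffer as text.
#[derive(Debug, PartialEq)]
enum OpenAttempt {
    /// The bytes were valid UTF-8 and can go straight into the editor.
    Text(String),
    /// The bytes were not valid UTF-8: show a notice with an
    /// "Open Anyway" choice instead of failing silently.
    NeedsConfirmation { invalid_at: usize },
}

/// Strict first pass: never alters the file's contents.
fn try_open(bytes: Vec<u8>) -> OpenAttempt {
    match String::from_utf8(bytes) {
        Ok(text) => OpenAttempt::Text(text),
        Err(err) => OpenAttempt::NeedsConfirmation {
            // Offset of the first byte that is not valid UTF-8.
            invalid_at: err.utf8_error().valid_up_to(),
        },
    }
}

/// Called only after the user picks "Open Anyway": lossy conversion,
/// with invalid bytes replaced by U+FFFD.
fn open_anyway(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Keeping the lossy step behind an explicit choice matters because saving a lossily decoded buffer would overwrite the original bytes with replacement characters.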

MolotovCherry avatar Aug 24 '24 13:08 MolotovCherry

> This should be fixed in the next version now that the following PR is merged:
>
> * [workspace: Improve error handling when dropping a file that cannot be opened into the workspace pane #15613](https://github.com/zed-industries/zed/pull/15613)

It appears the mentioned fix doesn't really address the "stream did not contain valid UTF-8. Please try again." error that OP reported. The fix is more about making the error display consistent with clicking on a file containing non-UTF-8 characters when a file is dropped into the edit window.

Even if the error is thrown consistently regardless of how the file is opened, it would be a dealbreaker if I want to switch from editors such as vscode to zed to edit files containing invalid UTF-8 characters.
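
For reference, the "stream did not contain valid UTF-8" part of that message matches what Rust's standard library reports when `std::fs::read_to_string` hits invalid UTF-8 (an `io::Error` of kind `InvalidData`). Whether Zed's loader goes through that exact call is an assumption here, not something checked against its source. Under that assumption, a fallback could look roughly like this, with `read_text_lossy` being a made-up name:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical loader: read a file as text, and fall back to a lossy
/// decode when the contents are not valid UTF-8.
/// The returned bool is true when replacement characters were inserted.
fn read_text_lossy(path: &Path) -> io::Result<(String, bool)> {
    match fs::read_to_string(path) {
        Ok(text) => Ok((text, false)),
        // Invalid UTF-8 surfaces as ErrorKind::InvalidData.
        Err(err) if err.kind() == io::ErrorKind::InvalidData => {
            let bytes = fs::read(path)?;
            Ok((String::from_utf8_lossy(&bytes).into_owned(), true))
        }
        // Anything else (missing file, permissions, ...) is passed through.
        Err(err) => Err(err),
    }
}
```

The flag lets the caller warn that the buffer no longer matches the bytes on disk, for example by opening it read-only.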

midnightcodr avatar Aug 27 '24 16:08 midnightcodr

This issue is explicitly about the lack of an error message on open which has been fixed and so I'm closing it.

I've opened a unified issue for this feature here:

  • https://github.com/zed-industries/zed/issues/16965

Please upvote 👍 to help prioritize that issue and click subscribe if you would like to receive notifications on its progress. Thanks all!

notpeter avatar Aug 27 '24 18:08 notpeter