file-system-access
Handling of non-Unicode handle names
FileSystemHandle.name (https://wicg.github.io/native-file-system/#dom-filesystemhandle-name) is a USVString and can only represent a sequence of Unicode scalar values. But file systems don't respect those rules:
- On Linux, file names can usually be any sequence of bytes
- On Windows, it looks like NTFS uses a sequence of 16-bit code units, which can include unpaired surrogates
- On macOS, Unicode Normalization Form D is somewhat famously applied. Not sure if this makes non-Unicode names impossible, but it probably does.
If the UA encounters a name that can't be round-tripped as a USVString, what should it do? If the invalid parts are replaced with U+FFFD, then presumably the name can't be used with directory.removeEntry(name) and similar APIs?
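To make the round-tripping concern concrete, here is a rough sketch of the U+FFFD replacement applied to a string of UTF-16 code units. The helper name toScalarValues and the two sample file names are made up for illustration; this is not what any UA or the spec actually does, only a demonstration that the replacement is lossy.

```typescript
// Hypothetical helper: replaces every unpaired surrogate with U+FFFD,
// leaving well-formed surrogate pairs and all other code units intact.
function toScalarValues(input: string): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;
    if (isHigh && i + 1 < input.length) {
      const next = input.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Well-formed pair: keep both code units and skip the low half.
        out += input.charAt(i) + input.charAt(i + 1);
        i++;
        continue;
      }
    }
    out += isHigh || isLow ? "\ufffd" : input.charAt(i);
  }
  return out;
}

// Two distinct made-up names of the kind a 16-bit-code-unit file system
// could hold, each containing a different unpaired surrogate.
const nameA = "report\ud800.txt";
const nameB = "report\udfff.txt";

// Both names map to the same replaced string, so the original entry
// can no longer be identified from the replaced name alone.
const collide = toScalarValues(nameA) === toScalarValues(nameB);

// The replaced name differs from the on-disk name, so it does not round-trip.
const roundTrips = toScalarValues(nameA) === nameA;
```

If the UA exposed the replaced string as the handle name, a later directory.removeEntry(name) call would have no way to tell which of the two entries was meant.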
This is mentioned non-normatively in domintros of getFile() and getDirectory(), which say "can fail because [...] the name uses characters that aren’t supported in file names on the underlying file system".
That needs to be turned into normative language, but it's a different question from reading files. It would probably be very frustrating for a user to be able to see a file on disk but not be able to select it, or, perhaps worse, to be able to open it but not save it.