Veracrypt GUI on Linux sets wrong encoding to be used for filenames

Open mpeter50 opened this issue 9 months ago • 4 comments

Expected behavior

I have files and directories on an NTFS-based VeraCrypt container whose names contain Hungarian characters: é, ü, á, í, ő, ... I expect these names to be displayed accurately, as they are when I mount an NTFS filesystem directly.

Observed behavior

Hungarian accented characters are garbled. Some programs show a question mark in their place, and ls prints 'Eg'$'\351''szs'$'\351''g'$'\374''gy' instead of "Egészségügy".
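
The escapes in the ls output are octal byte values: \351 is 0xE9 and \374 is 0xFC, the ISO 8859-1 (Latin-1) codes for é and ü. The following is a minimal Python sketch that uses only the name quoted above. It suggests the name is being delivered as Latin-1 bytes, which are not valid UTF-8, and that a letter such as ő cannot be represented in Latin-1 at all.

```python
# Bytes that ls printed for the directory name; the octal escapes
# \351 and \374 are written as Python byte escapes.
raw_name = b"Eg\351szs\351g\374gy"

# Interpreted as ISO 8859-1 (Latin-1), the bytes give the expected name.
as_latin1 = raw_name.decode("iso8859-1")
print(as_latin1)

# Interpreted as UTF-8, decoding fails, which is why tools running in a
# UTF-8 locale fall back to escapes or question marks.
try:
    raw_name.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8:", valid_utf8)

# Latin-1 has no code for the Hungarian letter ő, so a Latin-1 view of
# the filesystem cannot show such names correctly.
try:
    "ő".encode("iso8859-1")
    o_double_acute_fits = True
except UnicodeEncodeError:
    o_double_acute_fits = False
print("ő fits in Latin-1:", o_double_acute_fits)
```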

Steps to reproduce

  1. Open Veracrypt GUI
  2. Pick a slot
  3. Select a VC container file
  4. Click mount
  5. Type password and accept (so no custom mount options are set)
  6. Observe path names in mounted directory

Your Environment

VeraCrypt version: 1.26.20

Operating system and version: openSUSE Leap 15.6, Linux kernel 6.4.0-150600.23.38-default (64-bit)

System type: 64 bit Linux

The VC container was created on a Windows 10 system.

Additional information

Mounting an entire NTFS-based VC partition has no such issues, and has the following mount options:

/dev/mapper/veracrypt1 on /mnt/vera/a type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)

Mounting the VC container results in these mount options:

/dev/mapper/veracrypt2 on /media/veracrypt2 type vfat (rw,relatime,uid=1000,gid=100,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)

There are many differences (why no nosuid, nodev?), most importantly there is a codepage and an iocharset option set.
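
To make the comparison less error-prone, the two lines can be diffed mechanically. This is a small Python sketch that parses the mount output exactly as pasted above. The helper names `fs_type` and `parse_options` are made up for this illustration.

```python
PARTITION_LINE = (
    "/dev/mapper/veracrypt1 on /mnt/vera/a type fuseblk "
    "(rw,nosuid,nodev,relatime,user_id=0,group_id=0,"
    "default_permissions,allow_other,blksize=4096)"
)
CONTAINER_LINE = (
    "/dev/mapper/veracrypt2 on /media/veracrypt2 type vfat "
    "(rw,relatime,uid=1000,gid=100,fmask=0077,dmask=0077,codepage=437,"
    "iocharset=iso8859-1,shortname=mixed,errors=remount-ro)"
)


def fs_type(mount_line):
    """Return the word that follows ' type ' in a line printed by mount."""
    return mount_line.split(" type ")[1].split()[0]


def parse_options(mount_line):
    """Return the options inside the trailing parentheses as a dict."""
    start = mount_line.rindex("(") + 1
    end = mount_line.rindex(")")
    options = {}
    for item in mount_line[start:end].split(","):
        key, _, value = item.partition("=")
        options[key] = value if value else None  # flags such as rw have no value
    return options


partition_mount = parse_options(PARTITION_LINE)
container_mount = parse_options(CONTAINER_LINE)

only_partition = sorted(set(partition_mount) - set(container_mount))
only_container = sorted(set(container_mount) - set(partition_mount))

print("types:", fs_type(PARTITION_LINE), "vs", fs_type(CONTAINER_LINE))
print("only on the partition mount:", only_partition)
print("only on the container mount:", only_container)
```

Note that the filesystem type field differs as well (fuseblk versus vfat), not just the option list.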

mpeter50 avatar Apr 05 '25 21:04 mpeter50

I just realized that this VC container might hold a FAT-something filesystem.

In any case, the point of this issue is to make VeraCrypt detect the correct mount options automatically (or ask the user a few simple questions), as happens on Windows.

I fear that this only works out of the box on Windows because the system holds that information in some form there.

mpeter50 avatar Apr 05 '25 21:04 mpeter50

Turns out the correct mount option in my case is iocharset=utf8.

I understand if we cannot change the default, but I think it would be worth considering:

  • adding a warning when the filesystem is a kind of FAT
  • adding some kind of setting for the iocharset to be used

I think this matters especially because the fix is not easy to find on your own. I tried codepage 852 and 1250 (even though that option is for short names only) and iocharset iso8859-2 (Latin-2, the legacy Central European character set that covers Hungarian) before I found out that this parameter can simply be set to utf8.
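
For anyone mounting from a script, here is a hedged sketch of how the option could be passed on the command line. It assumes the Linux build accepts `--text` and `--fs-options=` as described in `veracrypt --help`, so please verify against your version. The container path is a placeholder, and the command is only built and printed, not executed.

```python
import shlex


def build_mount_command(container, mount_point, iocharset="utf8"):
    """Build (but do not run) a veracrypt command line.

    Assumption: --fs-options hands its value to mount -o, as the
    VeraCrypt command-line help describes.
    """
    return [
        "veracrypt",
        "--text",
        f"--fs-options=iocharset={iocharset}",
        container,
        mount_point,
    ]


# Hypothetical paths, for illustration only.
command = build_mount_command("/path/to/container.hc", "/media/veracrypt2")
print(shlex.join(command))
```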

mpeter50 avatar Apr 05 '25 22:04 mpeter50

I believe the default options for mount iocharset are compiled with the kernel, so this probably works for most people out of the box (at least on both my Arch Linux and Ubuntu VMs it is set to iocharset=utf8 by default). So I think this would only be a problem for a user who creates the files on one system and then moves them to another with a different default mount encoding. Forcing utf8 for everyone risks breaking behaviour for existing users whose systems mount with a different encoding by default and who have no problems with how it currently works.
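
One way to check what a given kernel was built with is to look for the `CONFIG_FAT_DEFAULT_*` entries in its build configuration, which many distributions ship as a text file under /boot. Below is a minimal Python sketch. The sample text is made up for illustration and is not taken from any of the systems mentioned here.

```python
import re


def fat_defaults(kernel_config_text):
    """Collect the CONFIG_FAT_DEFAULT_* values that are set in a kernel config."""
    pattern = re.compile(r"^(CONFIG_FAT_DEFAULT_\w+)=(.*)$", re.MULTILINE)
    return {
        name: value.strip('"')
        for name, value in pattern.findall(kernel_config_text)
    }


# Hypothetical excerpt of a kernel config file (illustration only).
sample_config = """
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_FAT_DEFAULT_UTF8 is not set
"""

defaults = fat_defaults(sample_config)
print(defaults)
```

Lines of the form `# ... is not set` are skipped on purpose, so an unset option simply does not appear in the result.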

There are some ways to detect the encoding but relying on certain programs existing on the system is not a great solution, so most sensible path from my view would be to add instructions to the troubleshooting part of the documentation on what to do if the filenames look scrambled up.
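
As a possible building block for such a troubleshooting note, this is a sketch of a check that needs nothing beyond a Python interpreter: list the mounted directory as raw bytes and report names that are not valid UTF-8. The function names are made up for this example, and the check only flags the symptom. It does not identify the right iocharset.

```python
import os


def is_valid_utf8(raw_name):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def names_not_utf8(directory):
    """Return the entry names (as bytes) in directory that are not valid UTF-8."""
    # Passing a bytes path makes os.listdir return undecoded bytes names.
    return [
        name
        for name in os.listdir(os.fsencode(directory))
        if not is_valid_utf8(name)
    ]


# The garbled name from the report fails the check, while the UTF-8
# encoding of the intended name passes it.
print(is_valid_utf8(b"Eg\xe9szs\xe9g\xfcgy"))
print(is_valid_utf8("Egészségügy".encode("utf-8")))
```

A non-empty result from `names_not_utf8("/media/veracrypt2")` on a system with a UTF-8 locale would be a hint to remount with a different iocharset.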

Jertzukka avatar Apr 08 '25 17:04 Jertzukka

> I believe the default options for mount iocharset are compiled with the kernel, so this probably works for most people out of the box (at least on both my Arch Linux and Ubuntu VMs it is set to iocharset=utf8 by default).

man mount says here that the default is iso8859-1

> Mount options for fat [...]
> iocharset=value
>     Character set to use for converting between 8 bit characters and 16 bit Unicode characters. The default is iso8859-1. Long filenames are stored on disk in Unicode format.

But it's good to be aware that this varies between systems.

> so most sensible path from my view would be to add instructions to the troubleshooting part of the documentation on what to do if the filenames look scrambled up.

That sounds good to me. And at least now there is also an issue documenting this, in case someone else runs into it too :)

mpeter50 avatar Apr 09 '25 01:04 mpeter50