csi-driver-smb

Unreadable Greek filenames

Open J-ohn opened this issue 1 year ago • 4 comments

What happened:

  1. Mounted a share containing files whose names include Greek characters (tried with both a VM running Windows as the SMB server and an Azure Files storage as the SMB server).
  2. Attached to the pod and ran ls on the share.
  3. The file names had question marks in place of the Greek characters, e.g. ????? ??.JPG

What you expected to happen:

With the azureFile CSI driver, running the same test and listing the files with ls on Linux gives file names like this: ''$'\317\203\317\207\316\265\316\264\316\271\316\277'''$'\317\207\317\211\317\201\316\271\317\202'''$'\317\204\316\271\317\204\316\273\316\277''_-_2022-03-12t100557_115.png' which are in turn interpreted correctly by the application. If you create a file with Greek characters from the application under the mounted folder, ls shows the file correctly (with escaped characters like the above) and the application lists it, but if you look at the file name on either the Windows server or Azure storage you get something like this: σχεδιοχωρις_τιτλο-_2022-03-12t100557_115.png

How to reproduce it:

  1. Attach to a running pod that has the share mounted.
  2. cd to the mounted directory.
  3. touch ''$'\317\203\317\207\316\265\316\264\316\271\316\277'''$'\317\207\317\211\317\201\316\271\317\202'''$'\317\204\316\271\317\204\316\273\316\277''.png'
  4. View the mount from the storage side (Azure Storage Explorer, for example). Instead of a file name like "σχεδιο_χωρις_τιτλο.png" you will see an illegible file name.

Anything else we need to know?:

Environment:

  • CSI Driver version: 3.11
  • Kubernetes version (use kubectl version): 1.27.8
  • OS (e.g. from /etc/os-release): (pod os), Debian GNU/Linux 11 (bullseye)
  • Kernel (e.g. uname -a): Linux web-1 5.15.0-89-generic #99-Ubuntu SMP Mon Oct 30 20:42:41 UTC 2023 x86_64 GNU/Linux
  • Install tools: dotnet v7
  • Others: n/a

J-ohn, Jan 16 '24 10:01

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot, Apr 15 '24 10:04

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot, May 15 '24 11:05

Hi, I have a similar issue with German umlauts (üäöß). I'm using the current 1.14.0 SMB CSI driver in the latest k3s cluster.

Any ideas how to fix this?

Regards, André

Andre15711, May 16 '24 09:05

In bash:

$ echo ''$'\317\203\317\207\316\265\316\264\316\271\316\277'''$'\317\207\317\211\317\201\316\271\317\202'''$'\317\204\316\271\317\204\316\273\316\277''_-_2022-03-12t100557_115.png'
σχεδιοχωριςτιτλο_-_2022-03-12t100557_115.png

So the filenames exposed by cifs are valid UTF-8 strings. The problem is probably with your application which is expecting a different encoding for filenames. It's not a problem with csi-driver-smb.
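
To double-check that claim without relying on bash quoting, here is a minimal sketch that feeds the first escaped segment from the report to printf and dumps the resulting bytes. It assumes a POSIX shell with od and tr available; the variable names are illustrative only.

```shell
# First escaped segment of the filename from the report, as octal escapes.
segment='\317\203\317\207\316\265\316\264\316\271\316\277'

# printf expands \ddd octal escapes in its format string into raw bytes.
decoded=$(printf "$segment")
echo "$decoded"

# Hex view of the same bytes: six two-byte sequences with lead bytes cf or ce.
hex=$(printf "$segment" | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

Each byte pair is the UTF-8 encoding of one Greek letter (cf 83 is σ, U+03C3), which is consistent with the names being valid UTF-8.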

Try setting the environment variable LC_CTYPE=C.UTF-8 and see if it helps. If so then you could bake that environment variable into your container image to make it the default.
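
A quick way to see what the locale changes is sketched below. It assumes GNU ls and that the C.UTF-8 locale exists in the image. The scratch directory and file are made up for the demonstration and have nothing to do with the SMB mount itself.

```shell
# Scratch directory with one Greek-named file (UTF-8 bytes produced by printf).
dir=$(mktemp -d)
name=$(printf '\317\203\317\207\316\265\316\264\316\271\316\277.png')
touch "$dir/$name"

# -q makes ls print '?' for characters the active locale treats as non-printable.
# In the plain C locale the non-ASCII bytes are expected to fall into that group.
echo "C locale: $(LC_ALL=C ls -q "$dir")"

# Same listing with the suggested setting; LC_ALL is cleared so it cannot override LC_CTYPE.
echo "C.UTF-8:  $(LC_ALL= LC_CTYPE=C.UTF-8 ls -q "$dir")"

# Without -q, and with output captured instead of sent to a terminal,
# GNU ls passes the filename bytes through unchanged.
raw=$(ls "$dir")
rm -rf "$dir"
```

If the second listing shows the Greek name while the first shows question marks, the locale is what decides how the bytes are displayed. In a Dockerfile the variable would typically be set with an ENV LC_CTYPE=C.UTF-8 line.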

yrro, Jun 28 '24 09:06