
stat: mount point output should preserve non-UTF8 bytes

Open dekuu5 opened this issue 2 months ago • 0 comments

Problem

The Rust implementation of stat currently converts mount points to a String using .to_string_lossy().
As a result, any non-UTF-8 bytes in a mount point are replaced with the Unicode replacement character (U+FFFD) in the output.

GNU stat preserves the original raw bytes, so the two tools produce different output for mount points whose names contain invalid UTF-8.
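
To illustrate the difference, here is a minimal standalone sketch (Unix only). It is not the actual uutils code: the helper names lossy_bytes and raw_bytes are hypothetical and exist only to contrast the two conversions on the same input.

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Hypothetical helper: lossy conversion, where each invalid byte
/// sequence becomes U+FFFD (encoded as EF BF BD in UTF-8).
fn lossy_bytes(path: &OsStr) -> Vec<u8> {
    path.to_string_lossy().into_owned().into_bytes()
}

/// Hypothetical helper: raw conversion, where bytes pass through unchanged.
fn raw_bytes(path: &OsStr) -> Vec<u8> {
    path.as_bytes().to_vec()
}

fn main() {
    // Same name as in the reproduction steps: "mnt_" followed by byte 0x80.
    let mount_point = OsString::from_vec(b"mnt_\x80".to_vec());

    // prints [6d, 6e, 74, 5f, ef, bf, bd]
    println!("{:02x?}", lossy_bytes(&mount_point));
    // prints [6d, 6e, 74, 5f, 80]
    println!("{:02x?}", raw_bytes(&mount_point));
}
```

The single 0x80 byte grows into the three-byte sequence EF BF BD under the lossy conversion, which is the kind of divergence the xxd comparison in the steps to reproduce is meant to expose.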


Steps to Reproduce

Setup

# Create a mount point with an invalid UTF-8 byte in the name (0x80)
mkdir $'mnt_\x80'

# Mount a tmpfs there
sudo mount -t tmpfs tmpfs $'mnt_\x80'

# Create a file inside it
touch $'mnt_\x80/file.txt'

Proof of Concept

#!/bin/bash

# GNU stat
stat -c '%m' $'mnt_\x80/file.txt' | xxd

# coreutils stat
uu_stat -c '%m' $'mnt_\x80/file.txt' | xxd

Output

[Screenshot of the xxd output from the two commands above; image not preserved]

Proposed Solution

  1. Update find_mount_point to return an OsString instead of String.
  2. Extend OutputType with a new variant, e.g. OutputType::OsStr.
  3. Add a new helper function print_osstr in the printing layer that writes raw bytes instead of assuming UTF-8.
  4. Update stat to use OutputType::OsStr for mount points.
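
The steps above could be sketched roughly as follows (Unix only). This is an illustration under stated assumptions, not the real uutils code: the OutputType enum here is a simplified stand-in with only two variants, print_it is a hypothetical dispatcher, and width or padding handling is left out entirely.

```rust
use std::ffi::{OsStr, OsString};
use std::io::{self, Write};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Simplified stand-in for the real enum (assumption: only two variants shown).
#[allow(dead_code)]
enum OutputType {
    Str(String),
    /// Proposed new variant that carries raw, possibly non-UTF-8 data.
    OsStr(OsString),
}

/// Proposed helper: write the raw bytes without assuming UTF-8.
fn print_osstr<W: Write>(out: &mut W, s: &OsStr) -> io::Result<()> {
    out.write_all(s.as_bytes())
}

/// Hypothetical dispatcher standing in for the printing layer.
fn print_it<W: Write>(out: &mut W, output: &OutputType) -> io::Result<()> {
    match output {
        OutputType::Str(s) => out.write_all(s.as_bytes()),
        OutputType::OsStr(s) => print_osstr(out, s),
    }
}

fn main() -> io::Result<()> {
    // Pretend find_mount_point returned this OsString.
    let mount_point = OutputType::OsStr(OsString::from_vec(b"mnt_\x80".to_vec()));

    let mut stdout = io::stdout().lock();
    print_it(&mut stdout, &mount_point)?;
    stdout.write_all(b"\n")
}
```

Writing through a generic Write parameter rather than println! avoids going through a str at all, and also makes the helper easy to test against an in-memory buffer.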

dekuu5, Aug 31 '25 12:08