Fun with Dir.glob and filesystems

Open djberg96 opened this issue 2 years ago • 4 comments

The following behavior is probably an artifact of the underlying implementation rather than a deliberate design choice, but I thought I would mention it:

Given the following layout:

foo/a.h
foo/B.h
foo/c.h

And the following snippet:

Dir.glob('foo/[A-Z]*'){ |f| p f }

Ruby 3.0.2 output on a case-insensitive filesystem (e.g. OSX/HFS, Windows/NTFS):

"foo/B.h"
"foo/a.h"
"foo/c.h"

Ruby 3.0.2 output on a case-sensitive filesystem (e.g. Linux/ext4):

"foo/B.h"

TruffleRuby 22.0.0.2, on the other hand, appears to match case-sensitively on any filesystem, i.e. it returns only "foo/B.h" even on OSX.

Compatibility issue? Or user beware?

(Side note: I've not yet checked what happens with this code example if you create a file on Windows with the FILE_FLAG_POSIX_SEMANTICS flag.)

djberg96 avatar Mar 15 '22 16:03 djberg96
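
For anyone wanting to compare implementations, the layout and snippet from the report above can be reproduced in a scratch directory. This is only a sketch: the helper name glob_uppercase_demo is made up for illustration, and the result depends on both the Ruby implementation and the filesystem that holds the temporary directory.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: builds foo/{a.h,B.h,c.h} in a temporary directory
# and returns whatever Dir.glob('foo/[A-Z]*') yields there, sorted.
def glob_uppercase_demo
  Dir.mktmpdir do |root|
    dir = File.join(root, 'foo')
    FileUtils.mkdir_p(dir)
    %w[a.h B.h c.h].each { |name| FileUtils.touch(File.join(dir, name)) }

    # Glob relative to the scratch root so the pattern matches the report.
    Dir.chdir(root) { Dir.glob('foo/[A-Z]*').sort }
  end
end

puts "#{RUBY_ENGINE} #{RUBY_VERSION}"
p glob_uppercase_demo
```

Running the same file under CRuby and TruffleRuby on the same machine should show whether the two still disagree.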

File.fnmatch? docs have a description of something probably related:

  File.fnmatch('cat', 'CAT')                     #=> false # case sensitive
  File.fnmatch('cat', 'CAT', File::FNM_CASEFOLD) #=> true  # case insensitive
  File.fnmatch('cat', 'CAT', File::FNM_SYSCASE)  #=> true or false # depends on the system default

So probably CRuby is using the FNM_SYSCASE semantics for Dir.glob.

eregon avatar Mar 15 '22 16:03 eregon
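
To see how those flags interact with the bracket pattern from the report, the same three file names can be run through File.fnmatch directly. A minimal sketch, assuming nothing beyond the flags quoted from the docs; it does not show what Dir.glob does internally, only what each flag selects on the current platform.

```ruby
# File names from the report, matched against the report's bracket pattern.
names = %w[a.h B.h c.h]

# No flags: the range [A-Z] is compared case-sensitively.
strict = names.select { |n| File.fnmatch('[A-Z]*', n) }

# FNM_CASEFOLD: matching ignores case.
folded = names.select { |n| File.fnmatch('[A-Z]*', n, File::FNM_CASEFOLD) }

# FNM_SYSCASE: whatever the platform default is (the constant may be 0).
syscase = names.select { |n| File.fnmatch('[A-Z]*', n, File::FNM_SYSCASE) }

puts "FNM_SYSCASE = #{File::FNM_SYSCASE}"
p strict: strict, folded: folded, syscase: syscase
```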

@eregon On macOS I see:

% ruby -e 'p File::FNM_SYSCASE'
0

I found an old issue describing this: https://bugs.ruby-lang.org/issues/4255

This might be the glob implementation: https://github.com/ruby/ruby/blob/0db68f023372b634603c74fca94588b457be084c/dir.c#L2404-L2405 https://github.com/ruby/ruby/blob/0db68f023372b634603c74fca94588b457be084c/dir.c#L1754-L1782

bjfish avatar Mar 15 '22 16:03 bjfish

This seems really complicated and would need lots of low-level hacks, so I think we should wait until there is an actual need before trying to implement this.

eregon avatar Mar 15 '22 21:03 eregon

Regarding the old ruby-lang issue, you can test the filesystem using something like File.identical?(Dir.pwd, Dir.pwd.swapcase), which I don't think existed at that time. This is what I do for the sys-filesystem gem. Technically, I think it only tests the file's partition, but in practice I think mixed filesystems are super rare.

djberg96 avatar Mar 15 '22 21:03 djberg96
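
The File.identical? check described above could be wrapped roughly like this. It is a sketch, not the sys-filesystem gem's actual code: the method name case_insensitive_fs? is hypothetical, and the nil return for paths without letters is an added assumption, since swapcase would leave such a path unchanged and the comparison would be trivially true.

```ruby
# Hypothetical helper: returns true if `dir` also resolves under its
# case-swapped spelling, false if it does not, and nil when the path has
# no letters to swap (so the check cannot tell anything).
def case_insensitive_fs?(dir = Dir.pwd)
  swapped = dir.swapcase
  return nil if swapped == dir

  # File.identical? is true only when both paths name the same file.
  File.identical?(dir, swapped)
end

p case_insensitive_fs?
```

Because swapcase changes every component of the path, the answer reflects the whole path down to that directory, not a single partition in isolation.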