truffleruby
Fun with Dir.glob and filesystems
The following behavior is probably an artifact of the underlying implementation rather than a deliberate design choice, but I thought I would mention it:
Given the following layout:
foo/a.h
foo/B.h
foo/c.h
And the following snippet:
Dir.glob('foo/[A-Z]*'){ |f| p f }
Ruby 3.0.2 output on a case-insensitive filesystem (e.g. OSX/HFS, Windows/NTFS):
"foo/B.h"
"foo/a.h"
"foo/c.h"
Ruby 3.0.2 output on a case-sensitive filesystem (e.g. Linux/ext4):
"foo/B.h"
TruffleRuby 22.0.0.2, on the other hand, appears to honor case sensitivity on any filesystem, i.e. it returns only "foo/B.h" on OSX.
Compatibility issue? Or user beware?
(Side note: I've not yet checked what happens with this code example if you create a file on Windows with the FILE_FLAG_POSIX_SEMANTICS flag)
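For anyone who wants to try this on their own machine, here is a minimal sketch that builds the layout from the report in a temporary directory and runs the same glob. The method name glob_uppercase_demo is made up for illustration. The result is expected to differ by Ruby implementation and filesystem, as described above, so no expected output is given.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: recreate foo/a.h, foo/B.h, foo/c.h in a scratch
# directory and return whatever Dir.glob('foo/[A-Z]*') yields there.
def glob_uppercase_demo
  Dir.mktmpdir do |root|
    foo = File.join(root, 'foo')
    FileUtils.mkdir_p(foo)
    %w[a.h B.h c.h].each { |name| FileUtils.touch(File.join(foo, name)) }

    # Use a relative pattern, exactly as in the original snippet.
    Dir.chdir(root) { Dir.glob('foo/[A-Z]*').sort }
  end
end

p glob_uppercase_demo
```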
The File.fnmatch? docs have a description of something probably related:
File.fnmatch('cat', 'CAT') #=> false # case sensitive
File.fnmatch('cat', 'CAT', File::FNM_CASEFOLD) #=> true # case insensitive
File.fnmatch('cat', 'CAT', File::FNM_SYSCASE) #=> true or false # depends on the system default
So probably CRuby is using the FNM_SYSCASE semantics for Dir.glob.
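If you need the same result everywhere today, one possible workaround (a sketch, not something either implementation does internally) is to re-check each glob hit with File.fnmatch, which is case sensitive unless FNM_CASEFOLD is passed. The helper name strict_glob is hypothetical. It assumes a simple pattern without braces or `**`, since File.fnmatch handles those differently from Dir.glob.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: glob relative to `base`, then keep only the names
# that also pass a case-sensitive File.fnmatch check.
def strict_glob(pattern, base:)
  Dir.glob(pattern, base: base).select do |name|
    File.fnmatch(pattern, name, File::FNM_PATHNAME)
  end
end

Dir.mktmpdir do |dir|
  %w[a.h B.h c.h].each { |name| FileUtils.touch(File.join(dir, name)) }
  p strict_glob('[A-Z]*', base: dir)
end
```

The extra fnmatch pass costs a little time per entry, but it removes the dependency on how the underlying glob treats the filesystem.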
@eregon On macOS I see:
% ruby -e 'p File::FNM_SYSCASE'
0
I found an old issue describing this: https://bugs.ruby-lang.org/issues/4255
This might be the glob implementation: https://github.com/ruby/ruby/blob/0db68f023372b634603c74fca94588b457be084c/dir.c#L2404-L2405 https://github.com/ruby/ruby/blob/0db68f023372b634603c74fca94588b457be084c/dir.c#L1754-L1782
This seems really complicated and looks like it needs lots of low-level hacks, so I think we should wait until there is an actual need before trying to implement this.
Regarding the old ruby-lang issue, you can test the filesystem using something like File.identical?(Dir.pwd, Dir.pwd.swapcase), which I don't think existed at that time. This is what I do for the sys-filesystem gem. Technically, I think it only tests the file's partition, but in practice I think mixed filesystems are super rare.
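Wrapped up as a small method, the check above might look like the following sketch. The name case_insensitive_fs? is made up for illustration and is not the sys-filesystem API. It returns nil when the path has no letters to swap, because comparing a path with itself would say nothing about the filesystem.

```ruby
# Hypothetical helper: guess whether the filesystem holding `dir` ignores
# case, by asking whether the case-swapped path names the same entry.
def case_insensitive_fs?(dir = Dir.pwd)
  swapped = dir.swapcase
  return nil if swapped == dir # no letters in the path, so the test is inconclusive

  File.identical?(dir, swapped)
end

p case_insensitive_fs?
```

As noted above, this only says something about the partition that holds the given path.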