zig
zig copied to clipboard
`std.fs.IterableDir.Walker` doesn't ignore macOS "firmlinks" as it does symlinks, resulting in iterating over some dirs twice
Zig Version
0.10.0-dev.4324+c23b3e6fd
Steps to Reproduce and Observed Behavior
This is probably not a fully minimal example but I'm not fluent in Zig yet:
const std = @import("std");
const al = std.heap.page_allocator;
pub fn main() !void {
if (std.os.argv.len == 3) {
try search(
std.os.argv[1][0..std.mem.len(std.os.argv[1])],
std.os.argv[2][0..std.mem.len(std.os.argv[2])]);
std.debug.print("done\n", .{});
} else {
std.debug.print("** first arg is path to walk, second is substring to look for in directory names", .{});
}
}
// ?!T is easier to work with than !?T for some kinds of iterators
fn flipiter(comptime T: type, inp: anyerror!?T) ?anyerror!T {
return (try inp) orelse null;
}
fn search(path:[]const u8, string:[]const u8) !void {
const start_path = if (path[path.len - 1] == '/') path[0..path.len-1] else path[0..];
var idir = std.fs.cwd().openIterableDir(path, .{}) catch |err| {
std.debug.print("** err: {s} {any}\n", .{path, err});
};
defer idir.close();
var walker = try idir.walk(al);
defer walker.deinit();
while (flipiter(std.fs.IterableDir.Walker.WalkerEntry, walker.next())) |err_or_entry| {
if (err_or_entry) |entry| {
if (entry.kind == .Directory) {
if (std.mem.indexOf(u8, entry.basename, string)) |x| {
_ = x;
if (entry.dir.realpathAlloc(al, entry.basename)) |realpath| {
defer al.free(realpath);
std.debug.print("> {s}/{s}\n» {s}\n", .{start_path, entry.path, realpath});
} else |err| {
std.debug.print("** realpath err {?}\n", .{err});
}
}
}
} else |err| {
if (err != error.AccessDenied) std.debug.print("{!}\t{s}/{s}\n", .{err, start_path, walker.name_buffer.items});
}
}
}
Run it with a starting path and a substring to search for. It will walk the directory and print out all paths that contain the substring after a >
symbol and then its resolved canonical path after a »
symbol on the next line.
I can only see the wrong behaviour when running in the root directory /
as macOS has several directories from /System/Volumes/Data/x
linked to /x
using "firmlinks", which is a feature of the APFS filesystem going back to Sierra. Example firmlinked directories.
-
/Library
->/System/Volumes/Data/Library
-
/Users
->/System/Volumes/Data/Users
-
/Applications
->/System/Volumes/Data/Applications
I can't find Apple official documentation on firmlinks. Apparently they're supposed to be transparent and not affect most code. I also don't know how to get them to show up using ls
. The ls
commandline I know that will show the most extra info macOS-specific info is ls -alO@ /
but that doesn't seem to reveal them unless sunlnk
means this and only this:
drwxr-xr-x 20 root wheel sunlnk 640 24 Aug 16:59 .
drwxr-xr-x 20 root wheel sunlnk 640 24 Aug 16:59 ..
lrwxr-xr-x 1 root admin - 36 24 Aug 16:59 .VolumeIcon.icns -> System/Volumes/Data/.VolumeIcon.icns
---------- 1 root admin - 0 24 Aug 16:59 .file
drwxr-xr-x 2 root wheel hidden 64 24 Aug 16:59 .vol
drwxrwxr-x 44 root admin sunlnk 1408 28 Oct 09:31 Applications
drwxr-xr-x 67 root wheel sunlnk 2144 14 Sep 10:42 Library
drwxr-xr-x@ 9 root wheel restricted 288 24 Aug 16:59 System
com.apple.rootless 0
drwxr-xr-x 5 root admin sunlnk 160 24 Aug 16:59 Users
drwxr-xr-x 3 root wheel hidden 96 25 Oct 15:39 Volumes
drwxr-xr-x@ 38 root wheel restricted,hidden 1216 24 Aug 16:59 bin
com.apple.rootless 0
drwxrwxr-t 2 root admin hidden 64 8 Dec 2020 cores
dr-xr-xr-x 4 root wheel hidden 4888 25 Oct 15:39 dev
lrwxr-xr-x@ 1 root wheel restricted,hidden 11 24 Aug 16:59 etc -> private/etc
com.apple.rootless 0
lrwxr-xr-x 1 root wheel hidden 25 25 Oct 15:39 home -> /System/Volumes/Data/home
drwxr-xr-x 2 root wheel hidden 64 8 Dec 2020 opt
drwxr-xr-x 6 root wheel sunlnk,hidden 192 25 Oct 15:39 private
drwxr-xr-x@ 65 root wheel restricted,hidden 2080 24 Aug 16:59 sbin
com.apple.rootless 0
lrwxr-xr-x@ 1 root wheel restricted,hidden 11 24 Aug 16:59 tmp -> private/tmp
com.apple.rootless 0
drwxr-xr-x@ 11 root wheel restricted,hidden 352 24 Aug 16:59 usr
com.apple.rootless 0
lrwxr-xr-x@ 1 root wheel restricted,hidden 11 24 Aug 16:59 var -> private/var
com.apple.rootless 0
Sample commandline:
zig run src/walk.zig 2>&1 -- / LLVM |less -S
Actual output:
> /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
» /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
> /System/Volumes/Data/Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
» /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
> /System/Volumes/Data/Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
» /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
> /System/Volumes/Data/Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
» /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
> /System/Volumes/Data/Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
» /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
> /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
» /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
> /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
» /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
> /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
» /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
done
Expected Behavior
Directories should only be listed at their true locations as is the case with symlinks, not also at the locations they're firmlinked to. I believe that would look like this:
> /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
» /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
> /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
» /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
> /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
» /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
> /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
» /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
done
A more-or-less equivalent Rust program shares the behaviour of Zig, but Swift does not. This makes me think the dir walker code in each language's libraries does in fact need to handle firmlinks specifically and that Swift must do do. This despite claims they should be transparent to existing code.
Here's my Swift code's output:
dirwalker / LLVM
******** / ********
Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
---- all done ----
And the Swift code itself. You can see there's no special code for firmlinks:
import Foundation
import AppKit
let fileManager = FileManager.default
let resKeys : [URLResourceKey] = [.isDirectoryKey, .fileSizeKey, .isSymbolicLinkKey]
let startURL: URL = URL(string: fileManager.currentDirectoryPath)!
guard CommandLine.arguments.count == 3 else {
print("** usage: dirwalker path string")
exit(1);
}
let pathArg = CommandLine.arguments[1]
let matchArg = CommandLine.arguments[2]
if let path = pathArg.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) {
if let url = URL(string: path) {
let en = fileManager.enumerator(at: url,
includingPropertiesForKeys: resKeys,
options: [.producesRelativePathURLs],
errorHandler: { (url, error) -> Bool in
return true }
)!
mainloop: for case let fileURL as URL in en {
do {
let rv = try fileURL.resourceValues(forKeys: Set(resKeys))
if let d = rv.isDirectory, d {
let filename: String = fileURL.lastPathComponent;
if filename.contains(matchArg) {
print(fileURL.relativePath)
}
}
} catch {
print("** error 2:", error)
}
}
}
}
print("done")
I did research this issue a while back, but here are my findings/notes in case it might help someone who wants to implement it or comment their opinion on it:
-
firmlinks can only apply on directories.
-
While there is an internal flag for identifying firmlinks, it is not set or used on APFS (see here)
-
What some projects implements like rofl0r/ncdu (with
--exclude-firmlinks
) is checking the reference attribute to see if it points to the same location (see here).- Another way of identifying firmlinks that I found is by using
GETPATH_NOFIRMLINK (102)
when usingfcntl
call to show the firmlink reference and compare it. This is only documented in XCode man pages. This would look similarfs.realpath
oros.realpath
darwin functions (getFdPath
). https://github.com/ziglang/zig/blob/a1aa55ebe52636f28335fd7ab30321fc52f48775/lib/std/os.zig#L5218-L5229
- Another way of identifying firmlinks that I found is by using
-
Synthetic firmlinks can be created since MacOS Catalina via
synthetic.conf
(Although i am not sure whether it actually creates a firmlink or an identifiable symlink with flag, needs further checking). -
For system firmlinks,
/usr/share/firmlinks
contains the list of firmlinks.
What's interesting is that MacOS's /bin/realpath
does not resolve firmlinks.
Here are way that I think can be used to solve this is:
-
when calling
nextDarwin
and finding a Directory, double check the reference/realpath and set entry kind depending on the result. https://github.com/ziglang/zig/blob/a1aa55ebe52636f28335fd7ab30321fc52f48775/lib/std/fs.zig#L391 -
Documenting that Walker does not recognize/classify firmlinks and let the user do this check manually and/or maybe add a Firmlink check function?
-
Check with
/usr/share/firmlinks
andsynthetic.conf
(this can apply for any of the above).
I am not sure what's the best approach here as there aren't many system firmlinks and it's uncommon to synthesize (if that's even possible).
Note: I could be missing some other info, but this is what I have found.