file-next
file-next copied to clipboard
Skip undef $file in _candidate_files
readdir $fh can return undef in certain circumstances, like reading directories on SMB mounts.
Thanks for this. What led you to discover this?
Hi Andy. I discovered this years ago, and have been using a patched 1.09 I think since. I forgot all about it until we did a review of patched perl libs recently, and I thought I'd see if it was maintained here on github.
The application is a music-file scanner that mostly scans CIFS/SMB mounts across the LAN. I discovered that in certain cases - and I wish my patch commit were less vague - readdir() will return undef. I don't recall it actually missing files, but instead returning an "extra" undef.
I'll see if I have some other notes on it somewhere. Thanks for accepting the PR, I look forward to upgrading to the new version!
Aha! I found the complete bug report for this problem, from 2014: "Samba doesn't like to see hi-bit Latin-1 characters (in the range \x80-\xFF) in file names when the share is mounted with iocharset=utf8 (as we do). The files are not only unreadable (they can't be opened), but because of a bug in File::Next, they can cause the indexer to crash.
I don't know that there is a way to get Samba to open or recognize these files. Patching File::Next should prevent them from crashing the scanner."
(Of course, the bug is arguably in readdir() or Samba, not File::Next.)
It's a little concerning that there's nothing in the docs that address "why it might return undef".
readdir DIRHANDLE
Returns the next directory entry for a directory opened by
"opendir". If used in list context, returns all the rest of the
entries in the directory. If there are no more entries, returns
the undefined value in scalar context and the empty list in list
context.
I'm not sure that silently ignoring an undef is the way to go. Thoughts?
Hey Andy.
There definitely is a directory entry for the file, but translations in the file system are what end up causing readdir() to return undef as a file name. So one might say it's not really a bug in readdir() either.
But according to the bug report I found, the result of not eliminating an undef in that loop was an endless loop somewhere in File::Next. I don't see it as obvious in _candidate_files(), but $fullpath will end up with the directory again, not a file name if $file is undef.
I agree we don't want to return undef
to the user, but do we do it silently, or do we give an error so the user knows something is up?
I'm not sure what the error would be other than 'undefined filename' - which doesn't give much of a context. I suppose it's something. It would still be pretty hard to know what file is causing it, but it's better than being stuck in a permanent loop.