file-next icon indicating copy to clipboard operation
file-next copied to clipboard

Skip undef $file in _candidate_files

Open creeble opened this issue 4 years ago • 7 comments

readdir $fh can return undef in certain circumstances, like reading directories on SMB mounts.

creeble avatar May 13 '20 20:05 creeble

Thanks for this. What led you to discover this?

petdance avatar May 14 '20 03:05 petdance

Hi Andy. I discovered this years ago, and have been using a patched 1.09 I think since. I forgot all about it until we did a review of patched perl libs recently, and I thought I'd see if it was maintained here on github.

The application is a music-file scanner that mostly scans CIFS/SMB mounts across the LAN. I discovered that in certain cases - and I wish my patch commit were less vague - readdir() will return undef. I don't recall it actually missing files, but instead returning an "extra" undef.

I'll see if I have some other notes on it somewhere. Thanks for accepting the PR, I look forward to upgrading to the new version!

creeble avatar May 14 '20 16:05 creeble

Aha! I found the complete bug report for this problem, from 2014: "Samba doesn't like to see hi-bit Latin-1 characters (in the range \x80-\xFF) in file names when the share is mounted with iocharset=utf8 (as we do). The files are not only unreadable (they can't be opened), but because of a bug in File::Next, they can cause the indexer to crash.

I don't know that there is a way to get Samba to open or recognize these files. Patching File::Next should prevent them from crashing the scanner."

(Of course, the bug is arguably in readdir() or Samba, not File::Next.)

creeble avatar May 14 '20 17:05 creeble

It's a little concerning that there's nothing in the docs that address "why it might return undef".

readdir DIRHANDLE
        Returns the next directory entry for a directory opened by
        "opendir". If used in list context, returns all the rest of the
        entries in the directory. If there are no more entries, returns
        the undefined value in scalar context and the empty list in list
        context.

I'm not sure that silently ignoring an undef is the way to go. Thoughts?

petdance avatar May 14 '20 19:05 petdance

Hey Andy.

There definitely is a directory entry for the file, but translations in the file system are what end up causing readdir() to return undef as a file name. So one might say it's not really a bug in readdir() either.

But according to the bug report I found, the result of not eliminating an undef in that loop was an endless loop somewhere in File::Next. I don't see it as obvious in _candidate_files(), but $fullpath will end up with the directory again, not a file name if $file is undef.

creeble avatar May 14 '20 19:05 creeble

I agree we don't want to return undef to the user, but do we do it silently, or do we give an error so the user knows something is up?

petdance avatar May 15 '20 02:05 petdance

I'm not sure what the error would be other than 'undefined filename' - which doesn't give much of a context. I suppose it's something. It would still be pretty hard to know what file is causing it, but it's better than being stuck in a permanent loop.

creeble avatar May 15 '20 06:05 creeble