param icon indicating copy to clipboard operation
param copied to clipboard

Globbing in FileSelector is not cross-platform

Open jlstevens opened this issue 7 years ago • 1 comments

In particular, the way the default path is validated may break on windows. The issue comes down to this bit of code:

    def update(self):
        self.objects = sorted(glob.glob(self.path))
        if self.default in self.objects:
            return
        self.default = self.objects[0] if self.objects else None

The issue is that is you specify a path like '../example/*/*.html using /, this slash is partially retained for the fixed part of the string (i.e '../example') but glob fills out the rest of the path using os.path.sep.

This appearance of os.path.sep then makes the code above OS dependent - the default can fail to register as being in self.objects (string comparison) even if the file does exist.

A temporary (ugly!) hack around this issue is to specify something like:

FileSelector(path=os.path.sep.join(['glob', 'path', '*.html']),
             default=os.path.sep.join(['default', 'path','example.html'])

jlstevens avatar Dec 13 '16 18:12 jlstevens

At least on Windows 10 and python 3.8, I have not been able to reproduce any problem where one style is "partially retained for the fixed part of the string (i.e '../example') but glob fills out the rest of the path". On the contrary, glob seems to convert the full path from POSIX to Windows just fine:

image

However, I think there may indeed still be an issue, just not to do with chimeric individual paths like that. Specifically, I think we may need to convert the initial default from POSIX (which is what we say defaults should be provided in) to the underlying OS format before comparing with the result of globbing. E.g.:

@@ -1822,8 +1822,11 @@ class FileSelector(ObjectSelector):
 
     def update(self):
         self.objects = sorted(glob.glob(self.path))
-        if self.default in self.objects:
-            return
+        import pathlib
+        if self.default is not None:
+            default = str(pathlib.PurePath(pathlib.PurePosixPath(self.default)))
+            if default in self.objects:
+                return
         self.default = self.objects[0] if self.objects else None
 
     def get_range(self):

Otherwise, I think we'll see the provided default silently being dropped and replaced with the first result of globbing (last line of the code above). I haven't tested the default computation above, and I think the right thing to do is to add a test case that provides an initial default value along with a path that has a glob, and then the test can make sure that the default is taken as specified, not simply as the first match on the glob (which I believe is what will happen now). It's a bit tricky to write that test in a cross-platform way, but I think we could do it using files in a temporary directory, and then we should be able to show that the test will fail on current master but hopefully pass with the change above made.

jbednar avatar Jun 28 '21 20:06 jbednar

I could reproduce the issue reported by Jean-Luc a long long time ago! In a./dummy folder create issue139.py containing the following and run it from there:

import param

class P(param.Parameterized):
    fs = param.FileSelector(path='../doc/*.md')

print(P.param.fs.objects)

On Macos the output is:

['../doc/about.md', '../doc/comparisons.md', '../doc/developer_guide.md', '../doc/getting_started.md', '../doc/index.md', '../doc/releases.md', '../doc/roadmap.md']

while on Windows it is

['../doc\\about.md', '../doc\\comparisons.md', '../doc\\developer_guide.md', '../doc\\getting_started.md', '../doc\\index.md', '../doc\\releases.md', '../doc\\roadmap.md']

So indeed the slash is partially retained!

maximlt avatar Jul 27 '23 06:07 maximlt