Use Path.resolve() consistently to prevent UNC/drive letter ambiguity
This fixes an issue I had on Windows where pathlib reported that one or more output paths in CNMF did not have a common root with the output dir. This is because the file paths were normalized with Path.resolve(), whereas the directory path was not. On Windows, resolve() converts any remote mount paths from drive letter to UNC format, and in order to compute the relative path, both paths must be in the same format.
This change calls resolve() on the directory path before computing the other paths relative to it. (Another solution could be to just create these relative paths directly and concatenate them to the dir path when saving data.) I also changed the paths.split function, which also calls relative_to, to always resolve both paths before trying to compute the relative path.
@kushalkolar Here's an example of the problem, using some of the existing code from cnmf.py (sorry for the delay). The resolve method converts drive paths to UNC paths if the drive is mapped to a network share on the current system. Then, if you call relative_to on such a path with a drive letter path as the argument, you get an error. Calling resolve on the common root path (output_dir) and defining all the other paths relative to this one should prevent this issue.
I'm not sure it can be tested automatically; would definitely require some changes to the Windows test runner to set up this situation.
In [1]: from pathlib import Path
In [2]: drive_path = Path('Z:\\foo\\bar')
In [3]: output_dir = drive_path.parent.joinpath('baz')
In [4]: output_dir
Out[4]: WindowsPath('Z:/foo/baz')
In [5]: output_path = output_dir.joinpath('baz.hdf5').resolve()
In [6]: output_path
Out[6]: WindowsPath('//proektdata/LabData/foo/baz/baz.hdf5')
In [7]: output_path.relative_to(output_dir.parent)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[7], line 1
----> 1 output_path.relative_to(output_dir.parent)
File ~\AppData\Local\miniforge3\envs\mesviz\lib\pathlib.py:818, in PurePath.relative_to(self, *other)
816 if (root or drv) if n == 0 else cf(abs_parts[:n]) != cf(to_abs_parts):
817 formatted = self._format_parsed_parts(to_drv, to_root, to_parts)
--> 818 raise ValueError("{!r} is not in the subpath of {!r}"
819 " OR one path is relative and the other is absolute."
820 .format(str(self), str(formatted)))
821 return self._from_parsed_parts('', root if n == 1 else '',
822 abs_parts[n:])
ValueError: '\\\\proektdata\\LabData\\foo\\baz\\baz.hdf5' is not in the subpath of 'Z:\\foo' OR one path is relative and the other is absolute.
I think let's wait until #300 so that CI works and then rebase to make sure this works on all platform before merging
300 is merged, did you mean a different one?
I ran into this same problem. If that would work, one can use a synced network directory to work directory which would ease syncronisation, backup and multisite access to the process. On windows, junctions could be used to get rid of spaces in the path. Any chance to have this completed?
Absolutely let me take a look later today! Fell through the cracks
Thank you!
@ethanbb I think you have maintainer permissions here, merge whenever you think it's ready
@kushalkolar OK, I'm working on getting at least one of the test runs to succeed. Also, it seems like I still need an approving review from a reviewer with write access.
I'm not reproducing this tolerance error on my Windows machine...
Max relative difference: 105.32658421 that seems really high, no? I see the rtol is already 100 though. @kushalkolar do you know about this?
Anyway, the tests are all passing on my local setup with this issue, and the failure in the Windows runner here isn't related to the issue, so I'll go ahead and merge.
I use windows and my data root is a junction to a folder having spaces in its name.
using proekt/rel-path-fix, f4e5b2f24bd77d90c3ade756bf7d6991ab7c7059 it works fine. using origin/master cf6b0168f53efe84649313fb43dc842c7d66bdb2 i get an error at df.caiman.add_item( algo='mcorr', input_movie_path=movie_path, ...
i get error: ... c:\Users\gerge\anaconda3\envs\caiman\lib\site-packages\mesmerize_core\caiman_extensions\common.py:252, in CaimanDataFrameExtensions.reload_from_disk(self) 237 def reload_from_disk(self) -> pd.DataFrame: 238 """ 239 Returns the DataFrame on disk. 240 ... 59 "hyphens ( - ), underscores ( _ ) or periods ( . )" 60 ) 61 return path
ValueError: Paths must only contain alphanumeric characters, hyphens ( - ), underscores ( _ ) or periods ( . )
my movie_path is: "C:\Users\gerge\Projects\ChemoGen\ChemoGen02\Region_1\PREPARED\IMAGE_DATA_UG.h5"
im not sure the underlying CaImAn repo version has anything to do with this.
@katonage ah ok, I did make some changes to call resolve earlier on inputted paths, thinking it would be better to use the canonical form wherever possible. I do see the potential for that to cause a problem though if a junction or other alias was used intentionally to bypass errors with unsupported types of paths in the called procedures.
@kushalkolar is there a good reason mesmerize-core validates to disallow paths with spaces? If not, the overly strict validation is the real bug. If so, I can see if it would work to keep aliases whenever possible and only resolve them when a relative path needs to be computed.
The reason is it creates a run file script which is what runs in the subprocess, and dealing with spaces there can be hard
On Sun, Mar 2, 2025, 10:22 AM Ethan Blackwood @.***> wrote:
@katonage https://github.com/katonage ah ok, I did make some changes to call resolve earlier on inputted paths, thinking it would be better to use the canonical form wherever possible. I do see the potential for that to cause a problem though if a junction or other alias was used intentionally to bypass errors with unsupported types of paths in the called procedures.
@kushalkolar https://github.com/kushalkolar is there a good reason mesmerize-core validates to disallow paths with spaces? If not, the overly strict validation is the real bug. If so, I can see if it would work to keep aliases whenever possible and only resolve them when a relative path needs to be computed.
— Reply to this email directly, view it on GitHub https://github.com/nel-lab/mesmerize-core/pull/304#issuecomment-2692779610, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACHXXRCQNJNB26EWLAHCINT2SMOZ5AVCNFSM6AAAAABXZ4GE3WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOJSG43TSNRRGA . You are receiving this because you were mentioned.Message ID: @.***> [image: ethanbb]ethanbb left a comment (nel-lab/mesmerize-core#304) https://github.com/nel-lab/mesmerize-core/pull/304#issuecomment-2692779610
@katonage https://github.com/katonage ah ok, I did make some changes to call resolve earlier on inputted paths, thinking it would be better to use the canonical form wherever possible. I do see the potential for that to cause a problem though if a junction or other alias was used intentionally to bypass errors with unsupported types of paths in the called procedures.
@kushalkolar https://github.com/kushalkolar is there a good reason mesmerize-core validates to disallow paths with spaces? If not, the overly strict validation is the real bug. If so, I can see if it would work to keep aliases whenever possible and only resolve them when a relative path needs to be computed.
— Reply to this email directly, view it on GitHub https://github.com/nel-lab/mesmerize-core/pull/304#issuecomment-2692779610, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACHXXRCQNJNB26EWLAHCINT2SMOZ5AVCNFSM6AAAAABXZ4GE3WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOJSG43TSNRRGA . You are receiving this because you were mentioned.Message ID: @.***>
OK thanks. When I get a chance I'll add a test for this and try to make the subprocess backend work with paths with spaces.
Thank you very much for your efforts!
@katonage Can you check that it works now? You should also be able to use it without the junction, if you want to.
Edit: forgot to say I added a dependency, you will have to install mslex with conda or pip.
It works for me now with both junctions and paths with spaces. Thank you!