spikeinterface
run_sorter in container fails on Windows when recording is not on OS drive (D:) — in_container_sorting path malformed
Summary
We encountered a failure when using `spikeinterface.sorters.run_sorter(..., docker_image=...)` on Windows 11 when the recording is stored on a secondary drive (D:). The sorter completes and the container stops, but loading the result fails because the path to the serialized recording inside `in_container_sorting` is malformed: it lacks a drive letter and begins with a backslash.
I have tried overriding the recording object's path kwarg with a POSIX version of the path, and it does not change the error.
This issue does not occur:
- When running the same call without Docker (in Conda)
- When the recording path is on the OS drive (C:)
System Info
- Windows 11
- SpikeInterface commit 9d6ad5b29
- Docker image: `datajoint-spikeinterface:latest` (custom-built)
- Base image: NVIDIA CUDA image (`nvidia/cuda:11.7.1`)
- SpikeInterface installed from local copy
Here is the command as I run it:
```python
sorting = run_sorter(
    sorter_name=sorter_name["name"],
    recording=recording,
    folder=output_path,
    installation_mode="folder",
    spikeinterface_folder_source=code_src[0],
    remove_existing_folder=True,
    verbose=True,
    docker_image=docker_image[0],
)
```
Error trace:

```
ValueError: D:\NP_sorted_backup\...\cleaning\sorting\in_container_sorting is not a file or a folder. It should point to either a json, pickle file or a folder that is the result of extractor.save(...)
```

Inner stack trace:

```
FileNotFoundError: [Errno 2] No such file or directory: '\\NP_sorted_backup\\...\\cleaning\\binary.json'
```
I notice that the path structure is missing the drive information.
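To illustrate why that matters, here is a minimal sketch (using only `pathlib` and a shortened stand-in for the real paths above) of what stripping the drive letter does to a Windows path: the result is still anchored, but has no drive, so the OS resolves it against whatever the current drive happens to be instead of D:.

```python
from pathlib import PureWindowsPath

# Shortened stand-in for the real recording path on D:
p = PureWindowsPath(r"D:\NP_sorted_backup\cleaning")

# Naive drive-stripping: keep everything after the first ":"
stripped = str(p)[str(p).find(":") + 1:]
print(stripped)                         # \NP_sorted_backup\cleaning
print(PureWindowsPath(stripped).drive)  # empty string: anchored, but driveless
```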
Here is the full error:
```
→ Populating SortingCompute
2025-03-27 12:03:20,918::INFO::sorting.py::Populating SortingCompute for ['Flea_2024-08-13_1'], key: {'recording_id': 1, 'doe': datetime.datetime(2024, 8, 13, 0, 0), 'attempt': 1, 'probe_id': 1, 'filtered': 1, 'clean_path': 'D:\\NP_sorted_backup\\2024-08-13_13-44-43_Flea_adaptation_implicit\\Record Node 101\\experiment1\\recording1\\continuous\\Neuropix-PXI-100.ProbeA\\cleaning', 'id': 1}
Fixed folder_path: D:/NP_sorted_backup/2024-08-13_13-44-43_Flea_adaptation_implicit/Record Node 101/experiment1/recording1/continuous/Neuropix-PXI-100.ProbeA/cleaning
True
2025-03-27 12:03:20,929::INFO::sorting.py::Saving new sorted recording to D:\NP_sorted_backup\2024-08-13_13-44-43_Flea_adaptation_implicit\Record Node 101\experiment1\recording1\continuous\Neuropix-PXI-100.ProbeA\cleaning\sorting.
Starting container
Running kilosort4 sorter inside spencer/datajoint-spikeinterface:latest
Stopping container
2025-03-27 22:02:19,578::WARNING::sorting.py::Unexpected error in the recording processing pipeline: D:\NP_sorted_backup\2024-08-13_13-44-43_Flea_adaptation_implicit\Record Node 101\experiment1\recording1\continuous\Neuropix-PXI-100.ProbeA\cleaning\sorting\in_container_sorting is not a file or a folder. It should point to either a json, pickle file or a folder that is the result of extractor.save(...)

Traceback (most recent call last):
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\sorters\runsorter.py", line 667, in run_sorter_container
    sorting = SorterClass.get_result_from_folder(folder)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\sorters\basesorter.py", line 334, in get_result_from_folder
    recording = cls.load_recording_from_folder(output_folder, with_warnings=False)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\sorters\basesorter.py", line 212, in load_recording_from_folder
    recording = load_extractor(json_file, base_folder=output_folder)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\base.py", line 1184, in load_extractor
    return BaseExtractor.load(file_or_folder_or_dict, base_folder=base_folder)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\base.py", line 769, in load
    extractor = BaseExtractor.from_dict(d, base_folder=base_folder)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\base.py", line 515, in from_dict
    extractor = _load_extractor_from_dict(dictionary)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\base.py", line 1123, in _load_extractor_from_dict
    extractor = extractor_class(**new_kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\binaryfolder.py", line 31, in __init__
    with open(folder_path / "binary.json", "r") as f:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '\\NP_sorted_backup\\2024-08-13_13-44-43_Flea_adaptation_implicit\\Record Node 101\\experiment1\\recording1\\continuous\\Neuropix-PXI-100.ProbeA\\cleaning\\binary.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\mathi\github\auxPipelines-DataJoint_Mathis\neuropixels\neuropixels_schemas\np_pipeline\schemas\sorting.py", line 146, in make
    sorting = run_sorter(
    ^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\sorters\runsorter.py", line 210, in run_sorter
    return run_sorter_container(
    ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\sorters\runsorter.py", line 670, in run_sorter_container
    sorting = load_extractor(in_container_sorting_folder)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\base.py", line 1184, in load_extractor
    return BaseExtractor.load(file_or_folder_or_dict, base_folder=base_folder)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mathi\anaconda3\envs\si_env_rolling\Lib\site-packages\spikeinterface\core\base.py", line 805, in load
    raise ValueError(error_msg)
ValueError: D:\NP_sorted_backup\2024-08-13_13-44-43_Flea_adaptation_implicit\Record Node 101\experiment1\recording1\continuous\Neuropix-PXI-100.ProbeA\cleaning\sorting\in_container_sorting is not a file or a folder. It should point to either a json, pickle file or a folder that is the result of extractor.save(...)
```
I believe we've been able to use docker from different drives, but it's been a while since we've tried. One easy question to start with... what is the exact value of output_folder that you are putting into the run_sorter?
I have the feeling that we handle this but I do not remember. Is your case: the input (raw recording) is on a different drive than the output (sorter folder)?
> I believe we've been able to use docker from different drives, but it's been a while since we've tried. One easy question to start with... what is the exact value of `output_folder` that you are putting into `run_sorter`?
The output folder is a subdirectory in the same directory as the recording object. I think it is being located fine, because the actual output of the sorting is saved there correctly in full. It is just the `in_container_sorting` folder that is missing. The output path looks like this:
```python
[36]: output_path
Out[36]: WindowsPath('D:/NP_Raw_backup/2024-12-12_13-56-16_Fossa_adaptation_implicit/Record Node 101/experiment1/recording1/continuous/Neuropix-PXI-107.ProbeD/cleaning/sorting')
```
> I have the feeling that we handle this but I do not remember. Is your case: the input (raw recording) is on a different drive than the output (sorter folder)?
No, our target directory is a subdirectory of the folder that holds the recording object.
One other question that's Windows-specific: do you have long paths enabled on your computer? Windows does this thing where they limit how deeply files can be nested, so if you have the default (unless they've changed it) you might be too nested.
I would recommend two things:
- Try a local sorter (even if you want to do a tiny test with SC2 or TDC2--those come built into spikeinterface)
- Try to reduce the nesting of your folders as a test
- bonus: read up about enabling longer paths on Windows and then check on your computer (I had to do this for my workstation :) )
I do not think that total path length is the issue here, since spikeinterface can interact with even more deeply nested files and directories when we create the sorting analyzer.
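As a quick sanity check on the path-length theory, the failing path from the traceback (with the drive letter restored) is well under the classic 260-character MAX_PATH limit. A minimal check, using the path copied from the error message:

```python
from pathlib import PureWindowsPath

MAX_PATH = 260  # classic Windows limit when long paths are not enabled

# The path from the FileNotFoundError above, with the D: drive restored
failing = PureWindowsPath(
    r"D:\NP_sorted_backup\2024-08-13_13-44-43_Flea_adaptation_implicit"
    r"\Record Node 101\experiment1\recording1\continuous"
    r"\Neuropix-PXI-100.ProbeA\cleaning\binary.json"
)
print(len(str(failing)), len(str(failing)) < MAX_PATH)
```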
I believe this is specifically an issue with these lines in `run_sorter` (link):
```python
# find input folder of recording for folder bind
rec_dict = recording.to_dict(recursive=True)
recording_input_folders = find_recording_folders(rec_dict)
if platform.system() == "Windows":
    rec_dict = windows_extractor_dict_to_unix(rec_dict)
```
I traced the `windows_extractor_dict_to_unix` function and pulled out all of the helper functions it calls. When I run them in isolation on a Windows recording object, they rewrite the path kwargs without the drive information.
If you want to test this, I wrote a script that just runs this function and then prints the paths in the recording object.
If you save this code as a Python script and run it in any environment that has spikeinterface installed, you should be able to see if the behavior is the same. Just pass it a path to a correctly formatted recording object.
```python
import platform
from pathlib import Path
from copy import deepcopy

from spikeinterface import load_extractor


# === SpikeInterface-derived helper functions ===

def path_to_unix(path):
    """Convert a Windows path to unix format"""
    path = Path(path)
    if platform.system() == "Windows":
        path = Path(str(path)[str(path).find(":") + 1 :])
    return path.as_posix()


def is_dict_extractor(d: dict) -> bool:
    """Check if a dict describes an extractor"""
    if not isinstance(d, dict):
        return False
    return all(k in d for k in ("module", "class", "version", "annotations"))


def recursive_path_modifier(d, func, target="path", copy=True):
    if copy:
        dc = deepcopy(d)
    else:
        dc = d
    if "kwargs" in dc:
        kwargs = dc["kwargs"]
        recursive_path_modifier(kwargs, func, copy=False)
        for k, v in kwargs.items():
            if isinstance(v, dict) and is_dict_extractor(v):
                recursive_path_modifier(v, func, copy=False)
            elif isinstance(v, list):
                for vl in v:
                    if isinstance(vl, dict) and is_dict_extractor(vl):
                        recursive_path_modifier(vl, func, copy=False)
        return dc
    else:
        for k, v in d.items():
            if target in k:
                if v is None:
                    continue
                if isinstance(v, (str, Path)):
                    dc[k] = func(v)
                elif isinstance(v, list):
                    dc[k] = [func(e) for e in v]
                else:
                    raise ValueError(f"{k} key for path must be str or list[str]")
        return dc


def windows_extractor_dict_to_unix(d):
    return recursive_path_modifier(d, path_to_unix, target="path", copy=True)


def print_paths_from_kwargs(d, prefix=""):
    if not isinstance(d, dict):
        return
    for k, v in d.items():
        key_path = f"{prefix}.{k}" if prefix else k
        if "path" in k.lower():
            print(f"{key_path}: {v}")
        if isinstance(v, dict):
            print_paths_from_kwargs(v, key_path)
        elif isinstance(v, list):
            for i, item in enumerate(v):
                if isinstance(item, dict):
                    print_paths_from_kwargs(item, f"{key_path}[{i}]")


# === Load and convert a recording object ===

if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        print("Usage: python test_si_paths.py <clean_path>")
        sys.exit(1)

    clean_path = sys.argv[1]
    recording = load_extractor(clean_path)
    rec_dict = recording.to_dict(recursive=True)

    print("➤ BEFORE CONVERSION:")
    print_paths_from_kwargs(rec_dict)

    if platform.system() == "Windows":
        rec_dict = windows_extractor_dict_to_unix(rec_dict)

    print("\n➤ AFTER CONVERSION:")
    print_paths_from_kwargs(rec_dict)
```
When I try this I get a result like this:
```
(si_env_rolling) C:\Users\mathi\github\auxPipelines-DataJoint_Mathis\neuropixels\neuropixels_schemas>python test_si_paths.py "D:\NP_sorted_backup\2024-08-13_13-44-43_Flea_adaptation_implicit\Record Node 101\experiment1\recording1\continuous\Neuropix-PXI-100.ProbeA\cleaning"
➤ BEFORE CONVERSION:
kwargs.folder_path: D:\NP_sorted_backup\2024-08-13_13-44-43_Flea_adaptation_implicit\Record Node 101\experiment1\recording1\continuous\Neuropix-PXI-100.ProbeA\cleaning
relative_paths: False

➤ AFTER CONVERSION:
kwargs.folder_path: /NP_sorted_backup/2024-08-13_13-44-43_Flea_adaptation_implicit/Record Node 101/experiment1/recording1/continuous/Neuropix-PXI-100.ProbeA/cleaning
relative_paths: False
```
This matches the traceback error I see when I try using run_sorter.
```
FileNotFoundError: [Errno 2] No such file or directory: '\\NP_sorted_backup\\2024-08-13_13-44-43_Flea_adaptation_implicit\\Record Node 101\\experiment1\\recording1\\continuous\\Neuropix-PXI-100.ProbeA\\cleaning\\binary.json'
```
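For what it's worth, a drive-preserving variant of the conversion is conceivable. The sketch below is purely hypothetical: the function name and the `/d/...` convention are mine (loosely modeled on how Docker Desktop and Git Bash expose Windows drives), not SpikeInterface's actual code or fix.

```python
from pathlib import PureWindowsPath


def path_to_unix_keep_drive(path):
    """Hypothetical variant: keep the drive letter as a top-level directory
    (D:\data -> /d/data) so the conversion stays reversible on the host side."""
    p = PureWindowsPath(path)
    drive = p.drive.rstrip(":").lower()  # 'D:' -> 'd'
    rest = "/".join(p.parts[1:])         # everything after the drive anchor
    return f"/{drive}/{rest}"


print(path_to_unix_keep_drive(r"D:\NP_sorted_backup\cleaning"))
# /d/NP_sorted_backup/cleaning
```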
Sorry, and if you use a local sorter, does the same issue happen? Like Sam said, it used to be impossible to use different drives, but we had fixed that (I routinely sort across the C drive and a network-mapped drive, and I've sorted across C and D), but I personally do it locally. Could you try a local sorter and see if you get the same error? Then we can narrow it down more to Docker issues, which I think it could be, because I would bet Docker doesn't have an associated drive.
I'll test your script and see what it does on my machine!
Per the issue summary above:
> This issue does not occur:
> - When running the same call without Docker (in Conda)
> - When the recording path is on the OS drive (C:)
It really does seem that this drive deletion is by design, though. It's even in the descriptive comments.
Hey all, just checking the status of this -- anything more we can provide? @zm711 cc @SpencerBowles ? 🙏🏼
> It really does seem that this drive deletion is by design, though. It's even in the descriptive comments.
Hi @SpencerBowles, this is only to map the original parent recording (with the drive info) to a unix-like path for use inside the container.
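As I understand the design being described (my reading, not SpikeInterface's actual code), the host folder gets bind-mounted into the container under the drive-stripped path, which is a perfectly valid absolute path inside a Linux container. A docker-py-style sketch of that mapping:

```python
# Sketch of the host -> container mapping as I understand it; the exact
# bind target SpikeInterface uses may differ.
host_folder = r"D:\NP_sorted_backup\cleaning"
container_folder = "/NP_sorted_backup/cleaning"  # drive stripped: valid in Linux

# docker-py style volume specification for containers.run(..., volumes=...)
volumes = {host_folder: {"bind": container_folder, "mode": "rw"}}
print(volumes)
```

The reported bug would then be about the driveless path leaking back into host-side loading, not about the in-container mapping itself.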
I'll look into this next week. Sorry about the delay on this!
@SpencerBowles can you test this in the meantime:
- keep the recording path on the D: drive
- specify the `run_sorter` `folder` on the C: drive

I think this should work and it would save you from moving large files around... you can then just save the sorting object to D: in a following call:
```python
sorting = ss.run_sorter(..., folder="C:\...", docker_image=True)
sorting.save(folder="D:\...")
```