There might be a bug during the folder creation process
Hello, whenever I use the run_sorter() method with the options 'mountainsort5', 'spykingcircus2', 'tridesclous2', or 'herdingspikes', I consistently encounter the error message 'Folder xxx_output already exists'. I noticed that the path parameter in run_sorter() is set to the default value, and there are no existing files in the specified path. It appears that there might be a bug occurring during the folder creation process. Would you mind taking a look at this issue and possibly resolving it? Thank you for your assistance!
Hi @daisy-zsn
Could you provide a script that reproduces the issue? I personally haven't encountered this.
What do you mean by "specified path"? Is it the current folder? By default, run_sorter will create a folder `{sorter_name}_output` in the current folder, unless you specify a different output_folder. So if you run the same script twice, it will (correctly) trigger this error. You can use `remove_existing_folder=True` to overwrite the output folder.
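For illustration, the folder check behaves roughly like the sketch below. This is a hypothetical re-implementation for clarity, not SpikeInterface's actual code, and `prepare_output_folder` is a made-up name:

```python
import shutil
from pathlib import Path

def prepare_output_folder(sorter_name, folder=None, remove_existing_folder=False):
    """Hypothetical sketch of the folder check run_sorter performs."""
    # Default output folder: {sorter_name}_output in the current directory.
    folder = Path(folder) if folder is not None else Path.cwd() / f"{sorter_name}_output"
    if folder.exists():
        if not remove_existing_folder:
            raise FileExistsError(f"Folder {folder} already exists")
        # Overwrite was requested: wipe the stale output first.
        shutil.rmtree(folder)
    folder.mkdir(parents=True)
    return folder
```

Running it twice with the defaults reproduces the error from this issue; passing `remove_existing_folder=True` clears the old folder instead.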
I have a theory, but I don't know much about these sorters, so you'll have to tell me if it makes sense @alejoe91 and @samuelgarcia .
If you call recording.save(format="zarr", ...) (or similar) with n_jobs > 1, and your default multiprocessing context is not "fork", then the child processes don't inherit the parent's memory directly. They start a fresh Python interpreter, which means the module you're running is re-imported. If that module doesn't have an if __name__ == "__main__": guard, the entire script, including recording.save(), executes again in each child process, which causes the assertion error (because the output directory was already created by the parent/main process).
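You can see this re-import behavior with a minimal stdlib-only demo, no SpikeInterface involved. The top-level print below runs once in the parent and again in every spawned worker, because each worker re-imports the main module (under the name `__mp_main__`):

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# With the "spawn" (or "forkserver") start method, every worker process
# re-imports the main module, so module-level code runs again in each child.
script = textwrap.dedent("""
    import multiprocessing as mp

    # Module-level code: runs in the parent AND in every spawned worker.
    print(f"top-level code ran, __name__={__name__}", flush=True)

    def work(x):
        return x + 1

    if __name__ == "__main__":
        ctx = mp.get_context("spawn")
        with ctx.Pool(2) as pool:
            print(pool.map(work, [1, 2]), flush=True)
""")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.py"
    path.write_text(script)
    out = subprocess.run(
        [sys.executable, str(path)],
        capture_output=True, text=True, check=True,
    ).stdout

print(out)
```

The "top-level code ran" line appears multiple times: once for the parent (`__name__ == "__main__"`) and once per worker (`__name__ == "__mp_main__"`). Without the guard, those re-imported children would re-run the whole pipeline, which in the SpikeInterface case means re-running recording.save() and hitting the already-created output folder.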
I just ran into this because I upgraded from Python 3.13 to Python 3.14. On my platform (Linux), Python 3.14 changed the default multiprocessing context from "fork" to "forkserver", uncovering this race condition. So all of a sudden, the same script with the same version of zarr (v2.18.7) and the same version of spikeinterface (v0.103.3) started producing this assertion error.
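You can check which default your platform and Python version use with the stdlib alone:

```python
import multiprocessing as mp

# Report the default start method for this interpreter.
# On Linux, Python <= 3.13 defaults to "fork"; 3.14 moved to "forkserver".
# macOS and Windows have defaulted to "spawn" for a long time.
method = mp.get_start_method()
print(method)
```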
Thankfully it's an easy fix: use if __name__ == "__main__": and wrap your code in a function, for example:
```python
import spikeinterface.full as si  # read_spikeglx lives in the extractors module

def main():
    recording = si.read_spikeglx(...)
    # Execute pipeline
    ...
    recording.save(format="zarr", ...)

if __name__ == "__main__":
    main()
```
Alternatively, you could skip the refactor and set mp_context="fork" in the call to save, but that's not recommended. You should use the if __name__ == "__main__": guard regardless of which mp_context you use.
It was a nightmare to figure out and debug. I wouldn't be surprised if you see more of these issues as adoption of 3.14 picks up.
I just want to clarify that this is NOT an issue with spikeinterface, it is just a user error that is really easy to make if you're copy-pasting code snippets from a tutorial or how-to guide into a my_pipeline.py and executing that.
We should definitely add this to the docs!!!
Yes, agreed. We should be very clear about using the if __name__ == "__main__": guard in scripts that use spikeinterface, because of the multiprocessing.