
Error running run_sorter in py script from command prompt

Open oghenand opened this issue 2 years ago • 39 comments

Hi!

I'm trying to run Ironclust using SpikeInterface from a .py file that I run directly from the command prompt. However, I get an odd error, likely related to saving the data to the output folder. The same code runs completely fine from a Jupyter notebook; the error only appears when running the .py file. I've attached the output of my error message below. If anyone has encountered a similar error before and knows a fix, that'd be great, thanks!

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Allison\source\repos\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1.py", line 149, in <module>
    IC_si = ss.run_sorter('ironclust', recording = recording_prb, output_folder = r'C:\Users\Allison\Desktop\auto_pilot_721', verbose=False)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 143, in run_sorter
    return run_sorter_local(**common_kwargs)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 158, in run_sorter_local
    output_folder = SorterClass.initialize_folder(
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\basesorter.py", line 120, in initialize_folder
    shutil.rmtree(str(output_folder))
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 750, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 615, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 615, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 620, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 618, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Allison\\Desktop\\auto_pilot_721\\sorter_output\\ironclust_dataset\\raw.mda'
Running ironclust...
Running ironclust...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Allison\source\repos\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1.py", line 149, in <module>
    IC_si = ss.run_sorter('ironclust', recording = recording_prb, output_folder = r'C:\Users\Allison\Desktop\auto_pilot_721', verbose=False)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 143, in run_sorter
    return run_sorter_local(**common_kwargs)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 158, in run_sorter_local
    output_folder = SorterClass.initialize_folder(
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\basesorter.py", line 120, in initialize_folder
    shutil.rmtree(str(output_folder))
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 750, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 615, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 615, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 620, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 618, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Allison\\Desktop\\auto_pilot_721\\sorter_output\\ironclust_dataset\\raw.mda'
Setting IRONCLUST_PATH environment variable for subprocess calls to: C:\Users\Allison\Desktop\IntanToNWB-main\ironclust
C:\Users\Allison\Desktop\burgers_auto.rhd
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\ProgramData\miniconda3\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\ProgramData\miniconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Allison\source\repos\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1.py", line 149, in <module>
    IC_si = ss.run_sorter('ironclust', recording = recording_prb, output_folder = r'C:\Users\Allison\Desktop\auto_pilot_721', verbose=False)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 143, in run_sorter
    return run_sorter_local(**common_kwargs)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 158, in run_sorter_local
    output_folder = SorterClass.initialize_folder(
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\basesorter.py", line 120, in initialize_folder
    shutil.rmtree(str(output_folder))
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 750, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 615, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 615, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 620, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\ProgramData\miniconda3\lib\shutil.py", line 618, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\Allison\\Desktop\\auto_pilot_721\\sorter_output\\ironclust_dataset\\raw.mda'
Channel ids: <bound method BaseRecordingSnippets.get_channel_ids of IntanRecordingExtractor: 128 channels - 1 segments - 30.0kHz - 60.002s
  file_path: C:\Users\Allison\Desktop\burgers_auto.rhd>
Sampling frequency: 30000
Number of channels: 128
Running ironclust...
Traceback (most recent call last):
  File "C:\Users\Allison\source\repos\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1\Spike_Sorting_Automation_1.py", line 149, in <module>
    IC_si = ss.run_sorter('ironclust', recording = recording_prb, output_folder = r'C:\Users\Allison\Desktop\auto_pilot_721', verbose=False)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 143, in run_sorter
    return run_sorter_local(**common_kwargs)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\runsorter.py", line 162, in run_sorter_local
    SorterClass.setup_recording(recording, output_folder, verbose=verbose)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\basesorter.py", line 196, in setup_recording
    cls._setup_recording(recording, sorter_output_folder, sorter_params, verbose)
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\sorters\external\ironclust.py", line 169, in _setup_recording
    MdaRecordingExtractor.write_recording(recording=recording, save_path=str(dataset_dir), verbose=False,
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\extractors\mdaextractors.py", line 118, in write_recording
    write_binary_recording(recording, file_paths=save_file_path, dtype=dtype,
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\core\core_tools.py", line 280, in write_binary_recording
    executor.run()
  File "C:\Users\Allison\AppData\Roaming\Python\Python310\site-packages\spikeinterface\core\job_tools.py", line 364, in run
    for res in results:
  File "C:\ProgramData\miniconda3\lib\concurrent\futures\process.py", line 570, in _chain_from_iterable_of_lists
    for element in iterable:
  File "C:\ProgramData\miniconda3\lib\concurrent\futures\_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "C:\ProgramData\miniconda3\lib\concurrent\futures\_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "C:\ProgramData\miniconda3\lib\concurrent\futures\_base.py", line 458, in result
    return self.__get_result()
  File "C:\ProgramData\miniconda3\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

oghenand avatar Jul 21 '23 23:07 oghenand
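The spawn.py frames in the traceback above show Windows' multiprocessing machinery re-importing the script in each worker process; a .py script run from the command prompt on Windows therefore generally needs a __main__ guard around the top-level sorting code. A minimal sketch of that structure (the worker function below is a stand-in for illustration, not SpikeInterface code):

```python
import multiprocessing as mp

def square(x):
    # stand-in for the per-chunk work that a sorter parallelizes
    return x * x

def main():
    # a run_sorter(...) call would live here, so that spawned workers
    # re-importing this file do not re-execute it at module level
    with mp.Pool(2) as pool:
        return pool.map(square, range(4))

if __name__ == "__main__":
    print(main())  # [0, 1, 4, 9]
```

On Linux/macOS (fork start method) the guard often isn't strictly needed, which is one reason the same script can behave differently in a notebook versus a Windows command prompt.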

Which recording are you using?

A solution that works for some of our users is to convert the recording with our binary extractor first, which should work better with multiprocessing.

As @alejoe91 pointed out in another issue, this is a known problem on Windows:

https://stackoverflow.com/questions/27215462/permissionerror-winerror-32-the-process-cannot-access-the-file-because-it-is

h-mayorquin avatar Jul 25 '23 08:07 h-mayorquin

Hi!

How do you transform to the binary extractor? I'm using an Intan .rhd recording for the analysis, by the way.

I read the issue you linked, and I can try updating the package if that will help. Let me know if there's anything else you'd suggest I do!

I found a work-around running the code in an ipynb file, but getting it to run in a py file would be ideal!

Thanks!

oghenand avatar Jul 25 '23 15:07 oghenand

From your recording you just do

cached_recording = your_recording.save()

This will create a copy of your recording saved on your computer, so some extra disk space will be used. But now that I think twice about it, I am less sure this will solve your problem, since I don't think Intan should have any problems.

Can you try to run your script with the following recording, just to see if we can reproduce your error with something that we can also run:

from spikeinterface.core.generate import generate_lazy_recording


recording = generate_lazy_recording(full_traces_size_GiB=1.0)
recording

Please share your script, and instead of using your recording use the one I just provided. If the error is reproduced, then we can debug more easily on our side.

h-mayorquin avatar Jul 26 '23 06:07 h-mayorquin

Hello,

I have the same issue, or at least I think it is the same, when running spykingcircus2 on the dev version ('0.99.0.dev0'), with anaconda on Windows. I do not have the issue with the pip installation ('0.98.2'). Did you find a solution? I've tried to save the recording but it did not make any difference. Here is the error message (final error translated from French):

SpikeSortingError: Spike sorting error trace:
Traceback (most recent call last):
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\site-packages\spikeinterface-0.99.0.dev0-py3.11.egg\spikeinterface\sorters\basesorter.py", line 234, in run_from_folder
    SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\site-packages\spikeinterface-0.99.0.dev0-py3.11.egg\spikeinterface\sorters\internal\spyking_circus2.py", line 112, in _run_from_folder
    labels, peak_labels = find_cluster_from_peaks(
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\site-packages\spikeinterface-0.99.0.dev0-py3.11.egg\spikeinterface\sortingcomponents\clustering\main.py", line 41, in find_cluster_from_peaks
    labels, peak_labels = method_class.main_function(recording, peaks, params)
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\site-packages\spikeinterface-0.99.0.dev0-py3.11.egg\spikeinterface\sortingcomponents\clustering\random_projections.py", line 242, in main_function
    shutil.rmtree(tmp_folder / "sorting")
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\shutil.py", line 759, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\shutil.py", line 622, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\Users\katia.lehongre\AppData\Local\anaconda3\envs\si_dev\Lib\shutil.py", line 620, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: '\\l2export\iss02.epimicro\patients\shared\Virginie\SLI\SLI_analysis_spikes\data\SI\02420\p1\mOcp\SI_pos_th6_rec\0\spykingcircus2\sorter_output\clustering\sorting\spikes.npy'

Thanks!

KLehongre avatar Aug 24 '23 09:08 KLehongre

Not sure this is the same bug, but maybe. Yours seems to be very much related to internal machinery of SC2, I'll try to have a look quickly

yger avatar Aug 24 '23 09:08 yger

I have encountered the same problem running spyking circus 2 on windows (running 0.99.0.dev0). I have narrowed it down to this section of code in the random_projections.py file (lines 196 - 213):

elif cleaning_method == "matching":
    # create a tmp folder
    if params["tmp_folder"] is None:
        name = "".join(random.choices(string.ascii_uppercase + string.digits, k=8))
        tmp_folder = get_global_tmp_folder() / name
    else:
        tmp_folder = Path(params["tmp_folder"])

    if params["shared_memory"]:
        waveform_folder = None
        mode = "memory"
    else:
        waveform_folder = tmp_folder / "waveforms"
        mode = "folder"

    sorting_folder = tmp_folder / "sorting"
    sorting = NumpySorting.from_times_labels(spikes["sample_index"], spikes["unit_index"], fs)
    sorting = sorting.save(folder=sorting_folder)
    we = extract_waveforms(
                recording,
                sorting,
                waveform_folder,
                ms_before=params["ms_before"],
                ms_after=params["ms_after"],
                **params["job_kwargs"],
                return_scaled=False,
                mode=mode,
            )

as well as

labels, peak_labels = remove_duplicates_via_matching(
                we, noise_levels, peak_labels, job_kwargs=cleaning_matching_params, **cleaning_params
            )

In my specific scenario I have a tmp_folder location already set and am not using shared memory. Thus it saves a spikes.npy file in base_folder/sorter_output/clustering/sorting. Later on in the code there is a section that tries to remove said folder:

if params["tmp_folder"] is None:
    shutil.rmtree(tmp_folder)
else:
    
    shutil.rmtree(tmp_folder / "waveforms")
    # this line causes the crash because the spikes.npy file is still referenced by something.
    shutil.rmtree(tmp_folder / "sorting") 

It is apparent that somewhere the spikes.npy file is opened for streaming, or it isn't closed completely. I was able to delete the directories before calling remove_duplicates_via_matching() by manually calling del on the sorting and we objects. However, I was unable to delete them after the call. I thus believe that somewhere in remove_duplicates_via_matching() another reference to the file is created, though I was unable to find it. I tried both running the garbage collector to cut any loose ends and manually deleting all local variables (even the ones that are actually returned by the function) before removing the directories, but this did not have the desired effect and still ended with a WinError 32 because of spikes.npy. I also don't see any way for the user to work around this, since the spikes.npy file is created regardless of any settings the user can make, and, at least for SpyKING CIRCUS, the folders have to be deleted because folders are created in the same location later on.

LeMuellerGuy avatar Sep 26 '23 14:09 LeMuellerGuy
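The failure mode described above can be reproduced with the standard library alone: on Windows, an open memory mapping on a file makes os.unlink (and hence shutil.rmtree) fail with WinError 32 until every handle is released. A minimal sketch using a throwaway temp file, not SpikeInterface's actual spikes.npy handling:

```python
import mmap
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "spikes.npy")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

f = open(path, "rb")
view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
# While `view` and `f` are open, shutil.rmtree(folder) raises
# PermissionError (WinError 32) on Windows; POSIX allows the unlink.
view.close()
f.close()
shutil.rmtree(folder)  # succeeds once every handle is released
```

This is why a lingering reference to a memmapped array anywhere in the call chain is enough to make the later rmtree crash on Windows only.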

Thanks all for reporting this. Definitely we need to handle memmapped files very carefully after use, especially when we want to delete them.

samuelgarcia avatar Sep 27 '23 06:09 samuelgarcia

I also suspect that some of these problems come from how we decided to handle the creation and deletion of temporary files.

h-mayorquin avatar Sep 28 '23 06:09 h-mayorquin

Thanks for the feedback. Indeed, there is something fishy with the folders here; I cannot understand why they can not be removed properly. One way to deal with that is to add ignore_errors=True to rmtree(), but that only partially solves the problem.

yger avatar Sep 29 '23 11:09 yger

https://docs.python.org/3/library/shutil.html#rmtree-example

The shutil docs say that on Windows, files with their read-only bit set can cause this error, so to override it you have to do the following (note that the onexc parameter was added in Python 3.12; older versions use onerror instead), though it still might not work:

Quote from their docs

This example shows how to remove a directory tree on Windows where some of the files have their read-only bit set. It uses the onexc callback to clear the readonly bit and reattempt the remove. Any subsequent failure will propagate.


import os, stat
import shutil

def remove_readonly(func, path, _):
    "Clear the readonly bit and reattempt the removal"
    os.chmod(path, stat.S_IWRITE)
    func(path)

shutil.rmtree(directory, onexc=remove_readonly)

I really think this is just a fundamental difference between Windows and Unix-like systems with Windows making it very difficult to make and delete temp files.

zm711 avatar Oct 06 '23 13:10 zm711
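Since the onexc parameter quoted above only exists on Python 3.12+ (earlier versions spell it onerror, whose callback receives an exc_info tuple as the third argument), a version-compatible wrapper might look like the sketch below. rmtree_force is a hypothetical helper name, not a SpikeInterface API:

```python
import os
import shutil
import stat
import sys
import tempfile

def _clear_readonly(func, path, _):
    """Clear the read-only bit and retry the failed operation."""
    os.chmod(path, stat.S_IWRITE)
    func(path)

def rmtree_force(path):
    # onexc was added in Python 3.12; onerror is the older spelling.
    # The callback ignores its third argument, so it works with both.
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=_clear_readonly)
    else:
        shutil.rmtree(path, onerror=_clear_readonly)

d = tempfile.mkdtemp()
locked = os.path.join(d, "readonly.txt")
open(locked, "w").close()
os.chmod(locked, stat.S_IREAD)  # simulate a read-only file
rmtree_force(d)
```

Note this only addresses the read-only case; it does not help with WinError 32, which is about the file being held open by another handle or process.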

I'm wondering if anyone has tested re-running their analyses. SpikeInterface 0.99.1 and 0.100.0dev (especially 0.100.0dev) have had some additions and fixes to make working on Windows better. It would be great to know if anyone has tried the package since these updates, so we can see whether real-world cases have improved.

zm711 avatar Dec 19 '23 21:12 zm711

I reran spykingcircus2 on the 0.100.0.dev version, but the issue is still present.

LeMuellerGuy avatar Jan 05 '24 21:01 LeMuellerGuy

Can you copy the exact error that is appearing below?

zm711 avatar Jan 05 '24 21:01 zm711

On line 126 in basesorter.py: Translation of error message: The process can't access the file because it is being used by another process

Exception has occurred: PermissionError
[WinError 32] Der Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen Prozess verwendet wird: 'C:\\Users\\lennart\\sciebo\\Masterarbeit\\CodeProjects\\spikeinterface\\spykingcircus2_output\\sorter_output\\sorting\\spikes.npy'
  File "C:\Users\lennart\spikeinterface\src\spikeinterface\sorters\basesorter.py", line 126, in initialize_folder
    shutil.rmtree(str(output_folder))
  File "C:\Users\lennart\spikeinterface\src\spikeinterface\sorters\runsorter.py", line 171, in run_sorter_local
    output_folder = SorterClass.initialize_folder(recording, output_folder, verbose, remove_existing_folder)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\lennart\spikeinterface\src\spikeinterface\sorters\runsorter.py", line 148, in run_sorter
    return run_sorter_local(**common_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\lennart\sciebo\Masterarbeit\CodeProjects\spikeinterface\AxonTracking\concatenateRecordings.py", line 52, in <module>
PermissionError: [WinError 32] Der Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen Prozess verwendet wird: 
'C:\\Users\\lennart\\sciebo\\Masterarbeit\\CodeProjects\\spikeinterface\\spykingcircus2_output\\sorter_output\\sorting\\spikes.npy'

Edit: I guess that means that the issue has kind of moved to a different location. At least when running spykingcircus

LeMuellerGuy avatar Jan 05 '24 21:01 LeMuellerGuy

Thanks @LeMuellerGuy . Yeah I think the base issue affecting all sorters has been fixed. Would you be willing to test any other sorter so we can confirm this is isolated to spykingcircus2 now?

zm711 avatar Jan 05 '24 21:01 zm711

I'll have a look and discuss that with @samuelgarcia next week. Since I am not using windows, I never encountered these issues....

yger avatar Jan 05 '24 21:01 yger

@yger, I'm on Windows currently, so if you need me to run any tests for SC2 feel free to shoot me a Slack message (but I know Sam is dual-booting currently, so he can also help with all the tests). These permission errors come up all the time because Windows tries to delete files right away with shutil rather than putting them in a queue to be deleted later. Sam fixed a bunch of these using some weakrefs and by tweaking some memory mapping of files (he can explain better), but I think SC2 is using shutil.rmtree internally, right?

zm711 avatar Jan 05 '24 21:01 zm711

I will have a deeper look at things tomorrow since it is quite late here already. I'll test some other sorters too.

LeMuellerGuy avatar Jan 05 '24 21:01 LeMuellerGuy

Yes, there are some internal calls to shutil.rmtree in SC2... I could try to get rid of them, maybe; I'll also have a look tomorrow. If you want to have a go, this is now easy with the GT recordings generated on the fly. Let me know if the following lines crash on Windows:

rec, gt_sorting = si.generate_ground_truth_recording(num_channels=32)
sorting = si.run_sorter('spykingcircus2', rec, remove_existing_folder=True)

yger avatar Jan 05 '24 21:01 yger

So on the first try it runs fine without a problem.

sorting = si.run_sorter('spykingcircus2', rec, remove_existing_folder=True, output_folder='test-yger', verbose=True)
detect peaks using locally_exclusive with n_jobs = 8 and chunk_size = 25000
detect peaks using locally_exclusive:   0%|          | 0/10 [00:00<?, ?it/s]
We found 1239 peaks in total
We kept 1239 peaks for clustering
extracting features with n_jobs = 8 and chunk_size = 25000
extracting features:   0%|          | 0/10 [00:00<?, ?it/s]
We found 5 raw clusters, starting to clean with matching...
extract waveforms shared_memory multi buffer with n_jobs = 8 and chunk_size = 25000
extract waveforms shared_memory multi buffer:   0%|          | 0/10 [00:00<?, ?it/s]
extract waveforms shared_memory multi buffer with n_jobs = 8 and chunk_size = 25000
extract waveforms shared_memory multi buffer:   0%|          | 0/10 [00:00<?, ?it/s]
We kept 5 non-duplicated clusters...
extract waveforms shared_memory multi buffer with n_jobs = 8 and chunk_size = 25000
extract waveforms shared_memory multi buffer:   0%|          | 0/10 [00:00<?, ?it/s]
extract waveforms shared_memory multi buffer with n_jobs = 8 and chunk_size = 25000
extract waveforms shared_memory multi buffer:   0%|          | 0/10 [00:00<?, ?it/s]
find spikes (circus-omp-svd) with n_jobs = 8 and chunk_size = 2500
find spikes (circus-omp-svd):   0%|          | 0/100 [00:00<?, ?it/s]
We found 1063 spikes
spykingcircus2 run time 107.89s

If I then rerun I get the same error as above:


  Cell In[160], line 1
    sorting = si.run_sorter('spykingcircus2', rec, remove_existing_folder=True, output_folder='test-yger', verbose=True)

  File ~\Documents\GitHub\spikeinterface\src\spikeinterface\sorters\runsorter.py:148 in run_sorter
    return run_sorter_local(**common_kwargs)

  File ~\Documents\GitHub\spikeinterface\src\spikeinterface\sorters\runsorter.py:171 in run_sorter_local
    output_folder = SorterClass.initialize_folder(recording, output_folder, verbose, remove_existing_folder)

  File ~\Documents\GitHub\spikeinterface\src\spikeinterface\sorters\basesorter.py:126 in initialize_folder
    shutil.rmtree(str(output_folder))

  File ~\anaconda3\envs\spykes\Lib\shutil.py:759 in rmtree
    return _rmtree_unsafe(path, onerror)

  File ~\anaconda3\envs\spykes\Lib\shutil.py:617 in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)

  File ~\anaconda3\envs\spykes\Lib\shutil.py:617 in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)

  File ~\anaconda3\envs\spykes\Lib\shutil.py:622 in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())

  File ~\anaconda3\envs\spykes\Lib\shutil.py:620 in _rmtree_unsafe
    os.unlink(fullname)

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\ZM\\test-yger\\sorter_output\\sorting\\spikes.npy'

zm711 avatar Jan 05 '24 21:01 zm711

I reran the same code from yesterday evening and it (thankfully) produced the same result as @zm711: it worked the first time and crashed on folder initialization the second time. I also tried Tridesclous2 plugged into the same code, which produced the same error (trace below); it also crashed on the second run, when trying to overwrite the existing folder. I assume that external sorters will suffer from the same problem.

Exception has occurred: PermissionError
[WinError 32] Der Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen Prozess verwendet wird: 'C:\\Users\\lennart\\sciebo\\Masterarbeit\\CodeProjects\\spikeinterface\\tridesclous2_output\\sorter_output\\sorting\\spikes.npy'
  File "C:\Users\lennart\spikeinterface\src\spikeinterface\sorters\basesorter.py", line 126, in initialize_folder
    shutil.rmtree(str(output_folder))
  File "C:\Users\lennart\spikeinterface\src\spikeinterface\sorters\runsorter.py", line 171, in run_sorter_local
    output_folder = SorterClass.initialize_folder(recording, output_folder, verbose, remove_existing_folder)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\lennart\spikeinterface\src\spikeinterface\sorters\runsorter.py", line 148, in run_sorter
    return run_sorter_local(**common_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\lennart\sciebo\Masterarbeit\CodeProjects\spikeinterface\AxonTracking\concatenateRecordings.py", line 55, in <module>
    multisorting = run_sorter('tridesclous2', concated, remove_existing_folder=True, verbose = True, **params)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [WinError 32] Der Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen Prozess verwendet wird: 'C:\\Users\\lennart\\sciebo\\Masterarbeit\\CodeProjects\\spikeinterface\\tridesclous2_output\\sorter_output\\sorting\\spikes.npy'

LeMuellerGuy avatar Jan 06 '24 15:01 LeMuellerGuy

Hi all. Thanks a lot for these tests. I guess that the spikes vector from the first run is not copied into memory and is still a memmap, which is a bad idea. I will make some tests in this PR: #2267. I now have Windows fully working!!

samuelgarcia avatar Jan 09 '24 09:01 samuelgarcia
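The fix described above (materializing the spikes vector in RAM instead of keeping a live memmap) can be illustrated with numpy. This is a sketch of the general technique, assuming numpy is installed; it is not the actual code from the PR:

```python
import os
import shutil
import tempfile

import numpy as np

folder = tempfile.mkdtemp()
path = os.path.join(folder, "spikes.npy")
np.save(path, np.arange(100, dtype="int64"))

lazy = np.load(path, mmap_mode="r")  # memmap: keeps the file handle open
spikes = np.array(lazy)              # real in-memory copy, no file handle
del lazy                             # drop the mapping before any cleanup
shutil.rmtree(folder)                # safe now, even on Windows
```

Once the array is a plain in-memory copy, no reference to it can block the later rmtree.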

I guess there is still the issue, or a related one: trying to run built-in spike sorters gives an error related to shared memory:

nbytes = int(np.prod(shape) * dtype.itemsize)
Error running tridesclous2
Traceback (most recent call last):
  File "D:\Users\ggaugain\Documents\postdoc\neuron_recordings\test_spikeinterface.py", line 104, in <module>
    sorting = si.run_sorter(sorter_name=sname, recording=recording_saved, output_folder=base_folder / sname, remove_existing_folder=True, verbose=True, **sorter_params)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\runsorter.py", line 174, in run_sorter
    return run_sorter_local(**common_kwargs)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\runsorter.py", line 224, in run_sorter_local
    SorterClass.run_from_folder(output_folder, raise_error, verbose)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\basesorter.py", line 293, in run_from_folder
    raise SpikeSortingError(
spikeinterface.sorters.utils.misc.SpikeSortingError: Spike sorting error trace:
Traceback (most recent call last):
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\basesorter.py", line 258, in run_from_folder
    SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\internal\tridesclous2.py", line 123, in _run_from_folder
    recording = cache_preprocessing(
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sortingcomponents\tools.py", line 92, in cache_preprocessing
    recording = recording.save_to_memory(format="memory", shared=True, **job_kwargs)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\core\base.py", line 853, in save_to_memory
    cached = self._save(format="memory", sharedmem=sharedmem, **save_kwargs)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\core\baserecording.py", line 490, in _save
    cached = SharedMemoryRecording.from_recording(self, **job_kwargs)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\core\numpyextractors.py", line 215, in from_recording
    traces_list, shms = write_memory_recording(source_recording, buffer_type="sharedmem", **job_kwargs)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\core\recording_tools.py", line 322, in write_memory_recording
    arr, shm = make_shared_array(shape, dtype)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\spikeinterface\core\core_tools.py", line 167, in make_shared_array
    shm = SharedMemory(name=None, create=True, size=nbytes)
  File "D:\Users\ggaugain\Anaconda3\envs\si_env\Lib\multiprocessing\shared_memory.py", line 77, in __init__
    raise ValueError("'size' must be a positive integer")
ValueError: 'size' must be a positive integer

GabrielGaugain avatar Feb 16 '24 08:02 GabrielGaugain

@GabrielGaugain,

which version of spikeinterface are you using right now? @samuelgarcia and @yger have both done work on tdc2 and sc2 since this issue was first opened. So if you haven't updated to the most recent version:

```
pip install -U spikeinterface
```

I would start with that and see if it fixes the problem. If you've already done that, let us know so they can see what is happening.

zm711 avatar Feb 16 '24 11:02 zm711

@zm711 I have the latest version of spikeinterface, installed with that same command, but I still get the same issue. Among tridesclous2, spykingcircus2, and mountainsort5, only tridesclous works.

GabrielGaugain avatar Mar 22 '24 15:03 GabrielGaugain

Howdy Gabriel!

So TDC2 and SC2 have a caching step that fails on Windows. I have opened a discussion about this in #2164; could you post your specific error there? @samuelgarcia and @yger will need to see the exact errors so they can work on fixing this for Windows users. I think we are close.

For Mountainsort5 there is an issue with the cleanup of files, which is a known problem on Windows. We are currently deciding how we want to fix it in #2607. I can give you a work-around if you're willing to install spikeinterface from source and are using at least Python 3.10.

zm711 avatar Mar 22 '24 15:03 zm711

What if you add the option `cache_preprocessing={'mode': None}` when launching SC2? Does it work? I've encountered similar issues on Windows machines with the automated cache.

yger avatar Mar 22 '24 15:03 yger

Same issue. It seems to go deeper: with only one process (`n_jobs=1`), SC2 gives a different error:

```
File "d:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\scipy\signal\_savitzky_golay.py", line 101, in savgol_coeffs
    raise ValueError("polyorder must be less than window_length.")
ValueError: polyorder must be less than window_length.
```

Same for TDC2, with the following error:

```
File "d:\Users\ggaugain\Anaconda3\envs\si_env\Lib\site-packages\sklearn\utils\validation.py", line 969, in check_array
    raise ValueError(
ValueError: Found array with 0 sample(s) (shape=(0, 20)) while a minimum of 1 is required by TruncatedSVD.
```
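(For reference, that second message is simply what scikit-learn raises whenever an empty array reaches it, so zero samples, presumably zero detected peaks, survived the earlier steps. A minimal reproduction, with a made-up `waveforms` array standing in for whatever TDC2 passes to the SVD:)

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

waveforms = np.empty((0, 20))  # shape=(0, 20): no peaks made it this far
try:
    TruncatedSVD(n_components=5).fit(waveforms)
except ValueError as err:
    print(err)  # "Found array with 0 sample(s) ... required by TruncatedSVD."
```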

Do you think this might be specific to my data?

GabrielGaugain avatar Mar 22 '24 15:03 GabrielGaugain

@GabrielGaugain as far as I know, the polyorder error you got (in what I assume is Spyking Circus 2) is related to the Savitzky-Golay filter that was used in it. As specified in https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html, the window length must be less than or equal to the number of data points, and the polynomial order must be less than the window length. Because Spyking Circus sets the window length in milliseconds, the resulting length in samples is tied to your sampling rate. As far as I know, though, the whole savgol part has been removed in the latest version (correct me if I'm wrong here).
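To illustrate (the `smooth` helper and the 0.5 ms default are made up, not SC2's actual code): a window specified in milliseconds must be converted to samples, so whether it satisfies scipy's `polyorder < window_length` check depends entirely on the sampling rate.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth(x, fs_hz, window_ms=0.5, polyorder=3):
    # Convert the ms window to a sample count, rounded to an odd number.
    window_length = int(window_ms * fs_hz / 1000) | 1
    if window_length <= polyorder:
        raise ValueError(
            f"{window_ms} ms is only {window_length} samples at {fs_hz} Hz; "
            f"the window must be longer than polyorder={polyorder}"
        )
    return savgol_filter(x, window_length, polyorder)

x = np.random.randn(3000)
smooth(x, fs_hz=30_000)   # 15-sample window: fine
# smooth(x, fs_hz=5_000)  # only 3 samples: window too short for polyorder=3
```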

LeMuellerGuy avatar Mar 22 '24 16:03 LeMuellerGuy

Quoting the shared-memory traceback above:

> nbytes = int(np.prod(shape) * dtype.itemsize)
> [...]
> ValueError: 'size' must be a positive integer

This error I actually know how to solve. It is caused by that np.prod.

I will make a patch.
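For context, a minimal sketch of the suspected overflow. The shape is made up, and the `dtype=np.int32` accumulator is forced explicitly here to mimic numpy's default platform integer on Windows:

```python
import numpy as np

shape = (120_000_000, 64)                 # hypothetical: ~66 min of 64-channel data at 30 kHz
itemsize = np.dtype("float32").itemsize   # 4 bytes

# With a 32-bit accumulator the element count wraps around, so the computed
# buffer size comes out negative and SharedMemory(size=...) rejects it.
bad = int(np.prod(shape, dtype=np.int32)) * itemsize   # negative after wrap-around
good = int(np.prod(shape, dtype=np.int64)) * itemsize  # 30_720_000_000 bytes

print(bad, good)
```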

h-mayorquin avatar Mar 22 '24 18:03 h-mayorquin