spikeinterface
spikeinterface copied to clipboard
Remove separate default job_kwarg `n_jobs` for sorters
Fixes #2696
Right now we aren't respecting the global n_jobs that user sets so the only way to override use all cores is to specifically pass n_jobs to the sorter. This is unintuitive if we allow the user to set this globally except in the case of sorters where it has to be set locally.
The current global default appears to be n_jobs=1 so we discussed whether that should be changed for people who don't know to set n_jobs so let me know if we want to do that in this PR too.
The current global default appears to be
n_jobs=1so we discussed whether that should be changed for people who don't know to setn_jobsso let me know if we want to do that in this PR too.
I vote for n_jobs=0.8 :)
@samuelgarcia @h-mayorquin @chrishalcrow what do you guys think?
I guess the one concern with changing to a default of multiprocessing is that we may we see more issues whereas with opt-in not everyone is using those. But if we decide to do a default then I think 0.7 or 0.8 seems safe enough!
I think that 0.8 cores sounds good as a default. My intuition is that we should leave one physical core free as most people multitask x D.
@samuelgarcia now that you're back opinions on n_jobs default? I think we hard code a default for now as we wait for a jobs_kwargs optimizer in the future.
If this is for the sorter 0.8 sounds OK (because it is only to write this binary).
But we need to have in mind that on some computing nodes you can have many many cores that would lead to an enormous ram demand.
So for the main n_jobs I prefer to keep 1 so it force the user to think about it to improve speed.
I don't like changing the n_jobs without the user knowing. So if the desired global n_jobs would be n_jobs=1 then I would prefer just to feed that in to sorters as well and the end user can decide to switch to multiprocessing for sorters. But having n_jobs >1 seems like it might take more debate eh?
Based on #2826 and #2820. I think even for sorters we should default to n_jobs=1 because some computers aren't working with multiprocessing at all. Currently not all sorters accept n_jobs as a parameter so maybe we add that to base sorting instead?
I agree. Maybe a warning at the fix_job_kwargs level that "warns" that using n_jobs=1 is not optimal?
I am slightly against the warning but I can't think on a better way to let them know : /
Two suggestions if we go with the warning:
- It should not be in a very general place as some sorters don't accept that and then it becomes and irrelevant warning.
- There should be a way to deactivate it that does not require you to do multiprocessing. As in, yes, I am aware but no, I don't want to use multiprocesing option.
@samuelgarcia suggested a global flag global_kwargs_set that by session is set to True if the user calls the set_global_job_kwargs. I think it's a good idea :)
@zm711 I made the suggested changes.
Now the fix_job_kwargs warns if n_jobs=1 and it hasn't been set (either globally, or locally).
Also extended a bit the job_kwargs documentation.
Thanks Alessio!
I am OK with the logic. I made a proposal.