spikeinterface
`SortingAnalyzer.compute(save=True)` is a little unclear
So when using the compute function to do postprocessing, QC, etc., I can give save=True or save=False. For testing I just created a SortingAnalyzer in memory. Although it makes sense that save=True does nothing since I'm in memory, we will definitely need to document that this is how it works.
I can imagine a workflow where I create my SortingAnalyzer in memory and just assume that by setting save=True it automatically saves the result, without actually doing the save_as. The distinction isn't super clear or user friendly right now.
Solutions could be:
- Better warning messaging so people understand how extensions work with memory mode
- Better documentation (not done yet I know :) , but an important section that would need to be added)
Otherwise, the extensions all seem to be working pretty well. I think the error messaging that says "you are missing this extension, run compute(extension)" is super useful. That was a really nice touch :)
I'm going to be testing some plotting this afternoon.
Yes, we need to document this; it is one of the best new pieces of the new design. There are 2 approaches:
- everything in memory (and then save=True/False is ignored), followed by save_as()
- everything on disk (zarr or npy), saving (or not) on the fly with every computation. save=False makes a lot of sense in spikeinterface-gui because you may want to view a crosscorrelogram with new bins without changing the folder itself. See https://github.com/SpikeInterface/spikeinterface-gui/pull/53
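To make the two approaches concrete, here is a deliberately toy sketch (this is not the real SortingAnalyzer API; the class and attribute names below are all made up for illustration) of how they differ in what actually reaches disk:

```python
# Toy model of the two persistence approaches; NOT the real spikeinterface API.

class ToyAnalyzer:
    def __init__(self, fmt="memory", folder=None):
        self.fmt = fmt            # "memory" or "binary_folder"
        self.folder = folder      # backing folder when disk-backed
        self.extensions = {}      # computed extensions, always held in RAM
        self.on_disk = {}         # what has actually been persisted

    def compute(self, name, save=True):
        self.extensions[name] = f"result of {name}"
        # save=True only has an effect when the analyzer is disk-backed:
        if save and self.fmt != "memory":
            self.on_disk[name] = self.extensions[name]

    def save_as(self, folder):
        # Writes everything computed so far and returns a disk-backed copy.
        new = ToyAnalyzer(fmt="binary_folder", folder=folder)
        new.extensions = dict(self.extensions)
        new.on_disk = dict(self.extensions)
        return new

# Approach 1: everything in memory; save=True is silently ignored.
mem = ToyAnalyzer(fmt="memory")
mem.compute("random_spikes", save=True)
assert "random_spikes" not in mem.on_disk   # nothing persisted yet

# Persist later with save_as(), which returns the disk-backed analyzer.
disk = mem.save_as("a_folder")

# Approach 2: disk-backed analyzer; compute persists on the fly.
disk.compute("correlograms", save=True)
assert "correlograms" in disk.on_disk
```

The key point the toy model captures is that in memory mode the save flag is a no-op, and only save_as() (returning a new, disk-backed object) actually persists anything.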
I agree it's great. I'm constantly wishing I could tweak one tiny thing without changing everything. I think the concern, from a mental-schema perspective, is this:
save and save_as are semantically loaded, in that they do the same thing except that with save_as you change the name of your file. So as a user I may (erroneously, in this case) assume that running save does the same thing as save_as, just without changing the name or format. I'm trying to think of a good way to make sure users don't make this incorrect mental leap, when it is such an easy one to make based on experience with Office or Apple's docs. (I assume LibreOffice is semantically the same.)
Basically I'm saying that people will bring their own preconceived notions of save vs save_as.
For example, the term export is often used instead when changing the format of a file. So many users will be used to: save saves in place with the same title; save as saves in place with a new title; export changes the format of the file and optionally changes the name.
Hello, just working on docs, and I now also think this is quite confusing. To make Zach's point more explicit, consider the following code:
```python
from spikeinterface.core import generate_ground_truth_recording, create_sorting_analyzer

recording, sorting = generate_ground_truth_recording(
    durations=[5.0],
    sampling_frequency=30_000.0,
    num_channels=6,
    num_units=10,
    generate_sorting_kwargs=dict(firing_rates=6.0, refractory_period_ms=4.0),
    noise_kwargs=dict(noise_levels=5.0, strategy="tile_pregenerated"),
    seed=2205,
)
sorting_analyzer = create_sorting_analyzer(sorting, recording, format="memory", sparse=True)
sorting_analyzer.save_as(format="binary_folder", folder="a_sorting_analyzer")
sorting_analyzer.compute("random_spikes", save=True)
```
I would expect random_spikes to be saved in the folder I had created, and I'd be annoyed that it wasn't, with no warning, since I explicitly wrote save=True.
Here the sorting_analyzer is in memory, but the user is trying to save the extension in a folder. I see that it's a bit tricky to resolve. Ideally you'd have save=False by default if the analyzer is in memory, but save=True by default if the analyzer is on disk (using lots of Nones and ifs). Then (I'd say) emit a warning if the user passes save=True when the analyzer is in memory. Thoughts?
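One way to sketch that defaulting idea (hypothetical: resolve_save and its behavior are not part of spikeinterface, just an illustration of the save=None proposal above):

```python
import warnings


def resolve_save(analyzer_format, save=None):
    """Hypothetical defaulting logic, not real spikeinterface code.

    save=None means "persist only if the analyzer is disk-backed";
    save=True on an in-memory analyzer triggers a warning instead of
    silently doing nothing.
    """
    if save is None:
        # Default depends on where the analyzer lives.
        return analyzer_format != "memory"
    if save and analyzer_format == "memory":
        warnings.warn(
            "save=True has no effect for an in-memory analyzer; "
            "use save_as() first, or load the disk-backed analyzer."
        )
        return False
    return save


assert resolve_save("memory") is False          # in memory: default to no save
assert resolve_save("binary_folder") is True    # on disk: default to save
assert resolve_save("zarr", save=False) is False
```

This keeps save=True/False explicit when the user passes it, while making the silent-no-op case loud.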
Also, it would be nice to have an easy way to link up a sorting_analyzer (which is currently in memory) to a folder. I think the current way is to load it, i.e. this code:
```python
from spikeinterface.core import load_sorting_analyzer

sorting_analyzer = load_sorting_analyzer(folder="a_sorting_analyzer")
sorting_analyzer.compute("random_spikes", save=True)
```
does what I was expecting the earlier code to do.
Also closing this with #2719.