Add IO tutorial

Open h-mayorquin opened this issue 1 year ago • 6 comments

OK, I have been very busy, but I have been meaning to do this since #3053 and #2958.

We should add an IO tutorial where we explain how we intend spikeinterface objects to be saved.

This is a discussion post to discuss the details.

My opinion:

  • I think it should include a description of the two main saving functions, save_to_binary and save_to_zarr, and the arguments they support.
  • A link to the write-to-NWB how-to (slowly in the making, by me!).
  • Their relationship to provenance and efficiency, plus a description of the formats. For the folder format this should include the structure of the folder, and for zarr the equivalent tree (@alejoe91 )

Some of this information is probably spread across the modules documentation. I will need to fish out what is already there and just add structure to it.

h-mayorquin avatar Jun 27 '24 17:06 h-mayorquin

Also the structure of analyzer folders/part paths!

alejoe91 avatar Jun 27 '24 17:06 alejoe91

This sounds great!! I am not so familiar with save_to_binary and save_to_zarr. Where does recording.save() fit in? Are there any other saving functions?

JoeZiminski avatar Jun 27 '24 18:06 JoeZiminski

save() is a convenience function router that ends up in one or the other through a rather complicated path that I aim to document at some point : )
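To make "router" concrete, here is a toy sketch in plain Python. It is not the spikeinterface implementation: `save_toy`, `save_to_binary_toy`, `save_to_zarr_toy` and their keyword arguments are all hypothetical names invented for illustration. It only shows the general pattern of one front-end function dispatching on a `format` argument and forwarding the remaining keyword arguments to a backend.

```python
def save_to_binary_toy(data, folder, dtype="int16"):
    """Hypothetical binary backend: accepts a dtype keyword."""
    return {"backend": "binary", "folder": folder, "dtype": dtype, "num_items": len(data)}


def save_to_zarr_toy(data, folder, compressor="default"):
    """Hypothetical zarr backend: accepts a compressor keyword instead."""
    return {"backend": "zarr", "folder": folder, "compressor": compressor, "num_items": len(data)}


# Lookup table from format name to backend function.
_BACKENDS = {"binary": save_to_binary_toy, "zarr": save_to_zarr_toy}


def save_toy(data, format="binary", **kwargs):
    """Route to one backend and pass every other keyword straight through."""
    try:
        backend = _BACKENDS[format]
    except KeyError:
        raise ValueError(f"unknown format: {format!r}") from None
    # Keywords the chosen backend does not accept raise TypeError here.
    return backend(data, **kwargs)


result = save_toy([1, 2, 3], format="zarr", folder="out", compressor="fast")
```

Because the keywords are forwarded untouched, a call such as `save_toy(data, format="binary", folder="out", compressor="fast")` fails with a `TypeError`. That is one reason a router of this kind is hard to document with a single docstring: the valid arguments depend on the format chosen.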

h-mayorquin avatar Jun 27 '24 18:06 h-mayorquin

I do think (although I forget where this was stated, I think it was by @JoeZiminski ) that our docstring formatting injection really fails for save. Sometimes I try to remember what arguments I need for saving a sorting vs saving a recording, and the docstring isn't perfect. So I really support an IO tutorial so that we at least lay it out! Thanks for writing this up @h-mayorquin !

zm711 avatar Jun 28 '24 13:06 zm711

Great, so just to review, ATM there are:

  1. si.write_binary_recording (writes recording to a single .raw file with no spike-interface metadata).
  2. si.write_to_h5_dataset_format (similar to write_binary_recording but writes to an h5 file).
  3. recording.save_to_memory() I'm not so sure what this does but it looks very cool
  4. recording.save_to_binary() Saves to folder with data stored in binary + some spikeinterface metadata
  5. recording.save_to_zarr(). Same as above but with zarr
  6. the recording.save() frontend. Convenience function around the recording methods.
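The difference between items 1 and 4 (a standalone file with no metadata versus a folder with data plus metadata) can be sketched with the standard library alone. This is an illustration of the idea only, assuming a made-up layout: the file names `traces.raw` and `metadata.json`, the JSON keys, and the helper functions are all hypothetical and do not reflect the actual spikeinterface folder structure.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_standalone(path, frames):
    """Write interleaved little-endian int16 samples and nothing else."""
    flat = [value for frame in frames for value in frame]
    path.write_bytes(struct.pack(f"<{len(flat)}h", *flat))


def write_with_metadata(folder, frames, sampling_frequency):
    """Write the same bytes plus a JSON sidecar that describes them."""
    folder.mkdir(parents=True, exist_ok=True)
    write_standalone(folder / "traces.raw", frames)
    metadata = {  # hypothetical keys, chosen for this sketch only
        "byte_order": "<",
        "type_code": "h",
        "num_channels": len(frames[0]),
        "sampling_frequency": sampling_frequency,
    }
    (folder / "metadata.json").write_text(json.dumps(metadata))


def read_with_metadata(folder):
    """Rebuild the frames using only what is stored in the folder."""
    meta = json.loads((folder / "metadata.json").read_text())
    data = (folder / "traces.raw").read_bytes()
    item_size = struct.calcsize(meta["byte_order"] + meta["type_code"])
    count = len(data) // item_size
    flat = struct.unpack(f"{meta['byte_order']}{count}{meta['type_code']}", data)
    n = meta["num_channels"]
    return [tuple(flat[i:i + n]) for i in range(0, count, n)], meta


frames = [(1, -1), (2, -2), (3, -3), (4, -4)]  # 4 samples x 2 channels
with tempfile.TemporaryDirectory() as tmp:
    raw_path = Path(tmp) / "standalone.raw"
    write_standalone(raw_path, frames)
    raw_size = raw_path.stat().st_size  # bytes only: no shape, dtype or rate

    folder = Path(tmp) / "saved_recording"
    write_with_metadata(folder, frames, sampling_frequency=30000.0)
    restored, meta = read_with_metadata(folder)
```

The standalone file can only be read back by someone who already knows the dtype and channel count, whereas the folder carries that description with it.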

It's awesome that so many file writing methods are supported. I wonder if there is some room for API optimisation, although it is certainly not simple. It is complicated by the fact that there are different (all useful) ways to save the data, either as a standalone file (binary, h5) or in "spikeinterface format", and that these functions all require different kwarg sets. Initially I thought it would be nice to route everything through recording.save() and make everything else private, but the differing kwarg sets make this impossible.

Some ways to streamline might be: make a distinction between spikeinterface-style saving, e.g. save_as_spikeinterface_format(format="binary") (with a better name), and the standalone binary write_binary_recording, as it is easy to get confused between these. It might also be worth moving write_binary_recording and write_to_h5_dataset_format to the recording object so everything is in one place, and somehow incorporating these into the save() function? (These could be the front-end interface for these functions discussed in #2958.)

I'm not 100% sure on the above. The number 1 thing to help make all this clear will be this IO tutorial, it will be super useful!

JoeZiminski avatar Jul 01 '24 12:07 JoeZiminski

Related to here: #3111

h-mayorquin avatar Jul 01 '24 18:07 h-mayorquin