spikeinterface icon indicating copy to clipboard operation
spikeinterface copied to clipboard

Sort input extensions

Open chrishalcrow opened this issue 1 year ago • 6 comments

Automatically sorts a list of extensions, for computation, so that all parents are “on the left” of their children. Improvement suggested by @alejoe91 .

For example, the SortingAnalyzer extension waveforms depends on random_spikes. If you run

sorting_analyzer.compute([“waveforms”, “random_spikes”])

this fails as the compute function scans left to right. It’s a bit annoying since the user has tried to include random_spikes. It’s also easy to image a user running sorting_analyzer.compute([“waveforms”]). si tells the user they need to have random_spikes and they naturally add it on the right of the list. So: worth improving!

This change would automatically sort the inputted list so that all parents are on the left of their children. The function itself doesn’t check if the whole list is valid (i.e. it doesn’t return an error if random_spikes isn’t there when calculating waveforms), it only sorts it. These checks happen downstream.

The function I’ve written is a bit awkward, because the compute parameters are dictionaries (if you input a list, this gets converted to a dictionary internally, so kwargs can be easily handled.). And dictionaries aren’t really meant to have an order: so insertions at given indices are tricky etc. Here, I turn the dict into two lists, do the sorting, then zip. If anyone thinks of a cleaner algorithm, let me know.

Also added tests.

chrishalcrow avatar Apr 11 '24 12:04 chrishalcrow

this fails as the compute function scans left to right. It’s a bit annoying since the user has tried to include random_spikes. It’s also easy to image a user running sorting_analyzer.compute([“waveforms”]). si tells the user they need to have random_spikes and they naturally add it on the right of the list. So: worth improving!

Isn't it easier to just tell the user to add the needed extension before the extension that generates the error?. I think this is easier to mantain and document. I also like it makes the dependency graph explicit on the list. All of this only matters if the dependency graph of the extensions is meant to become more complicated. If not, then either solution should work just fine.

If we go the way of parsing the dependency tree and re-order it as we do here we should be careful to not alter the list that the user passes as they might use it for something else and get an error because we modified their state. In concrete, we should copy at the beginning.

h-mayorquin avatar Apr 11 '24 22:04 h-mayorquin

Isn't it easier to just tell the user to add the needed extension before the extension that generates the error?. I think this is easier to mantain and document. I also like it makes the dependency graph explicit on the list. All of this only matters if the dependency graph of the extensions is meant to become more complicated. If not, then either solution should work just fine.

You could be right. The error reporting also helps the user be aware of the parent/child structure which might help overall understanding. If we use an error reporting method, it could:

  1. Just report "please put parents to the left of children"
  2. Report the first error it finds "please put random_spikes to the left of waveforms"
  3. Find and report all the errors as a list! "please put random_spikes to the left of waveforms. Please put waveform to the left of templates" I like 3: helpful and educational. @alejoe91 , opinions?

If we go the way of parsing the dependency tree and re-order it as we do here we should be careful to not alter the list that the user passes as they might use it for something else and get an error because we modified their state. In concrete, we should copy at the beginning.

Yes, good point thanks! So I shouldn't have edited extensions - I'll change on the next update.

chrishalcrow avatar Apr 12 '24 08:04 chrishalcrow

I think it's not a mistake to pass ["templates", "random_spikes"] as input, so I don't think that the techicality of what depends o what is so educational that should trigger an error.

alejoe91 avatar Apr 12 '24 08:04 alejoe91

I've updated it so that extensions is not edited/updated. Thanks @h-mayorquin

So the other issue comes down to whether ['waveforms', 'random_spikes'] is an error or not. This decision is beyond my paygrade ;)

chrishalcrow avatar Apr 12 '24 10:04 chrishalcrow

I've updated it so that extensions is not edited/updated. Thanks @h-mayorquin

So the other issue comes down to whether ['waveforms', 'random_spikes'] is an error or not. This decision is beyond my paygrade ;)

I believe this should be allowed and it's not an error from the user! Especially when passing a dict as input to the compute, re-sorting the dict according to dependencies is unnecessarily complex

alejoe91 avatar Apr 12 '24 10:04 alejoe91

Hi Chris. I am reading this very very late!! I am sorry. I did a few comments.

samuelgarcia avatar May 14 '24 20:05 samuelgarcia

merci!

samuelgarcia avatar May 21 '24 13:05 samuelgarcia