
Modify error handling in function spike_contrast (spike_train_synchrony.py)

Open ManuelCiba opened this issue 10 months ago • 6 comments

Description:

This pull request suggests an enhancement for the spike_contrast function. Currently, the function raises a ValueError when all input spike trains contain no more than one spike. The proposed modification replaces the error with a warning, to improve the user experience during batch analysis.

More precisely, the function will return a complete trace data structure with all bin_size values, even if the synchrony calculation was not possible; the synchrony values will then be NaN. This leads to consistent data structures for full spike trains and empty spike trains alike.
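A minimal sketch of the proposed behavior (the function name, signature, and the n_bin_sizes parameter below are illustrative, not the actual Elephant API):

```python
import warnings

# Illustrative sketch only: instead of raising an error, warn and return
# a NaN-filled result of the expected length, so that batch pipelines
# always receive the same data structure.
def spike_contrast_sketch(spiketrains, n_bin_sizes=38):
    if all(len(st) <= 1 for st in spiketrains):
        warnings.warn("All input spiketrains contain no more than 1 spike.")
        nan = float("nan")
        return nan, [nan] * n_bin_sizes
    raise NotImplementedError("full synchrony computation omitted here")
```

Here n_bin_sizes stands in for the number of tested bin sizes, which in the real function follows from t_start and t_stop.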

Changes:

  • Replaced the ValueError with a warning in the spike_contrast function (spike_train_synchrony.py).
  • Removed the corresponding test case from test_invalid_data in test_spike_train_synchrony.py.

Testing:

  • spike_contrast has been tested.
  • Changes to test_spike_train_synchrony were made directly on GitHub, so no local tests have been run yet.

ManuelCiba avatar Apr 04 '24 15:04 ManuelCiba

Hello @ManuelCiba! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 197:5: E722 do not use bare 'except'
Line 198:80: E501 line too long (115 > 79 characters)
Line 199:80: E501 line too long (93 > 79 characters)
Line 199:94: W291 trailing whitespace
Line 200:80: E501 line too long (97 > 79 characters)
Line 200:98: W291 trailing whitespace
Line 202:80: E501 line too long (115 > 79 characters)
Line 202:116: W291 trailing whitespace

Line 92:1: W293 blank line contains whitespace
Line 93:80: E501 line too long (91 > 79 characters)
Line 93:92: W291 trailing whitespace
Line 95:9: E265 block comment should start with '# '

Comment last updated at 2024-04-17 12:54:48 UTC

pep8speaks avatar Apr 04 '24 15:04 pep8speaks

Hello @ManuelCiba, thank you for reaching out to us.

I ran the following minimal example, with the code on branch ManuelCiba:modify_error_handling_in_spike-contrast:

import neo
from quantities import ms
from elephant.spike_train_synchrony import spike_contrast

# Create a list of spike trains, where each spike train contains only one spike
spiketrains = [
    neo.SpikeTrain([10] * ms, t_stop=1000 * ms),
    neo.SpikeTrain([20] * ms, t_stop=1000 * ms),
]
for idx, spiketrain in enumerate(spiketrains):
    print(f"Spike train no. {idx}: {spiketrain}")

# Run spike_contrast from elephant.spike_train_synchrony
synchrony, spike_contrast_trace = spike_contrast(
    spiketrains, return_trace=True
)

print(f"Synchrony: {synchrony}")
print(f"Contrast: {spike_contrast_trace.contrast}")
print(f"Active Spiketrains: {spike_contrast_trace.active_spiketrains}")
print(f"Synchrony: {spike_contrast_trace.synchrony}")
print(f"Binsize: {spike_contrast_trace.bin_size}")

which yields:

Spike train no. 0: [10.] ms
Spike train no. 1: [20.] ms
elephant/elephant/spike_train_synchrony.py:201: UserWarning: All input spiketrains contain no more than 1 spike.
  warnings.warn("All input spiketrains contain no more than 1 spike.")
Synchrony: 0.5
Contrast: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25]
Active Spiketrains: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0]
Synchrony: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.375, 0.375, 0.375, 0.375, 0.0, 0.0, 0.0]
Binsize: [500.         450.         405.         364.5        328.05
 295.245      265.7205     239.14845    215.233605   193.7102445
 174.33922005 156.90529805 141.21476824 127.09329142 114.38396227
 102.94556605  92.65100944  83.3859085   75.04731765  67.54258588
  60.7883273   54.70949457  49.23854511  44.3146906   39.88322154
  35.89489938  32.30540945  29.0748685   26.16738165  23.55064349
  21.19557914  19.07602122  17.1684191   15.45157719  13.90641947
  12.51577752  11.26419977  10.1377798 ] ms

Did I get the described scenario right with this minimal example? I assume batch processing refers to processing a number of lists of spike trains, i.e. list1 = [st1, st2, st3, ...], list2 = [st1, st2, st3, ...]; in that case the whole batch is interrupted if a single call, e.g. spike_contrast(list1), raises an error.
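For illustration, without the proposed change a batch run has to guard every call itself; a sketch of such a workaround (spike_contrast below is a stand-in that only mimics the current error-raising behavior, and batch_synchrony is a hypothetical helper):

```python
import math

# Stand-in for the current behavior: raise when every spike train has
# no more than one spike.
def spike_contrast(spiketrains):
    if all(len(st) <= 1 for st in spiketrains):
        raise ValueError("All input spiketrains contain no more than 1 spike.")
    return 0.5  # dummy synchrony value for this sketch

# Per-call guard so one degenerate recording does not stop the batch.
def batch_synchrony(lists_of_spiketrains):
    results = []
    for spiketrains in lists_of_spiketrains:
        try:
            results.append(spike_contrast(spiketrains))
        except ValueError:
            results.append(math.nan)  # keep the batch running
    return results
```

With the proposed NaN-returning behavior, this try/except boilerplate would no longer be needed on the user's side.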

Is this the desired output, or should the synchrony value be NaN (i.e., numpy.nan)? (Currently the function raises a warning and returns 0.5.)

The current implementation (Elephant v1.0.0) raises an error for the above script:

Spike train no. 0: [10.] ms
Spike train no. 1: [20.] ms
Traceback (most recent call last):
  File "elephant/build/test_issue_#626.py", line 14, in <module>
    synchrony, spike_contrast_trace = spike_contrast(
  File "elephant/elephant/spike_train_synchrony.py", line 193, in spike_contrast
    isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)
ValueError: min() arg is an empty sequence

I was expecting: ValueError: All input spiketrains contain no more than 1 spike.

PS: Just out of curiosity, are you the author of "Spike-contrast: A novel time scale independent and multivariate measure of spike train synchrony"? Please correct me if I got this wrong.

Moritz-Alexander-Kern avatar Apr 05 '24 13:04 Moritz-Alexander-Kern

Hello Moritz,

Thank you for the quick reply and for testing the code!

You are right, I'm the author. Recently, I switched from Matlab to Python, so I'm finally using Elephant for my data analysis :)

Regarding batch analysis: I have, for example, 100 recordings, each containing 60 spike trains recorded in parallel. For each of the 100 recordings, I want to calculate the synchrony, with all 60 spike trains in one list. Some of the 100 recordings do not contain any spikes. However, I want to save the synchrony curve (trace.synchrony) for these empty files as well, in order to generate a consistent data structure that can be used for machine learning, etc.

Of course, the user could also handle this case in their own script. But I noticed that it is more convenient to receive a result from the function (since the number of bins depends on t_start and t_stop).

Regarding your example: A synchrony value of 0.5 is mathematically correct given how Spike-contrast is defined (both spikes fall within the first bin). This particular spike train set could be considered non-stationary, since all the activity occurs at the very beginning. In the paper, I mentioned that non-stationary spike trains lead to unreliable synchrony values. So we could leave it to the user's responsibility to ensure "good" spike trains?
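For the record, the 0.5 can be checked by hand at the largest bin size (500 ms, i.e. two bins over [0, 1000) ms), where both spikes land in the first bin. A quick numeric check, mirroring the contrast and active-spiketrain expressions quoted elsewhere in this thread:

```python
import numpy as np

# Largest bin size: two 500 ms bins, both spikes (10 ms, 20 ms) in bin 1.
theta_k = np.array([2, 0])   # total spikes per bin, summed over all trains
n_k = np.array([2, 0])       # number of active spike trains per bin
n_spiketrains = 2
n_spikes_total = 2

contrast = np.sum(np.abs(np.diff(theta_k))) / (2 * n_spikes_total)
active_st = (np.sum(n_k * theta_k) / np.sum(theta_k) - 1) / (n_spiketrains - 1)
synchrony = contrast * active_st
print(contrast, active_st, synchrony)  # 0.5 1.0 0.5
```

This matches the first entries of the Contrast, Active Spiketrains, and Synchrony traces printed above.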

ManuelCiba avatar Apr 05 '24 16:04 ManuelCiba

Hello @ManuelCiba ,

Thank you for your insights, now I understand more.

I didn't consider the stationarity of spike trains; I was wondering about this line:

https://github.com/NeuralEnsemble/elephant/blob/fe5053a52a3aaa5c86d9a9c61f72de14b364df1a/elephant/spike_train_synchrony.py#L193

For me the distinction between stationary and non-stationary spike trains is not obvious in this line, especially in cases with only one or two spikes per spike train. Given this ambiguity, I agree with leaving it to the user to ensure good spike trains; only the "obvious cases" would be handled by returning NaN. Here NaN serves as a convenient way to mark cases where a numerical value cannot be computed in a meaningful way, providing a clear indication of such situations in the results.

So in conclusion, we could proceed with returning NaN for empty spike trains, as there is no sensible definition of synchrony in such cases, and raise a warning? For spike trains containing one spike, we could continue providing a value but still raise a warning, or also return NaN and raise a warning?

By adopting this approach, we keep clarity in cases where synchrony cannot be computed (empty spike trains), while still providing results when spikes are present. This aligns with leaving the responsibility of ensuring "nice" spike trains to the user. Currently I don't see an obvious way to handle non-stationary spike trains, which lead to unreliable synchrony values. Maybe we could add a section to the documentation explaining/warning about this?
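One possible reading of these options, purely as an illustrative sketch (handle_degenerate_input is a hypothetical helper, not the final API): return NaN with a warning for all-empty input, warn but continue for single-spike input.

```python
import warnings
import numpy as np

# Illustrative sketch of the handling discussed above.
def handle_degenerate_input(spiketrains):
    if all(len(st) == 0 for st in spiketrains):
        warnings.warn("All input spiketrains are empty; synchrony is NaN.")
        return np.nan
    if all(len(st) <= 1 for st in spiketrains):
        warnings.warn("All input spiketrains contain no more than 1 spike.")
    return None  # signal the caller to proceed with the computation
```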

Looking forward to hearing your thoughts on this.

Moritz-Alexander-Kern avatar Apr 09 '24 14:04 Moritz-Alexander-Kern

For completeness: here is an example with empty spike trains, using the code on branch ManuelCiba:modify_error_handling_in_spike-contrast:

import neo
from quantities import ms
from elephant.spike_train_synchrony import spike_contrast

# Create a list of empty spike trains
spiketrains = [
    neo.SpikeTrain([] * ms, t_stop=1000 * ms),
    neo.SpikeTrain([] * ms, t_stop=1000 * ms),
]
for idx, spiketrain in enumerate(spiketrains):
    print(f"Spike train no. {idx}: {spiketrain}")

# Run spike_contrast from elephant.spike_train_synchrony
synchrony, spike_contrast_trace = spike_contrast(
    spiketrains, return_trace=True
)


print(f"Synchrony: {synchrony}")
print(f"Contrast: {spike_contrast_trace.contrast}")
print(f"Active Spiketrains: {spike_contrast_trace.active_spiketrains}")
print(f"Synchrony: {spike_contrast_trace.synchrony}")
print(f"Binsize: {spike_contrast_trace.bin_size}")

Output:

Spike train no. 0: [] ms
Spike train no. 1: [] ms
/elephant/elephant/spike_train_synchrony.py:201: UserWarning: All input spiketrains contain no more than 1 spike.
  warnings.warn("All input spiketrains contain no more than 1 spike.")
/elephant/elephant/spike_train_synchrony.py:224: RuntimeWarning: invalid value encountered in scalar divide
  active_st = (np.sum(n_k * theta_k) / np.sum(theta_k) - 1) / (
/elephant/elephant/spike_train_synchrony.py:226: RuntimeWarning: invalid value encountered in scalar divide
  contrast = np.sum(np.abs(np.diff(theta_k))) / (2 * n_spikes_total)
Synchrony: nan
Contrast: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Active Spiketrains: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Synchrony: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Binsize: [500.         450.         405.         364.5        328.05
 295.245      265.7205     239.14845    215.233605   193.7102445
 174.33922005 156.90529805 141.21476824 127.09329142 114.38396227
 102.94556605  92.65100944  83.3859085   75.04731765  67.54258588
  60.7883273   54.70949457  49.23854511  44.3146906   39.88322154
  35.89489938  32.30540945  29.0748685   26.16738165  23.55064349
  21.19557914  19.07602122  17.1684191   15.45157719  13.90641947
  12.51577752  11.26419977  10.1377798 ] ms

Moritz-Alexander-Kern avatar Apr 09 '24 14:04 Moritz-Alexander-Kern

Coverage Status

coverage: 88.239% (-0.03%) from 88.264% when pulling 47db6108b54e44a97a01d0ffb9d8af88dc0933b7 on ManuelCiba:modify_error_handling_in_spike-contrast into a4e601e1b70d8b7eef5dd555da4af20acc20ec01 on NeuralEnsemble:master.

coveralls avatar Apr 17 '24 13:04 coveralls