elephant
elephant copied to clipboard
Modify error handling in function spike_contrast (spike_train_synchrony.py)
Description:
This pull request suggests an enhancement for the spike_contrast
function. Currently, the function raises a TypeError
for input spike trains containing no more than 1 spike. The proposed modification replaces the TypeError
with a warning to improve the user experience during batch analysis.
More precisely, the function will return a complete data structure 'trace' with all bin_size values, even if the synchrony calculation was not possible. The synchrony values will be 'nan'. This will lead to consistent data structures among full spike trains and empty spike trains.
Changes:
- Replaced
TypeError
with a warning in thespike_train_synchrony.py
functionspike_contrast
. - Removed the test case in the
test_spike_train_synchrony.py
functiontest_invalid_data
.
Testing:
-
spike_contrast
has been tested. - Changes to
test_spike_train_synchrony
were made directly on GitHub, so no local tests have been run yet.
Hello @ManuelCiba! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
- In the file
elephant/spike_train_synchrony.py
:
Line 197:5: E722 do not use bare 'except' Line 198:80: E501 line too long (115 > 79 characters) Line 199:80: E501 line too long (93 > 79 characters) Line 199:94: W291 trailing whitespace Line 200:80: E501 line too long (97 > 79 characters) Line 200:98: W291 trailing whitespace Line 202:80: E501 line too long (115 > 79 characters) Line 202:116: W291 trailing whitespace
- In the file
elephant/test/test_spike_train_synchrony.py
:
Line 92:1: W293 blank line contains whitespace Line 93:80: E501 line too long (91 > 79 characters) Line 93:92: W291 trailing whitespace Line 95:9: E265 block comment should start with '# '
Comment last updated at 2024-04-17 12:54:48 UTC
Hello @ManuelCiba thank you for reaching out to us.
I ran the following minimal example, with the code on branch ManuelCiba:modify_error_handling_in_spike-contrast:
import neo
from quantities import ms
from elephant.spike_train_synchrony import spike_contrast
# Create list of spiektrains, where each spiketrain contains only one spike
spiketrains = [
neo.SpikeTrain([10] * ms, t_stop=1000 * ms),
neo.SpikeTrain([20] * ms, t_stop=1000 * ms),
]
for idx, spiketrain in enumerate(spiketrains):
print(f"Spike train no. {idx}: {spiketrain}")
# Run spike_contrast from elephant.spike_train_synchrony
synchrony, spike_contrast_trace = spike_contrast(
spiketrains, return_trace=True
)
print(f"Synchrony: {synchrony}")
print(f"Contrast: {spike_contrast_trace.contrast}")
print(f"Active Spiketrains: {spike_contrast_trace.active_spiketrains}")
print(f"Synchrony: {spike_contrast_trace.synchrony}")
print(f"Binsize: {spike_contrast_trace.bin_size}")
which yields:
Spike train no. 0: [10.] ms
Spike train no. 1: [20.] ms
elephant/elephant/spike_train_synchrony.py:201: UserWarning: All input spiketrains contain no more than 1 spike.
warnings.warn("All input spiketrains contain no more than 1 spike.")
Synchrony: 0.5
Contrast: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25]
Active Spiketrains: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0]
Synchrony: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.375, 0.375, 0.375, 0.375, 0.0, 0.0, 0.0]
Binsize: [500. 450. 405. 364.5 328.05
295.245 265.7205 239.14845 215.233605 193.7102445
174.33922005 156.90529805 141.21476824 127.09329142 114.38396227
102.94556605 92.65100944 83.3859085 75.04731765 67.54258588
60.7883273 54.70949457 49.23854511 44.3146906 39.88322154
35.89489938 32.30540945 29.0748685 26.16738165 23.55064349
21.19557914 19.07602122 17.1684191 15.45157719 13.90641947
Did I get the described scenario correctly with this minimal example? I assume batch processing refers to processing a number of lists of spike trains. i.e. list1 = [st1, st2, st3, ...], list2 = [st1, st2, st3, ...]
in this case the process will be interrupted if one call e.g. spike_contrast(list1)
raises an error.
Is this the desired output?, or should the value of synchrony
be NaN
, more precisely be of type numpy.nan
? (Currently it raises a warning and returns 0.5
)
The current implementation (Elephant v1.0.0) raises an error for the above script:
Spike train no. 0: [10.] ms
Spike train no. 1: [20.] ms
Traceback (most recent call last):
elephant/build/test_issue_#626.py", line 14, in <module>
synchrony, spike_contrast_trace = spike_contrast(
elephant/elephant/spike_train_synchrony.py", line 193, in spike_contrast
isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)
ValueError: min() arg is an empty sequence
I was expecting: ValueError: All input spiketrains contain no more than 1 spike.
PS: Just out of curiosity, are you the author of "Spike-contrast: A novel time scale independent and multivariate measure of spike train synchrony" ? Please correct me if I get this wrong.
Hello Moritz,
Thank you for the quick reply and for testing the code!
You are right, I'm the author. Recently, I switched from Matlab to Python, so I'm finally using Elephant for my data analysis :)
Regarding batch analysis: I have, for example, 100 recordings, each containing 60 parallel recorded spike trains. For each of the 100 recordings, I want to calculate the synchrony, having all 60 spike trains in one list. Some of the 100 recordings do not contain any spikes. However, I want to save the synchrony curve (trace.synchrony) also for these empty files, in order to generate a consistent data structure that can be used for machine learning, etc.
Of course, it is also possible for the user to handle this case in their script. But I noticed that it is more convenient to receive a result from the function (since the number of bins depends on t_start and t_stop).
Regarding your example: A synchrony value of 0.5 is mathematically correct considering how Spike-contrast is defined (since both spikes are within the first bin). This particular spike train could be considered as non-stationary since all the activity is just in the beginning. In the paper, I mentioned that non-stationary spike trains will lead to unreliable synchrony values. So, we could leave it to the responsibility of the user to ensure "good" spike trains(?).
Hello @ManuelCiba ,
Thank you for your insights, now I understand more.
I didn't consider the stationarity of spike trains, I was wondering about this line:
https://github.com/NeuralEnsemble/elephant/blob/fe5053a52a3aaa5c86d9a9c61f72de14b364df1a/elephant/spike_train_synchrony.py#L193
For me the distinction between stationary and non-stationary spike trains is not obvious in this line, especially when dealing with cases where there are only one or two spikes per spiketrain. Given this ambiguity, I agree, to leave it to the user to ensure good spike trains, only the "obvious cases" are handled by returning NaN? Here NaN serves as a convenient way to handle cases where numerical values cannot be represented or computed in a meaningful way, providing a clear indication of such situations in the results.
So in conclusion, we could proceed with returning NaN for empty spike trains, as there is no sensible definition of synchrony in such cases? and raise a Warning? For spike trains containing one spike, we could continue providing a value, but still raise a warning? or also return NaN and raise a Warning?
By adopting this approach, we maintain clarity in cases where synchrony cannot be computed due to empty spike trains while providing results for cases where spikes are present. This aligns with the preference to leave the responsibility of ensuring "nice" spike trains to the user. Currently I don't see an obvious way to handle non-stationary spike trains which lead to unreliable synchrony values. Maybe we could add a section to the documentation explaining/warning about this?
Looking forward to hear your thoughts on this.
For completion: Here is an example with empty spike trains using the code on branch ManuelCiba:modify_error_handling_in_spike-contrast:
import neo
from quantities import ms
from elephant.spike_train_synchrony import spike_contrast
# Create list of spiektrains, where each spiketrain contains only one spike
spiketrains = [
neo.SpikeTrain([]*ms, t_stop=1000 * ms),
neo.SpikeTrain([]*ms, t_stop=1000 * ms),
]
for idx, spiketrain in enumerate(spiketrains):
print(f"Spike train no. {idx}: {spiketrain}")
# Run spike_contrast from elephant.spike_train_synchrony
synchrony, spike_contrast_trace = spike_contrast(
spiketrains, return_trace=True
)
print(f"Synchrony: {synchrony}")
print(f"Contrast: {spike_contrast_trace.contrast}")
print(f"Active Spiketrains: {spike_contrast_trace.active_spiketrains}")
print(f"Synchrony: {spike_contrast_trace.synchrony}")
print(f"Binsize: {spike_contrast_trace.bin_size}")
Output:
Spike train no. 0: [] ms
Spike train no. 1: [] ms
/elephant/elephant/spike_train_synchrony.py:201: UserWarning: All input spiketrains contain no more than 1 spike.
warnings.warn("All input spiketrains contain no more than 1 spike.")
/elephant/elephant/spike_train_synchrony.py:224: RuntimeWarning: invalid value encountered in scalar divide
active_st = (np.sum(n_k * theta_k) / np.sum(theta_k) - 1) / (
/elephant/elephant/spike_train_synchrony.py:226: RuntimeWarning: invalid value encountered in scalar divide
contrast = np.sum(np.abs(np.diff(theta_k))) / (2 * n_spikes_total)
Synchrony: nan
Contrast: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Active Spiketrains: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Synchrony: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Binsize: [500. 450. 405. 364.5 328.05
295.245 265.7205 239.14845 215.233605 193.7102445
174.33922005 156.90529805 141.21476824 127.09329142 114.38396227
102.94556605 92.65100944 83.3859085 75.04731765 67.54258588
60.7883273 54.70949457 49.23854511 44.3146906 39.88322154
35.89489938 32.30540945 29.0748685 26.16738165 23.55064349
21.19557914 19.07602122 17.1684191 15.45157719 13.90641947
12.51577752 11.26419977 10.1377798 ] ms