Error during TDT Recording Extractor

Open kedoxey opened this issue 1 year ago • 4 comments

This error is raised by the TDT recording extractor when converting to NWB format within NWB Guide and NeuroConv.

Traceback (most recent call last):
  File "flask/app.py", line 1484, in full_dispatch_request
  File "flask/app.py", line 1469, in dispatch_request
  File "flask_restx/api.py", line 404, in wrapper
  File "flask/views.py", line 109, in view
  File "flask_restx/resource.py", line 46, in dispatch_request
  File "namespaces/neuroconv.py", line 70, in post
  File "manageNeuroconv/manage_neuroconv.py", line 414, in get_metadata_schema
  File "manageNeuroconv/manage_neuroconv.py", line 377, in instantiate_custom_converter
  File "neuroconv/nwbconverter.py", line 80, in __init__
    self.data_interface_objects = {
  File "neuroconv/nwbconverter.py", line 81, in <dictcomp>
    name: data_interface(**source_data[name])
  File "neuroconv/datainterfaces/ecephys/tdt/tdtdatainterface.py", line 40, in __init__
    super().__init__(
  File "neuroconv/datainterfaces/ecephys/baserecordingextractorinterface.py", line 38, in __init__
    self.recording_extractor = self.get_extractor()(**source_data)
  File "spikeinterface/extractors/neoextractors/tdt.py", line 34, in __init__
  File "spikeinterface/extractors/neoextractors/neobaseextractor.py", line 244, in __init__
  File "spikeinterface/core/baserecording.py", line 41, in __init__
  File "spikeinterface/core/baserecordingsnippets.py", line 25, in __init__
TypeError: data type "<class 'int'>" not understood

kedoxey avatar Jun 12 '24 21:06 kedoxey

@kedoxey If you still have it, the similar traceback that you got from the command line call also displayed the final line in SI with the error - can you copy that over? It had something to do with setting a numpy dtype

For context, the latest versions of SI/neo work fine on the currently known example TDT data, but something subtle differs in @kedoxey's data.

This type of error stems from a call of the form

numpy.dtype(str(type(1)))

implying that some instance/type did not get parsed as expected (a string cast was applied to the type rather than to the value, or something similar).
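
A minimal reproduction of that failure mode, assuming only that NumPy is installed: passing the type itself to `numpy.dtype` works, while passing its string representation raises a TypeError like the one in the traceback. The variable names are illustrative.

```python
import numpy as np

# Passing the Python type itself is fine: NumPy maps it to a concrete dtype.
good = np.dtype(int)

# The *string representation* of the type is not a valid dtype specification.
bad_spec = str(type(1))  # "<class 'int'>"

try:
    np.dtype(bad_spec)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(good.kind)       # 'i' (a signed integer dtype)
print(error_message)
```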

CodyCBakerPhD avatar Jun 12 '24 21:06 CodyCBakerPhD

Error trace from command line:

Traceback (most recent call last):
  File "/Users/katedoxey/Desktop/research/projects/CRCNS/data/nwb_conversion/convert_tdt.py", line 14, in <module>
    interface = TdtRecordingInterface(folder_path=tdt_dir, gain=1.0, verbose=False)
  File "/Users/katedoxey/miniconda3/lib/python3.9/site-packages/neuroconv/datainterfaces/ecephys/tdt/tdtdatainterface.py", line 40, in __init__
    super().__init__(
  File "/Users/katedoxey/miniconda3/lib/python3.9/site-packages/neuroconv/datainterfaces/ecephys/baserecordingextractorinterface.py", line 38, in __init__
    self.recording_extractor = self.get_extractor()(**source_data)
  File "/Users/katedoxey/miniconda3/lib/python3.9/site-packages/spikeinterface/extractors/neoextractors/tdt.py", line 34, in __init__
    NeoBaseRecordingExtractor.__init__(
  File "/Users/katedoxey/miniconda3/lib/python3.9/site-packages/spikeinterface/extractors/neoextractors/neobaseextractor.py", line 244, in __init__
    BaseRecording.__init__(self, sampling_frequency, chan_ids, dtype)
  File "/Users/katedoxey/miniconda3/lib/python3.9/site-packages/spikeinterface/core/baserecording.py", line 41, in __init__
    BaseRecordingSnippets.__init__(
  File "/Users/katedoxey/miniconda3/lib/python3.9/site-packages/spikeinterface/core/baserecordingsnippets.py", line 25, in __init__
    self._dtype = np.dtype(dtype)
TypeError: data type "<class 'int'>" not understood

kedoxey avatar Jun 12 '24 21:06 kedoxey

@CodyCBakerPhD this does not look like a spikeinterface problem. It seems that at some point in the chain the dtype is serialized to the string str(int) and is then not deserialized correctly.

Probably something with the JSON in the communication between neuroconv and the guide.
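
To illustrate this hypothesis (a sketch, not the actual NWB Guide or NeuroConv code): if a Python type ends up in a payload that is serialized with a string fallback, it comes back as plain text rather than as a type. The `payload` dictionary and the use of `default=str` are assumptions made for illustration.

```python
import json

# Hypothetical source data in which a dtype is given as a Python type.
payload = {"folder_path": "/path/to/tank", "dtype": int}

# `default=str` is a common fallback for objects JSON cannot encode;
# here it silently turns the type into its string representation.
encoded = json.dumps(payload, default=str)
decoded = json.loads(encoded)

print(decoded["dtype"])        # <class 'int'>
print(type(decoded["dtype"]))  # <class 'str'>
```

The round trip does not restore the original type, so any later call that expects a dtype receives a string instead.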

h-mayorquin avatar Jun 14 '24 14:06 h-mayorquin

This is really confusing because of these lines:

https://github.com/SpikeInterface/spikeinterface/blob/bd89c998afb29c204167a4f7aff360a1cd631da5/src/spikeinterface/extractors/neoextractors/neobaseextractor.py#L244-L245

Before the __init__ of BaseRecording is called in neobaseextractor, where @kedoxey's error is raised, the dtype should already have been cast to a dtype.

I don't think there is a way for the resulting dtype to be a str(int). Is there?

So I think the cause is something outside of spikeinterface, maybe in how neo handles this case?

https://github.com/NeuralEnsemble/python-neo/blob/5d66cced6f85129e715d9f257e6240b25cfcb0cf/neo/rawio/tdtrawio.py#L253-L256

@kedoxey could you share your data, just to be 100% sure this is the case?

h-mayorquin avatar Jun 14 '24 15:06 h-mayorquin

Hi! I am facing the same error. If it is helpful, this happens only if .sev files are in the TDT folder. .sev files hold the raw data from each channel of the ephys recording. They are not included in every recording; one has to enable this option, so the error does not crop up every time. For example, in this post the sample TDT data does not have .sev files.

But this one does! I'll see if I can replicate the issue with this data.

Edit: Yes! I've replicated the issue with the example above. Specifically, it occurs when one tries to read stream_name="b'RSn1'".
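
A quick way to check whether a given block folder is affected, assuming the per-channel files use the `.sev` extension as described above. The helper name `find_sev_files` is hypothetical, and the check uses only the standard library.

```python
from pathlib import Path
import tempfile


def find_sev_files(block_folder):
    """Return the sorted .sev files found directly inside a TDT block folder."""
    return sorted(Path(block_folder).glob("*.sev"))


# Demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    block = Path(tmp)
    (block / "tank_block.tev").touch()
    (block / "tank_block_Wav5_Ch1.sev").touch()
    sev_names = [p.name for p in find_sev_files(block)]

print(sev_names)  # ['tank_block_Wav5_Ch1.sev']
```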

arnabiswas avatar Feb 16 '25 00:02 arnabiswas

@arnabiswas thanks a bunch. Can you paste the code that you used to replicate the error? Or do you mean you replicated it on the guide?

h-mayorquin avatar Feb 17 '25 17:02 h-mayorquin

The error occurs when running spikeinterface.extractors.read_tdt(folder_path, stream_name="b'RSn1'") with this dataset. You would need to rename the parent folders to make the extractor work: the parent folder should be renamed to 512ch, otherwise you will get a file-not-found error. Let me know if you need any more details, and thanks for looking into this.
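
The unusual stream name looks like the string form of a Python bytes object. The sketch below only shows how such a name can arise and how it differs from a decoded name; whether the reader builds its stream names this way is an assumption based on the name's appearance.

```python
raw_name = b"RSn1"  # a store name held as raw bytes

as_repr = str(raw_name)             # keeps the b'...' wrapper
as_text = raw_name.decode("ascii")  # plain text without the wrapper

print(as_repr)  # b'RSn1'
print(as_text)  # RSn1
```

In this thread the stream is addressed as "b'RSn1'", wrapper included.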

arnabiswas avatar Feb 18 '25 06:02 arnabiswas

Thanks a lot, that's enough. This is a busy week for me as we have an event at work but I will look into this as soon as possible.

h-mayorquin avatar Feb 18 '25 15:02 h-mayorquin

Yes, I will need more help with the file structure:

from pathlib import Path
from spikeinterface.extractors import read_tdt

folder_location = Path("/home/heberto/data/tdt/512ch")
folder_path = folder_location / "dataset_0_single_block"
assert folder_path.is_dir()

recording = read_tdt(folder_path=folder_path.parent, stream_name="b'RSn1'")

FileNotFoundError: [Errno 2] No such file or directory: '/home/heberto/data/tdt/512ch/dataset_0_single_block/512ch_dataset_0_single_block.Tbk'

And here is the file structure that I am using:

Command: 	 tree -shL 3
[4.0K]  .
└── [4.0K]  512ch
    └── [4.0K]  dataset_0_single_block
        ├── [287K]  512ch_reconly_all-181123_B24_rest.Tbk
        ├── [313K]  512ch_reconly_all-181123_B24_rest.Tdx
        ├── [3.5M]  512ch_reconly_all-181123_B24_rest.tev
        ├── [ 38K]  512ch_reconly_all-181123_B24_rest.tin
        ├── [  23]  512ch_reconly_all-181123_B24_rest.tnt
        ├── [177K]  512ch_reconly_all-181123_B24_rest.tsq
        ├── [  80]  512ch_reconly_all-181123_B24_rest_Wav5_Ch1.sev
        ├── [  80]  512ch_reconly_all-181123_B24_rest_Wav5_Ch2.sev
        ├── [  80]  512ch_reconly_all-181123_B24_rest_Wav5_Ch3.sev
        ├── [  80]  512ch_reconly_all-181123_B24_rest_Wav5_Ch4.sev
        └── [ 809]  StoresListing.txt

3 directories, 11 files

How should this be modified? Using folder_path without .parent also raises another error.

h-mayorquin avatar Feb 19 '25 01:02 h-mayorquin

Sorry, I did not give you the full correct path last time. For you, the folder path should be /home/heberto/data/tdt/512ch/reconly_all-181123_B24_rest/

Renaming the middle folder should hopefully get this working for you.
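
Judging only from the FileNotFoundError quoted earlier in the thread, the reader appears to look for a .Tbk file named after the parent folder and the block folder, joined by an underscore. A small sketch of that inferred convention, using the hypothetical helper `expected_tbk_path`:

```python
from pathlib import Path


def expected_tbk_path(block_folder):
    """Build the .Tbk path implied by the error message: <tank>_<block>.Tbk."""
    block_folder = Path(block_folder)
    tank_name = block_folder.parent.name
    return block_folder / f"{tank_name}_{block_folder.name}.Tbk"


before = expected_tbk_path("/home/heberto/data/tdt/512ch/dataset_0_single_block")
after = expected_tbk_path("/home/heberto/data/tdt/512ch/reconly_all-181123_B24_rest")

print(before.name)  # 512ch_dataset_0_single_block.Tbk
print(after.name)   # 512ch_reconly_all-181123_B24_rest.Tbk
```

Only the second name matches the files in the directory listing, which is why the block folder has to be renamed.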

from pathlib import Path
from spikeinterface.extractors import read_tdt

folder_location = Path("/home/heberto/data/tdt/512ch")
folder_path = folder_location / "reconly_all-181123_B24_rest"
assert folder_path.is_dir()

recording = read_tdt(folder_path=folder_path.parent, stream_name="b'RSn1'")

     22         BaseExtractor.__init__(self, channel_ids)
     23         self._sampling_frequency = sampling_frequency
---> 24         self._dtype = np.dtype(dtype)
     25 
     26     @property

TypeError: data type "<class 'int'>" not understood

arnabiswas avatar Feb 19 '25 02:02 arnabiswas

Ok, I know what is causing the error. I will need to discuss this one with @samuelgarcia; I will push a fix.

h-mayorquin avatar Feb 21 '25 01:02 h-mayorquin

Hi, @arnabiswas

I finally had time to come back to this. The fix is in python-neo:

https://github.com/NeuralEnsemble/python-neo/pull/1650

Could you pull that branch and see if you can extract your data?

The thing is that I am sure this will allow you to read your data, but I still think we might need further adjustments so that your data is not only read but also read correctly (I know!).

Let me know when you have time to take a look at this.

h-mayorquin avatar Mar 05 '25 20:03 h-mayorquin

Thank you @h-mayorquin for taking a shot at this. I will try this out over the weekend and let you know. So far I have been converting my data to a binary file and then reading it, so I should have a reference for comparison. Thank you, I'll get back soon.
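
A sketch of how such a comparison could look once both versions are loaded as arrays, for example via `get_traces()` on each recording. The helper `compare_traces` and the synthetic arrays are assumptions for illustration; real data may also differ by gain or dtype, which this check does not account for.

```python
import numpy as np


def compare_traces(candidate, reference, atol=0.0):
    """Return (same_shape, max_abs_diff, within_tolerance) for two trace arrays."""
    candidate = np.asarray(candidate, dtype="float64")
    reference = np.asarray(reference, dtype="float64")
    if candidate.shape != reference.shape:
        return False, None, False
    max_abs_diff = float(np.max(np.abs(candidate - reference)))
    return True, max_abs_diff, bool(max_abs_diff <= atol)


# Synthetic stand-ins for (num_samples, num_channels) traces.
reference = np.arange(12, dtype="int16").reshape(6, 2)
candidate = reference.copy()

result_same = compare_traces(candidate, reference)
result_shifted = compare_traces(candidate + 1, reference)

print(result_same)     # (True, 0.0, True)
print(result_shifted)  # (True, 1.0, False)
```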

arnabiswas avatar Mar 07 '25 03:03 arnabiswas