python-neo icon indicating copy to clipboard operation
python-neo copied to clipboard

Incorrect channel IDs in `SpikeGadgetsRecordingExtractor`

Open khl02007 opened this issue 2 years ago • 5 comments

Describe the bug originally here https://github.com/SpikeInterface/spikeinterface/issues/1260 Recordings from SpikeGadgets hardware have two possible channel maps: HWChan (a channel map defined by the amplifier) and Trodes channel map (for displaying on Trodes, the data acquisition software for SpikeGadgets). The mapping between these are in the header of the file that contains the recording (in .rec format). The channel IDs one gets with get_channel_ids from a SpikeGadgetsRecordingExtractor is in the Trodes order (though the channel names are HWChan names), which is sensible. But the order of the traces is completely scrambled (neither Trodes nor HWChan).

To Reproduce

  • Get example data here https://ucsf.box.com/s/ah2u2avamtf4hj54mryxqo1kvl09t7tp
  • Load it
recording_rec = si.read_spikegadgets('.../20220919_130120/20220919_130120.rec',
                                     stream_name='trodes')
  • Download latest version of Trodes https://bitbucket.org/mkarlsso/trodes/downloads/
  • this contains a useful util called trodesexport which converts data to binary
  • run trodesexport to convert the file to binary:
trodesexport -rec path_to_file.rec -kilosort
  • load binary
recording_bin_ks = si.BinaryRecordingExtractor(file_paths='.../20220919_130120/20220919_130120.kilosort/20220919_130120.group0.dat', sampling_frequency=30e3, num_chan=128, dtype='int16')
  • compare. e.g. plot the first channel of recording_rec and recording_bin_ks
  • extra: to get the channel map info, look in the header of rec file

Expected behaviour they should match (but they don't)

Environment: latest version of Trodes and spikeinterface

khl02007 avatar Feb 12 '23 01:02 khl02007

@JuliaSprenger @samuelgarcia do you have time to look into this? I'm not familiar with SpikeGadgets.

Unless @khl02007 knows exactly what should be changed ;)

alejoe91 avatar Feb 16 '23 14:02 alejoe91

Sorry, I don't have any experience with this IO. @samuelgarcia Could you take over this one?

JuliaSprenger avatar Apr 27 '23 11:04 JuliaSprenger

Hi Kyu Julia and Alessio. I think I get enhough solicitation to answer : "yes I will have look" but I do not promise anything about the delay...

samuelgarcia avatar Apr 27 '23 14:04 samuelgarcia

Thanks to the recently released version of Trodes that gives the location of data from each channel within a packet, I think I finally understand what is going on here.

As I mentioned previously, the issue is that the mapping between the voltage traces and the channel IDs is incorrect in the current version of SpikeGadgetsRecordingExtractor. When one does recording.get_channel_ids(), one gets a list of channel IDs that correspond to the hwChan of each channel in the order of appearance in the header of the .rec file. But this does not match the order of the voltage traces one gets from recording.get_traces(). I learned that this is because it is not the order in which the data is arranged in the packets streamed from the recording hardware. Rather, the data in packets are saved according to the following order:

hwChan ID of channel 0 of first chip, hwChan ID of channel 0 of second chip, hwChan ID of channel 0 of third chip, ..., hwChan ID of channel 0 of Nth chip, 
hwChan ID of channel 1 of first chip, hwChan ID of channel 1 of second chip, ..., hwChan ID of channel 1 of Nth chip,
...

In other words, the order is chip-first, hwChan-second. Here, chip refers to the amplifier chip used to acquire data (usually part of the SpikeGadgets product) e.g. the 32-channel chip from Intan. For example, a 128-channel recording using these chips would have the data in the following hwChan order:

[  0,  32,  64,  96,   1,  33,  65,  97,   2,  34,  66,  98,   3, 35,  67,  99,   4,  36,  68, 100,
   5,  37,  69, 101,  6,  38,  70,  102, 7,  39,  71, 103,  8, 40,  72, 104,   9,  41,  73, 105,
  10,  42,  74, 106,  11,  43,  75, 107,  12,  44,  76, 108,  13,  45,  77, 109,  14,  46,  78, 110, 
  15,  47,  79, 111,  16,  48,  80, 112,  17,  49,  81, 113,  18,  50,  82, 114,  19,  51,  83, 115,
  20,  52,  84, 116,  21,  53,  85, 117,  22,  54,  86, 118,  23,  55,  87, 119,  24,  56,  88, 120,
  25,  57,  89, 121,   26,  58,  90, 122,  27,  59,  91, 123,  28,  60,  92, 124,  29, 61,  93, 125, 
  30,  62,  94, 126,  31,  63,  95, 127]

When I reordered the channels of a SpikeGadgetsRecordingExtractor with this (using the channel_slice trick) I get the correct mapping from channel IDs to the voltage traces (verified by comparing against the binary recording extracted from the rec file using trodesexport).

So to parse the data correctly, we need to know (1) how many channels there are and (2) how many channels there are per chip. Dividing (1) by (2) gives you the number of chips (assuming only one type of chip is used). Both (1) and (2) can be found by parsing the header of the rec file. Specifically, (1) can be found by parsing HardwareConfiguration field and (2) can be found by parsing SpikeConfiguration field.

Now that we understand the issue better, I think we can fix it. I can't promise to do this very soon so others should feel free to take this up, but may get to this eventually.

khl02007 avatar May 25 '23 19:05 khl02007

@rly demo of the workaround:

  1. First, download hwChan_1t_128ch.trodesconf that sets the channel order of extract binary to increasing HWChan order (https://ucsf.box.com/s/18tpk3tclcagjk3wrj9p5qxrl9kzskyo)
  2. Run trodesexport -rec FILENAME.rec -reconfig hwChan_1t_128ch.trodesconf -kilosort
rec_sg = si.read_spikegadgets(file_path='FILENAME.rec', stream_id="trodes" )
rec_sg.set_channel_gains(0.195)
rec_sg.set_channel_offsets(0)

rec_binary = si.BinaryRecordingExtractor(file_paths='BINARY_FILENAME.dat', 
sampling_frequency=30e3, num_chan=128, dtype='int16')


def produce_channel_ids(n_total_channels, n_channels_per_chip=32):
    x = []
    for k in range(n_channels_per_chip):
        x.append([k+i*n_channels_per_chip for i in range(int(n_total_channels/n_channels_per_chip))])
    return [item for sublist in x for item in sublist]

channel_ids32 = produce_channel_ids(rec_sg.get_num_channels())

rec_sg_fixed = rec_sg.channel_slice(channel_ids=rec_sg.get_channel_ids(),
                                    renamed_channel_ids=channel_ids32)
  1. compare traces from rec_binary and rec_sg_fixed (they should be the same)

khl02007 avatar Jun 28 '23 09:06 khl02007

Fixed by #1496

zm711 avatar Jul 24 '24 19:07 zm711