neural_timeseries_diffusion Running the diffusion model on new data from AJILE ECOG Dataset

Hello!

I would like to apply the diffusion model to data extracted from another patient in the AJILE ECoG Dataset, but I'm having trouble with data formatting.

I have a few questions:

1. What kind of preprocessing is required for the .nwb file extracted from the dataset to make it compatible with the model? Is there any guideline available for this?
2. Is the entire raw data processed by the model in this repository, or do I need to filter specific keys?
3. What changes do I need to make to the model configuration to adapt it for a new input?

Thank you for your time and for providing this great resource!

Oct 17 '24 23:10 artpedro

Hi!

Re 1: I'm not very familiar with the full AJILE Dataset myself! The experiments in the paper used the preprocessed data from this paper (which is provided here).

Re 2: The model is in principle able to handle arbitrary neurophysiological recordings. As long as you have a training data set of $N$ recordings with $C$ channels and recording length $L$ (i.e., an $N \times C \times L$ tensor), you can apply the model to it.
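As a rough illustration of the $N \times C \times L$ layout described above, the sketch below stacks a list of equally shaped recordings into a single array. It uses NumPy with synthetic data; the helper name `stack_recordings` is hypothetical and not part of the repository, and how the repository itself loads such an array is not shown here.

```python
import numpy as np


def stack_recordings(recordings):
    """Stack a list of (C, L) arrays into one (N, C, L) float32 array.

    Hypothetical helper for illustration; not part of the repository.
    """
    shapes = {r.shape for r in recordings}
    if len(shapes) != 1:
        raise ValueError(f"All recordings must share one (C, L) shape, got {sorted(shapes)}")
    data = np.stack(recordings, axis=0).astype(np.float32)
    if data.ndim != 3:
        raise ValueError(f"Expected an (N, C, L) array, got shape {data.shape}")
    return data


# Synthetic stand-in data: 340 recordings, 56 channels, 260 samples each.
rng = np.random.default_rng(0)
recordings = [rng.standard_normal((56, 260)) for _ in range(340)]
data = stack_recordings(recordings)
print(data.shape)  # (340, 56, 260)
```

Recordings with differing channel counts or lengths are rejected rather than padded, since padding would be a modelling decision of its own.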

Re 3: If you follow the same preprocessing as described in the paper above, the hyperparameters given in the repo might be a good starting point. In general, the hyperparameters associated with the structured convolutional layers are the most important to tune!

Hope this helps!

Oct 22 '24 09:10 jsvetter

Hello, same questions here. I'm quite new to diffusion models. I have continuously recorded LFP data from ECoG electrodes with a shape of, e.g., (3, 100000), i.e., (channels, samples). I would like to use this data to impute (conditionally generate) one channel of DBS data with a shape of, e.g., (1, 100000). How should I arrange my data to feed it to the model? In the demo notebook, there is only one dataset, with shape [340, 56, 260]. Any feedback is appreciated!

Oct 31 '24 03:10 zixiao-yin

Hi!

In this case, as in the paper, it would be possible to cut the recording into shorter segments to create a training data set of shape $(N,C,L)$.
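A minimal sketch of that segmentation step, assuming a NumPy array of shape (channels, samples) as in the question above. The function name `segment_recording`, the segment length of 1000, and the optional stride are illustrative assumptions, not values or code from the repository or the paper, and the sketch does not cover how the conditioning channel is passed to the model.

```python
import numpy as np


def segment_recording(recording, segment_length, stride=None):
    """Cut a (C, T) recording into windows, returning an (N, C, L) array.

    Hypothetical helper for illustration; not part of the repository.
    """
    if stride is None:
        stride = segment_length  # default: non-overlapping windows
    num_channels, num_samples = recording.shape
    if segment_length > num_samples:
        raise ValueError("segment_length exceeds the recording length")
    # Start indices of all windows that fit completely inside the recording.
    starts = range(0, num_samples - segment_length + 1, stride)
    return np.stack([recording[:, s:s + segment_length] for s in starts], axis=0)


# Synthetic stand-in for a continuous 3-channel recording.
rng = np.random.default_rng(0)
lfp = rng.standard_normal((3, 100_000))
segments = segment_recording(lfp, segment_length=1000)
print(segments.shape)  # (100, 3, 1000)
```

Trailing samples that do not fill a complete window are dropped. Passing a stride smaller than `segment_length` yields overlapping windows and therefore more training segments from the same recording.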

Longer dependencies beyond the length of the segments will not be captured, but a trained model could still be applied to a segment of arbitrary length at inference time!

Nov 07 '24 12:11 jsvetter

Hi @jsvetter,

I'm trying to train your model on a time series dataset with 200 channels and 2000 samples (time stamps). How can I modify the code to make it compatible?

Thank you for your time and the awesome work you've done!

Apr 07 '25 20:04 alirouzbayani1