pykilosort
Set up benchmarking & testing against the current MATLAB version
NB: the results from this are likely to be hard to interpret without #18.
Rough steps required:
- [x] Run eMouse simulation in MATLAB
- [x] Run Kilosort2 MATLAB version, serialize results to phy format
- [x] Run Kilosort2 python version, serialize results to phy format
- [ ] Determine similarity metrics: which units are found, how many spike times they share, etc. IN PROGRESS
- [ ] Script to run all of the above automatically. IN PROGRESS
- [ ] @rossant's suggestion: to find the divergence point in the implementation, compare the outputs of each step (preprocessing, main loop, postprocessing) when each step receives the same inputs in both MATLAB and Python. IN PROGRESS
- [ ] Test using the process from above but pulling the eMouse simulation file and the MATLAB results from the internet (or using locally saved files) - this test should be independent of MATLAB and will serve as a regression / parity test.
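As a starting point for the similarity metrics item, here is a minimal sketch of one possible measure: pair spikes from two sorted spike trains when they fall within a small tolerance, then score each pair of units as matched / (n_a + n_b - matched). The function names, the `tol` parameter and its default of 3 samples are assumptions for illustration, not anything that exists in pykilosort. The per-unit spike trains would be built from the `spike_times.npy` and `spike_clusters.npy` files of each phy export; that loading step is left out here.

```python
def count_matched_spikes(times_a, times_b, tol=3):
    """Count one-to-one spike pairs within `tol` samples.

    Both inputs must be sorted sequences of spike times (in samples).
    """
    i = j = matched = 0
    while i < len(times_a) and j < len(times_b):
        diff = times_a[i] - times_b[j]
        if abs(diff) <= tol:
            # Close enough: count the pair and consume both spikes.
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # spike in A is too early to match anything left in B
        else:
            j += 1  # spike in B is too early to match anything left in A
    return matched


def agreement(times_a, times_b, tol=3):
    """Matched spikes divided by the size of the union of both trains."""
    n = count_matched_spikes(times_a, times_b, tol)
    union = len(times_a) + len(times_b) - n
    return n / union if union else 0.0


def best_matches(units_a, units_b, tol=3):
    """Map each unit id in `units_a` to (best unit id in `units_b`, score).

    `units_a` and `units_b` are dicts of unit id -> sorted spike times.
    A unit with no overlap at all maps to (None, 0.0).
    """
    result = {}
    for id_a, times_a in units_a.items():
        best_id, best_score = None, 0.0
        for id_b, times_b in units_b.items():
            score = agreement(times_a, times_b, tol)
            if score > best_score:
                best_id, best_score = id_b, score
        result[id_a] = (best_id, best_score)
    return result
```

Running `best_matches(matlab_units, python_units)` gives a quick answer to "which units were found by both": units whose best score is near 1 are shared, units stuck near 0 exist in only one of the two outputs. The greedy pairing is a simplification and a proper comparison may want a stricter one-to-one unit assignment.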
Additional notes:
- identify specific datasets where there is a discrepancy between MATLAB and Python (I think @jaib1 has some?)
- to find the divergence point in the implementation, compare the outputs of the different steps: preprocessing, main loop, postprocessing, when each step receives the same inputs in both MATLAB and Python
- once MATLAB and Python match well on all tested datasets, add these datasets to the automated testing suite
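The step-by-step comparison could look roughly like the sketch below: feed identical inputs to each stage in both implementations, collect the outputs, and report the first stage whose outputs differ by more than a tolerance. Everything here is hypothetical: `first_divergence`, `max_abs_diff`, the step names and the `atol` default are placeholders, and the outputs are plain nested lists to keep the example self-contained. With real arrays, something like `numpy.allclose` would be the more natural check.

```python
import math


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two nested sequences.

    Returns infinity when the shapes do not match.
    """
    a_is_num = isinstance(a, (int, float))
    b_is_num = isinstance(b, (int, float))
    if a_is_num and b_is_num:
        return abs(a - b)
    if a_is_num or b_is_num or len(a) != len(b):
        return math.inf
    return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)


def first_divergence(steps, outputs_matlab, outputs_python, atol=1e-6):
    """Return (step name, max abs difference) for the first step that differs.

    `steps` is the ordered list of step names; the two dicts map each step
    name to the output that step produced when given the same input.
    Returns None if every step agrees within `atol`.
    """
    for name in steps:
        diff = max_abs_diff(outputs_matlab[name], outputs_python[name])
        if diff > atol:
            return name, diff
    return None
```

Because each step is given the same input in both implementations, a mismatch reported for a step points at that step itself rather than at accumulated error from earlier ones.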
It would be great if @jaib1 could redo his comparisons after we port the modified CUDA kernels from @jenniferColonell, which make the algorithm deterministic.
> to find the divergence point in the implementation, compare the outputs of the different steps: preprocessing, main loop, postprocessing, when each step receives the same inputs in both MATLAB and Python
I'm doing this now. I have set up a test script that uses the MATLAB Engine API for Python (https://uk.mathworks.com/help/matlab/matlab_external/get-started-with-matlab-engine-for-python.html) to run the various steps of the sorting alongside the pykilosort version. It's a little faster to iterate than relying on file-based checkpoints, but I haven't actually nailed down where the differences are coming from yet.
Am going to have another dig tomorrow evening. Will keep this issue up to date.