[draft] New reader for Curry files, using curry-python-reader code
Hi all, as discussed in #13033, here is a draft PR to use the official Curry reader code.
In a first step I just use the reader as a module (only fixed formatting to force it past pre-commit). It has some drawbacks (e.g. data is always loaded) and I did not implement all possible data yet (e.g. HPI, or epoched recordings), but in general it already works pretty well. I tested it with all their example data, and with one of my own recordings that didn't work in MNE before.
It would be great to get some feedback on how you want me to proceed with this @drammock @larsoner:
- do we want to stick with the module approach, leave their code untouched, and work with its output (would allow easier updating when they push changes)?
- or should I merge the code more thoroughly, making it easier to maintain and clearer?
BACKGROUND: the Curry data reader currently can't handle all/newer Curry files. The plan is to port code from the official Curry reader into mne-python.
For permission, see #12855.
Closes #12795 #13033 #12855
> do we want to stick with the module approach, leave their code untouched and work with the output (would allow easier updating when they push changes), or should I merge the code more thoroughly, making it easier to maintain and clearer?
Given that @CurryKaiser refused our offer to help them package up their reader for PyPI / conda-forge, I see two remaining options:
1. "Vendor" their code. To make it slightly future-proof, we could write a script (in `tools/` I guess) that fetches the latest code from their repo, auto-formats it to make it compatible with MNE's pre-commit requirements, and puts the (formatted but otherwise unmodified) code in `mne/io/curry/_vendored.py`. (This is basically a manual version of `git submodule update`, because I don't think we should invoke `git submodule` for this use case.) We then adapt our code in `mne/io/curry.py` to be a wrapper around their code that basically just gets things into proper MNE container objects, and know that we might need to tweak our wrappers any time the vendored code is updated.
2. Fully incorporate their code. Re-write their reader to align better with our codebase, in terms of variable names, idioms like `_check_option` or `_validate_type`, features like `preload=False`, etc.
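The fetch-and-format script for option 1 could be a minimal sketch along these lines. The upstream URL, file name, and use of `ruff` as the formatter are all assumptions for illustration, not the actual repo layout:

```python
"""Sketch of a hypothetical tools/vendor_curryreader.py script."""

import subprocess
import urllib.request
from pathlib import Path

# Hypothetical URL; the real file name/branch in the upstream repo may differ.
UPSTREAM_URL = (
    "https://raw.githubusercontent.com/neuroscan/curry-python-reader/"
    "main/curryreader.py"
)
DEST = Path("mne/io/curry/_vendored.py")


def normalize(text: str) -> str:
    """Minimal cleanup so pre-commit-style checks pass: strip trailing
    whitespace and ensure the file ends with exactly one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"


def main() -> None:
    code = urllib.request.urlopen(UPSTREAM_URL).read().decode("utf-8")
    DEST.write_text(normalize(code), encoding="utf-8")
    # Let the project's formatter do the heavy lifting (assumes ruff is installed).
    subprocess.run(["ruff", "format", str(DEST)], check=True)


if __name__ == "__main__":
    main()
```

The script would be re-run whenever upstream pushes changes, replacing `_vendored.py` wholesale.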
Personally I lean toward option 2. I say this because if we're going to try to support Curry files, at a minimum we need to be able to fix bugs when they arise, and ideally we should be willing/able to incorporate new features that have high demand from our users (`preload=False` is an obvious first example). But if we're fixing bugs, do we open PRs upstream (with no guarantee of responsiveness), or tweak our "thin" wrapper to handle more and more corner cases? Neither option is appealing, so at that point it starts to seem easier to me to just maintain the entire reader ourselves.
hum... my first reaction is to push a version 0.1 of their package on PyPI and rely on this. Basically we maintain a fork and hope that the fork's changes are accepted upstream... it feels less hacky, and they also have a CI and some testing setup with test data that I would not duplicate in mne-python...
> hum... my first reaction is to push a version 0.1 of their package on PyPI and rely on this. Basically we maintain a fork and hope that the fork changes are accepted upstream... it feels less hacky and they also have a CI and some testing setup with test_data that I would not duplicate in mne-python...
that indeed is less hacky than my approach to vendoring. I'd be OK with that outcome, though curious what @larsoner will think.
I'm fine with that idea but it would be good to get some blessing/permission from them to do this
@CurryKaiser
> I'm fine with that idea but it would be good to get some blessing/permission from them to do this
xref to https://github.com/mne-tools/mne-python/issues/12855#issuecomment-2767203036 where I've asked for confirmation that Compumedics really doesn't want to be the packager and they're OK with us doing it.
> @CurryKaiser
>
> I'm fine with that idea but it would be good to get some blessing/permission from them to do this
>
> xref to #12855 (comment) where I've asked for confirmation that Compumedics really doesn't want to be the packager and they're OK with us doing it.
And nothing has changed, so all good from our side. Sorry we couldn't package it for you. And thank you for working on this!
thanks @CurryKaiser !
ok, sounds like a plan.. I can start working on this again soon if you give the go-ahead @agramfort @drammock @larsoner
I guess the fork should live in the mne-tools org? I have the necessary rights to create it
Yeah I think so
> Yeah I think so
I already made the fork
> ok, sounds like a plan.. I can start working on this again soon, if you give a go @agramfort @drammock @larsoner
+1
xref to https://github.com/mne-tools/curry-python-reader/pull/1
I could use some guidance on two things:
1. Channel locations: Curry files come with channel locations, and for EEG it was straightforward to build a montage and apply it. But for MEG it seems I need to use other functions. Any pointers would help! Do I need to populate `info["dig"]` directly?
2. HPI/cHPI data: some MEG files seem to come with these data. How do I store them in the `raw` object?
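For the EEG side, the montage step boils down to mapping channel labels to positions in meters. A minimal sketch, assuming Curry positions come in millimeters in a head-like frame (both are assumptions here; check the actual files):

```python
import numpy as np


def curry_locs_to_ch_pos(labels, locs_mm):
    """Build the ch_pos dict that mne.channels.make_dig_montage expects.
    Assumes Curry positions are millimeters in a head-like coordinate
    frame; MNE wants meters."""
    locs_m = np.asarray(locs_mm, dtype=float) / 1000.0  # mm -> m
    return {label: pos for label, pos in zip(labels, locs_m)}


# Usage for EEG (commented to keep this sketch MNE-free):
# montage = mne.channels.make_dig_montage(
#     ch_pos=curry_locs_to_ch_pos(labels, locs_mm), coord_frame="head"
# )
# raw.set_montage(montage)
#
# For MEG, set_montage does not apply; MEG sensor positions/orientations
# normally live in info["chs"][i]["loc"] instead.
```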
A few other things to discuss:
1. `preload`: easiest would be to not offer `preload=False` and just load the data into memory. A single `load_data()` call would also be doable with the official reader, but a chunk reader not really (if we don't want to hack it; e.g. load all data and discard large parts). I'm not sure I'm deep enough into the MNE codebase to know what the implications are (e.g. computations, plots etc. with unloaded data). What are your thoughts?
2. Epoched files: the reader code looks as if there could be files with epoched recordings, but there are none among their sample files. Do any of you know more about this? Otherwise I'll ask the Curry devs.
P.S. Could you remind me how to switch off the CIs when pushing these early commits?
Push commits with [ci skip] in the commit message and the long / expensive CIs shouldn't run (a few quick ones still will, I think)
... for the cHPI stuff it's probably easiest to load a Neuromag raw FIF file with cHPI info and look at how the info is stored for example in info["hpi_subsystem"]. You can also look at the Info docs, especially the notes. It's not complete but it will help.
For preload, since `preload=False` is in `main` it would be a functionality regression to remove it. Once you know how the data are stored on disk and how to read an arbitrary time slice from it, it's really not bad to do the work to make `preload=False` work. So if you can figure this part out in some function, I can help you fit it into the `_read_segment_file` code. Since the `.py` file is only a few hundred lines (a lot of which seems like plotting etc. that we won't use), I'm cautiously optimistic we can figure it out and make it work. And then the curry-python-reader code can really be for reading metadata, annotations, sensor locs, etc., plus knowing where to read the data from disk. We can probably even keep the existing `_read_segment_file`; it should work in theory...
OK, `_read_segment_file` does indeed work unchanged.
The reader should now be more or less functional:
- I'd still need some guidance on handling/storing the channel locations, esp. for MEG data
- HPI data - looks like I got it wrong - there might not be cHPI data after all, only HPI marker locations provided in different formats depending on the system
@CurryKaiser thanks for the permission to use the code, also from my side!
In another place you said you might be able to provide us with test files - could we perhaps get a small one with epoched recordings in it (format version shouldn't matter)? Your repository for the Python reader contains some test files that the reader interprets as epoched, but they don't seem to really be (perhaps the files were truncated for size).
Could be that they were truncated, let me check.
Ok, try these: EpochedData
thanks for the file @CurryKaiser
FYI, we have now packaged and published the curryreader on PyPI.
It can be installed via `pip install curryreader`.
@drammock @larsoner @agramfort it is on PyPI but not on conda-forge - how is this case dealt with in MNE? Should we also submit it to conda-forge?
Currently `pip install mne[full]` fetches it, but `conda env create --file environment.yml` doesn't.
Related question: which pip dependency level in `pyproject.toml` should this go to? I treated curryreader like the antio package (for ANT Neuro files), but this makes it an optional requirement (in `mne[full]`). I believe this means it won't be automatically installed when calling `pip install mne`?
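For reference, a hedged sketch of how an optional dependency like this typically appears in `pyproject.toml` (the exact extras grouping in MNE's real file may differ; check it before copying):

```toml
# Sketch only; section contents are illustrative, not MNE's actual lists.
[project.optional-dependencies]
full = [
    "curryreader",
    # ... other optional readers such as antio, mffpy, edfio ...
]
```

Packages listed only under an extra like `full` are installed by `pip install mne[full]` but not by a plain `pip install mne`; that behavior is defined by the packaging spec, not by MNE.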
> it is on PyPI but not on conda-forge - how is this case dealt with in MNE? should we also submit it to conda forge?
Yeah use grayskull, it's not too painful, see for example https://github.com/conda-forge/staged-recipes/pull/28279
> Yeah use grayskull

See the conda-forge PR: https://github.com/conda-forge/staged-recipes/pull/29754
@larsoner @drammock
Could you point me to where I set the curryreader dependency for the failing pipelines Tests/ubuntu-latest/pip and mne-tools.mne-python?
I tried a few places already, and all other pipelines fetch it.
The other failures are due to my code; I'll solve those down the line.
> could you point me to where i set the curryreader dependency for the failing pipelines Tests/ubuntu-latest/pip and mne-tools.mne-python?
Before doing that... any test that requires curryreader should have a `pytest.importorskip` that skips the test rather than fails. So the easiest way to do it is actually to have a module-level `pytest.importorskip("curryreader")` in `mne/io/curry/tests/test_curry.py`.
... but then to actually add it in the "right places", I would use `git grep mffpy` and `git grep edfio` to see where we install those optional dependencies. Doing the same for curryreader should be good enough!
> i treated curryreader like the antio package

Yeah, I had already done this with antio and edfio. mffpy was a good lead and gave a new location, but mne-tools.mne-python still fails.
I had some deadlines and will now be on vacation for a while. I'll continue working on the PR afterward.
users are asking for this; xref to forum post: https://mne.discourse.group/t/mne-io-read-raw-curry-for-curry-9/11245/2
> users are asking for this; xref to forum post: https://mne.discourse.group/t/mne-io-read-raw-curry-for-curry-9/11245/2
Good to hear. I just came back from annual leave and can start working on it again.