mne-python
mne-python copied to clipboard
ENH: Make _pick_to_idx public
@drammock other than
mne.io.constants.FIFF(which is pretty painless to keep) this is the list of what we have to keep around to make sibling packages happy. Do you think we should make some variant of_picks_to_idxpublic?
Originally posted by @larsoner in https://github.com/mne-tools/mne-python/pull/11903#discussion_r1301869325
We'll have to think about the right API and we might not want to expose all options but it should be doable
I would like to rework the channel selection/pick API with the following goal in mind:
- Reduce code duplication
- Reduce the number of functions/methods in the public API
- Harmonize the function names/arguments
- Propose a public API for
_pick_to_idx - Propose an API with input validation and a faster version skipping the validation
After going through mne._fiff.pick, I propose:
- Keep the functions returning a channel type as is,
get_channel_type_constants,channel_type,channel_indices_by_type. IMO, the naming schema is not consistent but those functions are not the priority. - Transform functions which operates on an object and return the same object into methods.
pick_channels_cov: move tomne.Covariance.pickpick_channels_forwardandpick_types_forwardmoved tomne.Forward.pickpick_infomoved tomne.Info.pick: this one is debatable as we need to prevent picking on anInfoobject attached to another object, e.g.Raw. IMO, we have 2 options:- (1) Keep
pick_infoas is and don't add a channel selection method to anInfoobject - (2) Remove
pick_info, addmne.Info._pick(private) which modifies the instance in place andmne.Info.pick(public) which always returns a copy, prevent assignment toobject.info, e.g. preventraw.info = "foo"(which might already be the case?)
- (1) Keep
- Merge
_picks_str_to_idx,_picks_to_idx,pick_channels,pick_channels_regexp,pick_typesinto:- Functions selecting on a list-like of channel names
- Functions selecting on an Info
- Private functions with additional
with_ref_meg: boolandreturn_kind: boolargument, if needed.
def pick_ch_names_to_idx(
ch_names: list[str] | tuple[str] | set[str] | NDArray[str],
picks: list[str | int], tuple[str, int], set[str | int], NDArray[str], NDArray[int], str, int,
exclude: list[str | int], tuple[str, int], set[str | int], NDArray[str], NDArray[int], str, int,
) -> NDArray[np.int32]:
pass
def pick_info_to_idx( # or pick_to_idx?
info: Info,
picks: list[str | int], tuple[str, int], set[str | int], NDArray[str], NDArray[int], str, int,
exclude: list[str | int], tuple[str, int], set[str | int], NDArray[str], NDArray[int], str, int,
) -> NDArray[np.int32]:
pass
Both picks or exclude support the same types, with str being either:
- a channel name
- a channel type
'all'or'data'- a regex
I'm not sure if allow_empty: bool should be part of the public API. IMO, this argument is linked to a second one validation: bool which controls the presence or absence of validation of the input, to e.g. speed-up the selection if you know your inputs are valid and/or if you don't 'care'. IMO, 2 options:
- (1): include 2 arguments,
allow_empty: bool = Falseandvalidation: bool = True, to the public API. Thus, you can now chose to skip input-validation and/or have some control on it. - (2) Keep the signature of the public API as above (pros: very simple) and add private functions
_pick_ch_names_to_idxand_pick_info_to_idxwhich skip input validation.allow_empty: bool = Falseshould likely be kept in those private functions.
Simultaneously, I would also deprecate for good the legacy pick_channels methods to tend towards a simpler and intuitive API. I believe those changes cover the majority of use-cases both in mne API and in external code. Those changes can be implemented piece by piece:
- the current
mne._fiff.pickmodule can become private e.g.mne._fiff._pick. Exisitng imports are fixed to keep using the existing function. - the new
mne._fiff.pickmodule is implemented and MNE code-base can be gradually updated, one module at a time, to transition frommne._fiff._picktomne._fiff.pick
The removal of the existing public pick methods in mne._fiff.pick is backward incompatible. The impact can be mitigated by either (1) a long deprecation cycle or (2) keeping those methods as legacy for now (especially as the documentation of those methods is at the mne-level, e.g. mne.pick_info, mne.pick_types).
I believe I have the time to work on this change before 1.7 is out, but let's take the time to think about the API first ;)
Without getting into too much detail, I would add to the priority list above:
@deprecateding as few things as possible in favor of@legacying them
In theory any new functionality like cov.pick should just extend existing functionality, so hopefully that function becomes a one-line wrapper like return cov.copy().pick(...), in which case keeping pick_channels_cov in the namespace carries minimal maintenance burden. We could even probably remove it from the API page to help with your point (2) above. I like the idea of modernizing the interfaces but leaving the (decade-?)old functions around for backward compat when the cost is minimal like this.
Beyond that I would suggest breaking this up as much as possible, otherwise the diff and review will be unmanageable. Hardest starting point (but maybe most satisfying?) might be to try to simplify the internal code to be more DRY. Simplest might be with something like implementing a cov.pick. A sort of in between would probably be to add the two new public functions you propose that directly address the _pick_to_idx-to-public issue in the title. But maybe you have some other plan...
There are a lot of details to work out with the above proposal, so I suggest @drammock and others who are interested first comment as well on the general idea, then @mscheltienne you propose what a first PR could include, then once we converge on that you can open it. That way we don't have to discuss everything all at once but can make incremental progress with a shared vision.
For me I'm overall +1 on these ideas subject to my suggestion to prefer @legacy over @deprecated.
I'll take a deeper look at the proposal soon, but meanwhile I'm adding a crossref to https://github.com/mne-tools/mne-python/issues/11531 and especially https://github.com/mne-tools/mne-python/issues/11531#issuecomment-1464442314
With the proposed approach, we can keep most, if nit all functions with the legacy decorator.
I agree we should break things apart. I propose to get a first PR with some minor changes and the structure of the new functions, i. e. function names, arguments and type-hints for the arguments and returns. We can iterate on the structure before coding the logic.
regarding pick_info
pick_infomoved tomne.Info.pick: this one is debatable as we need to prevent picking on anInfoobject attached to another object, e.g.Raw.
another possibility that occurs to me here is that raw.pick() and raw.info.pick() could do the same thing (i.e. both could modify raw inplace). On its face that seems like a good API to me, but it opens up the possibility of
my_info = raw.info
# ...many lines later:
my_info.pick() # modifies `raw` (possibly unwittingly)
So I think we should avoid that. And obviously we can't modify an attached info without modifying the instance it's attached to, so a public raw.info.pick() must return a copy, which is at odds with how .pick() works on all the instances; so I feel like it's an unintuitive API in that sense. To my mind that leaves only two possibilities: your option (1):
Keep
pick_infoas is and don't add a channel selection method to an Info object
or an option where info becomes aware of whether it's attached to an instance, and if so, refuses to execute its .pick() method. That feels a bit overengineered and possibly fragile, so I'm voting for your option(1).
regarding functions selecting on a list-like of channel names
You'll see in https://github.com/mne-tools/mne-python/issues/11531#issuecomment-1464442314 that I advocated for @legacying pick_channels and pick_channels_regexp, but that idea was not popular with @agramfort and @jasmainak at least. Now, you're at least proposing to keep the functionality in a new, more flexible function, so maybe the idea will be more popular, but I worry a little bit about a single function being "too magic" in deciding when to interpret its input as a regexp and when to treat it like a plain channel name. Perhaps we take inspiration from Pandas here? https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.filter.html#pandas.DataFrame.filter has items, like, and regex params which are mutually exclusive.
regarding pick and exclude
Maybe I'm misunderstanding but you seem to suggest that it will be possible to pass exclude="all" or exclude="data"; to me this seems pointless to support. Are there use cases you're imagining/encountering here?
For pick_info, I agree, let's keep the existing function as it would be counter-intuitive to have a method not modify the instance in-place.
For pick_channels and pick_channels_regexp, you wrote:
hey should also be marked @legacy or even deprecated/removed
I believe @jasmainak and @agramfort were not advocating against @legacy, but against deprecation. I would prefer to mark them as @legacy as well along the rest of the functions mentioned. IMO, @legacy functions should also be removed from the documentation build.
I worry a little bit about a single function being "too magic"
I usually agree and prefer to split functionalities, but here it doesn't seem that difficult. If a string is provided, is it a channel name? is it a channel type? is it a valid regex? We could even drop this last point by supporting re.Pattern directly for regex.
For pick and exclude, they should accept the same arguments.. beside those 2 non-sense cases and beside exclude=None since the default is () in other part of the API :sweat_smile: