aeon icon indicating copy to clipboard operation
aeon copied to clipboard

[BUG] `BaseCollectionEstimator` `_check_X` does not raise an error for different number of channels

Open MatthewMiddlehurst opened this issue 1 year ago • 2 comments

Describe the bug

_check_X checks whether a data collection is valid, but does not catch datasets which have a different number of channels for each case which is unsupported.

Steps/Code to reproduce the bug

import numpy as np
import pytest
from aeon.base import BaseCollectionEstimator

dummy = BaseCollectionEstimator()
all_tags = {
    "capability:multivariate": True,
    "capability:unequal_length": True,
    "capability:missing_values": True,
}
dummy.set_tags(**all_tags)
X = [np.random.random(size=(2, 10)), np.random.random(size=(3, 10))]
with pytest.raises(Exception):
    dummy._check_X(X)

Expected results

An informative exception is rasied.

Actual results

No exception is raised.

Versions

No response

MatthewMiddlehurst avatar Aug 02 '24 10:08 MatthewMiddlehurst

@aeon-actions-bot assign @Cyril-Meyer

@MatthewMiddlehurst I got some questions you may have answers to :

  1. Currently, the function getting the number of chan is get_n_channels which "return the number of ~~timepoints~~ channels in the first element of a collectiom". get_n_channels is used here and here. Can we transform this function with a parameters like get_n_channels(X, onlyFirst=True) to have the hability to get list of channels ?

  2. _check_X is the function which is supposed to check input, but _get_X_metadata (called by _check_X) is the function calling effectively get_n_channels. Can we raise the exception from here ? (note that it is also used in _convert_X)

  3. I've found multiple reference to aeon.registry.COLLECTIONS_DATA_TYPES in the doc, and replaced them with aeon.utils.registry.COLLECTIONS_DATA_TYPES as that's where I founded them. We may want to create an issue about this ? Was it a mistake from me ?

PS : when I ask "can we", it is to be understood as : "is it a good idea ?"

Cyril-Meyer avatar Aug 02 '24 16:08 Cyril-Meyer

@Cyril-Meyer

  1. I think your merged resolution is fine. We don't support different number of channels, so just raising an error instead of adding a parameter is ok.
  2. It will go through _check_X first, as long as the error is informative so the user knows what the issue is I'm ok with it going wherever, I think that has been achieved.
  3. Thanks for the fixes 🙂. Probably don't need an issue unless you know there are more that still need to be fixed.

MatthewMiddlehurst avatar Aug 06 '24 11:08 MatthewMiddlehurst

looks like #1895 fixed this, can we close @Cyril-Meyer ?

TonyBagnall avatar Sep 17 '24 21:09 TonyBagnall

looks like #1895 fixed this, can we close @Cyril-Meyer ?

I think we can if @MatthewMiddlehurst approve 😄

Cyril-Meyer avatar Sep 18 '24 10:09 Cyril-Meyer