pykoop

Some tests failing when running locally in Python 3.13 environment

Open matt-graham opened this issue 7 months ago • 2 comments

Raising as part of openjournals/joss-reviews#7947

Expected Behavior

All tests pass.

Actual Behavior

Some (64) tests failed.

pytest short test summary info output
=================================================================================================================== short test summary info ====================================================================================================================
FAILED tests/test_kernel_approximation.py::TestSkLearn::test_compatible_estimator[RandomFourierKernelApprox(random_state=1234)-check_n_features_in_after_fitting] - AssertionError: `RandomFourierKernelApprox.transform()` does not check for consistency between input number
FAILED tests/test_kernel_approximation.py::TestSkLearn::test_compatible_estimator[RandomFourierKernelApprox(method='weight_only',random_state=1234)-check_n_features_in_after_fitting] - AssertionError: `RandomFourierKernelApprox.transform()` does not check for consistency between input number
FAILED tests/test_kernel_approximation.py::TestSkLearn::test_compatible_estimator[RandomBinningKernelApprox(random_state=1234)-check_n_features_in_after_fitting] - AssertionError: `RandomBinningKernelApprox.transform()` does not check for consistency between input number
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[KoopmanPipeline(regressor=Edmd())-check_do_not_raise_errors_in_init_or_set_params] - ValueError: not enough values to unpack (expected 2, got 1)
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[KoopmanPipeline(regressor=Edmd())-check_n_features_in_after_fitting] - AssertionError: `KoopmanPipeline.predict()` does not check for consistency between input number
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[KoopmanPipeline(lifting_functions=[('pl',PolynomialLiftingFn())],regressor=Edmd())-check_do_not_raise_errors_in_init_or_set_params] - ValueError: not enough values to unpack (expected 2, got 1)
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[KoopmanPipeline(lifting_functions=[('pl',PolynomialLiftingFn())],regressor=Edmd())-check_n_features_in_after_fitting] - AssertionError: `KoopmanPipeline.predict()` does not check for consistency between input number
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[SplitPipeline()-check_do_not_raise_errors_in_init_or_set_params] - ValueError: not enough values to unpack (expected 2, got 1)
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[SplitPipeline()-check_n_features_in_after_fitting] - AssertionError: `SplitPipeline.transform()` does not check for consistency between input number
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[SplitPipeline(lifting_functions_state=[('pl',PolynomialLiftingFn())])-check_do_not_raise_errors_in_init_or_set_params] - ValueError: not enough values to unpack (expected 2, got 1)
FAILED tests/test_koopman_pipeline.py::TestSkLearn::test_compatible_estimator[SplitPipeline(lifting_functions_state=[('pl',PolynomialLiftingFn())])-check_n_features_in_after_fitting] - AssertionError: `SplitPipeline.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[PolynomialLiftingFn()-check_n_features_in_after_fitting] - AssertionError: `PolynomialLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[DelayLiftingFn()-check_n_features_in_after_fitting] - AssertionError: `DelayLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[SkLearnLiftingFn(transformer=MaxAbsScaler())-check_n_features_in_after_fitting] - AssertionError: `SkLearnLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[BilinearInputLiftingFn()-check_n_features_in_after_fitting] - AssertionError: `BilinearInputLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[RbfLiftingFn(centers=QmcCenters(random_state=1234))-check_n_features_in_after_fitting] - AssertionError: `RbfLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[ConstantLiftingFn()-check_n_features_in_after_fitting] - AssertionError: `ConstantLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_lifting_functions.py::TestSkLearn::test_compatible_estimator[KernelApproxLiftingFn(kernel_approx=RandomFourierKernelApprox(random_state=1234))-check_n_features_in_after_fitting] - AssertionError: `KernelApproxLiftingFn.transform()` does not check for consistency between input number
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Edmd()-check_estimator_tags_renamed] - TypeError: Estimator Edmd has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need to support multiple scikit-learn versions, you can implement both `__sklearn_tags__` and `_more_tags` or `_get_t...
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Edmd()-check_n_features_in_after_fitting] - AssertionError: `Edmd.predict()` does not check for consistency between input number
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta()-check_estimator_tags_renamed] - TypeError: Estimator EdmdMeta has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need to support multiple scikit-learn versions, you can implement both `__sklearn_tags__` and `_more_tags` or `_g...
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta()-check_n_features_in_after_fitting] - AssertionError: `EdmdMeta.predict()` does not check for consistency between input number
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_estimator_tags_renamed] - TypeError: Estimator EdmdMeta has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need to support multiple scikit-learn versions, you can implement both `__sklearn_tags__` and `_more_tags` or `_g...
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_fit_score_takes_y] - ValueError: Found input variables with inconsistent numbers of samples: [30, 1]
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_n_features_in_after_fitting] - AssertionError: `EdmdMeta.predict()` does not check for consistency between input number
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_pipeline_consistency] - ValueError: Found input variables with inconsistent numbers of samples: [30, 1]
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_regressors_train] - AssertionError
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_regressors_train(readonly_memmap=True)] - AssertionError
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_regressors_train(readonly_memmap=True,X_dtype=float32)] - AssertionError
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[EdmdMeta(regressor=Ridge(alpha=1))-check_methods_sample_order_invariance] - IndexError: index 5 is out of bounds for axis 0 with size 1
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmdc()-check_estimator_tags_renamed] - TypeError: Estimator Dmdc has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need to support multiple scikit-learn versions, you can implement both `__sklearn_tags__` and `_more_tags` or `_get_t...
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmdc()-check_n_features_in_after_fitting] - AssertionError: `Dmdc.predict()` does not check for consistency between input number
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimator_tags_renamed] - TypeError: Estimator Dmd has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need to support multiple scikit-learn versions, you can implement both `__sklearn_tags__` and `_more_tags` or `_get_ta...
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_fit_score_takes_y] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimators_overwrite_params] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_dont_overwrite_parameters] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimators_fit_returns_self] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_readonly_memmap_input] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_n_features_in_after_fitting] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 4)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_positive_only_tag_during_fit] - AssertionError: Estimator 'Dmd' raised ValueError unexpectedly. This happens when passing negative input values as X. If negative values are not supported for this estimator instance, then the tags.input_tags.positive_only tag needs to be set to True.
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimators_dtypes] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 5)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_dtype_object] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_pipeline_consistency] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimators_nan_inf] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimators_pickle] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_estimators_pickle(readonly_memmap=True)] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_f_contiguous_array_estimator] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressors_train] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressors_train(readonly_memmap=True)] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressors_train(readonly_memmap=True,X_dtype=float32)] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressor_data_not_an_array] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressor_multioutput] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 5 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressors_no_decision_function] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 4)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_regressors_int] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 10)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_methods_sample_order_invariance] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_methods_subset_invariance] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_dict_unchanged] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_fit_idempotent] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_fit_check_is_fitted] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_n_features_in] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[Dmd()-check_fit2d_predict1d] - ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[DataRegressor()-check_estimator_tags_renamed] - TypeError: Estimator DataRegressor has defined either `_more_tags` or `_get_tags`, but not `__sklearn_tags__`. If you're customizing tags, and need to support multiple scikit-learn versions, you can implement both `__sklearn_tags__` and `_more_tags` o...
FAILED tests/test_regressors.py::TestSkLearn::test_compatible_estimator[DataRegressor()-check_n_features_in_after_fitting] - AssertionError: `DataRegressor.predict()` does not check for consistency between input number
FAILED tests/test_util.py::TestSkLearn::test_compatible_estimator[AnglePreprocessor()-check_n_features_in_after_fitting] - AssertionError: `AnglePreprocessor.transform()` does not check for consistency between input number
========================================================================================== 64 failed, 1819 passed, 474 deselected, 1889 warnings in 412.30s (0:06:52) ==========================================================================================
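To see how the 64 failures group by scikit-learn check, the summary above can be tallied with a short script. This is only a sketch for triage: the function name `tally_failed_checks` and the regular expression are illustrative assumptions, and it relies on the test IDs ending in `-check_<name>]` as they do in the output above.

```python
import re
from collections import Counter

# Matches the trailing "-check_name" (optionally followed by "(args)")
# just before the closing "]" of a parametrized pytest test ID.
CHECK_RE = re.compile(r"-(check_\w+)(?:\([^)]*\))?\]")


def tally_failed_checks(summary_text):
    """Count FAILED lines per scikit-learn check name (hypothetical helper)."""
    counts = Counter()
    for line in summary_text.splitlines():
        if not line.startswith("FAILED "):
            continue  # skip separator lines and anything that is not a failure
        match = CHECK_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed): pytest tests -k "not mosek" | python tally.py
    for check, count in tally_failed_checks(sys.stdin.read()).most_common():
        print(f"{count:3d}  {check}")
```

Grouping this way makes it easier to tell a single upstream change (one check failing for every estimator) apart from estimator-specific problems.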

Steps to Reproduce the Problem

From a clone of the repository at commit 2215358:

  1. Create a blank Python 3.13 virtual environment (I used uv) and activate it
    uv venv --python=3.13
    source .venv/bin/activate
    
  2. Install development dependencies and local package
    uv pip install -r requirements-dev.txt
    uv pip install .
    
  3. Run tests (skipping MOSEK dependent tests)
    pytest tests -k "not mosek"
    

Specifications

  • Package version: Current main (2215358)
  • Python version: 3.13
  • Platform: Ubuntu 24.04

matt-graham, May 09 '25 10:05

All tests appear to have passed in the last GitHub Actions workflow run across Python versions 3.8 to 3.11, but that run was 6 months ago. It is therefore not clear whether the tests are failing due to a change in an (unpinned) upstream dependency, a Python 3.13-specific issue, or something else.
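One way to narrow this down is to record which versions of the upstream packages each environment actually resolved, and compare a passing environment with a failing one. The following is a minimal sketch using only the standard library; the helper name `installed_versions` and the default package list are assumptions for illustration, not part of pykoop.

```python
from importlib import metadata


def installed_versions(packages=("scikit-learn", "numpy")):
    """Return {distribution name: version string, or None if not installed}.

    The default package list is an assumption; extend it with whatever
    appears in requirements-dev.txt.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside both the Python 3.11 and 3.13 virtual environments would show whether they picked up the same dependency versions.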

matt-graham, May 09 '25 10:05

Looks like this is not specific to Python 3.13, as I also get failures when running in a Python 3.11 environment:

uv venv --python=3.11
source .venv/bin/activate
uv pip install -r requirements-dev.txt
uv pip install .
pytest tests -k "not mosek" -x

fails and exits with

FAILED tests/test_kernel_approximation.py::TestSkLearn::test_compatible_estimator[RandomFourierKernelApprox(random_state=1234)-check_n_features_in_after_fitting] - AssertionError: `RandomFourierKernelApprox.transform()` does not check for consistency between input number

There is also a series of deprecation warnings raised:

Full pytest error traceback info
____________________________________________________________________ TestSkLearn.test_compatible_estimator[RandomFourierKernelApprox(random_state=1234)-check_n_features_in_after_fitting] _____________________________________________________________________

name = 'RandomFourierKernelApprox', estimator_orig = RandomFourierKernelApprox(random_state=1234)

    @ignore_warnings(category=FutureWarning)
    def check_n_features_in_after_fitting(name, estimator_orig):
        # Make sure that n_features_in are checked after fitting
        tags = get_tags(estimator_orig)
    
        is_supported_X_types = tags.input_tags.two_d_array or tags.input_tags.categorical
    
        if not is_supported_X_types or tags.no_validation:
            return
    
        rng = np.random.RandomState(0)
    
        estimator = clone(estimator_orig)
        set_random_state(estimator)
        if "warm_start" in estimator.get_params():
            estimator.set_params(warm_start=False)
    
        n_samples = 10
        X = rng.normal(size=(n_samples, 4))
        X = _enforce_estimator_tags_X(estimator, X)
    
        if is_regressor(estimator):
            y = rng.normal(size=n_samples)
        else:
            y = rng.randint(low=0, high=2, size=n_samples)
        y = _enforce_estimator_tags_y(estimator, y)
    
        err_msg = (
            "`{name}.fit()` does not set the `n_features_in_` attribute. "
            "You might want to use `sklearn.utils.validation.validate_data` instead "
            "of `check_array` in `{name}.fit()` which takes care of setting the "
            "attribute.".format(name=name)
        )
    
        estimator.fit(X, y)
        assert hasattr(estimator, "n_features_in_"), err_msg
        assert estimator.n_features_in_ == X.shape[1], err_msg
    
        # check methods will check n_features_in_
        check_methods = [
            "predict",
            "transform",
            "decision_function",
            "predict_proba",
            "score",
        ]
        X_bad = X[:, [1]]
    
        err_msg = """\
            `{name}.{method}()` does not check for consistency between input number
            of features with {name}.fit(), via the `n_features_in_` attribute.
            You might want to use `sklearn.utils.validation.validate_data` instead
            of `check_array` in `{name}.fit()` and {name}.{method}()`. This can be done
            like the following:
            from sklearn.utils.validation import validate_data
            ...
            class MyEstimator(BaseEstimator):
                ...
                def fit(self, X, y):
                    X, y = validate_data(self, X, y, ...)
                    ...
                    return self
                ...
                def {method}(self, X):
                    X = validate_data(self, X, ..., reset=False)
                    ...
                return X
        """
        err_msg = textwrap.dedent(err_msg)
    
        msg = f"X has 1 features, but \\w+ is expecting {X.shape[1]} features as input"
        for method in check_methods:
            if not hasattr(estimator, method):
                continue
    
            callable_method = getattr(estimator, method)
            if method == "score":
                callable_method = partial(callable_method, y=y)
    
            with raises(
                ValueError, match=msg, err_msg=err_msg.format(name=name, method=method)
            ):
>               callable_method(X_bad)

.venv/lib/python3.11/site-packages/sklearn/utils/estimator_checks.py:4501: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.11/site-packages/sklearn/utils/_set_output.py:319: in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = RandomFourierKernelApprox(random_state=0)
X = array([[ 0.40015721],
       [-0.97727788],
       [ 0.4105985 ],
       [ 0.12167502],
       [-0.20515826],
       [ 0.6536186 ],
       [-1.45436567],
       [ 1.46935877],
       [-1.98079647],
       [ 1.20237985]])

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Transform data.
    
        Parameters
        ----------
        X : np.ndarray
            Data matrix.
    
        Returns
        -------
        np.ndarray
            Transformed data matrix.
        """
        sklearn.utils.validation.check_is_fitted(self)
        X = sklearn.utils.validation.check_array(X)
        X_scaled = np.sqrt(2 * self.shape) * X
>       products = X_scaled @ self.random_weights_  # (n_samples, n_components)
E       ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 4 is different from 1)

pykoop/kernel_approximation.py:271: ValueError

The above exception was the direct cause of the following exception:

self = <test_kernel_approximation.TestSkLearn object at 0x7b7a3a8e7210>, estimator = RandomFourierKernelApprox(random_state=1234), check = functools.partial(<function check_n_features_in_after_fitting at 0x7b7a3ab4f060>, 'RandomFourierKernelApprox')

    @sklearn.utils.estimator_checks.parametrize_with_checks([
        pykoop.RandomFourierKernelApprox(
            method='weight_offset',
            random_state=1234,
        ),
        pykoop.RandomFourierKernelApprox(
            method='weight_only',
            random_state=1234,
        ),
        pykoop.RandomBinningKernelApprox(random_state=1234),
    ])
    def test_compatible_estimator(self, estimator, check):
        """Test ``scikit-learn`` compatibility of estimators."""
>       check(estimator)

tests/test_kernel_approximation.py:435: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.11/site-packages/sklearn/utils/_testing.py:147: in wrapper
    return fn(*args, **kwargs)
.venv/lib/python3.11/site-packages/sklearn/utils/estimator_checks.py:4498: in check_n_features_in_after_fitting
    with raises(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.utils._testing._Raises object at 0x7b7a204dbb50>, exc_type = <class 'ValueError'>
exc_value = ValueError('matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 4 is different from 1)'), _ = <traceback object at 0x7b7a202080c0>

    def __exit__(self, exc_type, exc_value, _):
        # see
        # https://docs.python.org/2.5/whatsnew/pep-343.html#SECTION000910000000000000000
    
        if exc_type is None:  # No exception was raised in the block
            if self.may_pass:
                return True  # CM is happy
            else:
                err_msg = self.err_msg or f"Did not raise: {self.expected_exc_types}"
                raise AssertionError(err_msg)
    
        if not any(
            issubclass(exc_type, expected_type)
            for expected_type in self.expected_exc_types
        ):
            if self.err_msg is not None:
                raise AssertionError(self.err_msg) from exc_value
            else:
                return False  # will re-raise the original exception
    
        if self.matches is not None:
            err_msg = self.err_msg or (
                "The error message should contain one of the following "
                "patterns:\n{}\nGot {}".format("\n".join(self.matches), str(exc_value))
            )
            if not any(re.search(match, str(exc_value)) for match in self.matches):
>               raise AssertionError(err_msg) from exc_value
E               AssertionError: `RandomFourierKernelApprox.transform()` does not check for consistency between input number
E               of features with RandomFourierKernelApprox.fit(), via the `n_features_in_` attribute.
E               You might want to use `sklearn.utils.validation.validate_data` instead
E               of `check_array` in `RandomFourierKernelApprox.fit()` and RandomFourierKernelApprox.transform()`. This can be done
E               like the following:
E               from sklearn.utils.validation import validate_data
E               ...
E               class MyEstimator(BaseEstimator):
E                   ...
E                   def fit(self, X, y):
E                       X, y = validate_data(self, X, y, ...)
E                       ...
E                       return self
E                   ...
E                   def transform(self, X):
E                       X = validate_data(self, X, ..., reset=False)
E                       ...
E                   return X

.venv/lib/python3.11/site-packages/sklearn/utils/_testing.py:1114: AssertionError

There is also a series of sklearn deprecation warnings raised, which possibly points to this being due to a breaking change in sklearn.

matt-graham, May 09 '25 10:05

Thanks for pointing this out. sklearn has a set of built-in unit tests to make sure custom estimators fit their interface. These tests are quite strict (they match exact error messages) and they tend to change with new releases. I've updated everything to match their new tests, and I've also pinned a range of sklearn versions. It now works on my machine for Python 3.13, but the CI will tell us if it works for the other versions... There are some remaining deprecation warnings but they are upstream (in picos).
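To illustrate how strict the matching is, here is a dependency-free sketch of what `check_n_features_in_after_fitting` expects, based on the pattern shown in the traceback above. `ToyTransformer` and `passes_feature_check` are hypothetical names made up for this example; they are not pykoop or scikit-learn code. In a real estimator, the error message in the traceback suggests calling `validate_data(self, X, ..., reset=False)` rather than hand-writing the comparison.

```python
import re


class ToyTransformer:
    """Hypothetical stand-in for an estimator, for illustration only."""

    def fit(self, X, y=None):
        # Record the number of features (columns) seen during fitting.
        self.n_features_in_ = len(X[0])
        return self

    def transform(self, X):
        n_features = len(X[0])
        if n_features != self.n_features_in_:
            # Wording follows the pattern the sklearn check searches for.
            raise ValueError(
                f"X has {n_features} features, but {type(self).__name__} "
                f"is expecting {self.n_features_in_} features as input"
            )
        return X


def passes_feature_check(estimator, X, X_bad):
    """Mimic the check: transform(X_bad) must raise a matching ValueError."""
    estimator.fit(X)
    pattern = (
        rf"X has {len(X_bad[0])} features, but \w+ "
        rf"is expecting {len(X[0])} features as input"
    )
    try:
        estimator.transform(X_bad)
    except ValueError as exc:
        # A ValueError alone is not enough; its message must match too.
        return re.search(pattern, str(exc)) is not None
    return False
```

This is why the `matmul` `ValueError` in the traceback still counts as a failure: the exception type is right, but the message does not match the expected pattern.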

sdahdah, May 10 '25 23:05

Please take a look at the linked PR and let me know if you are happy with the changes.

sdahdah, May 10 '25 23:05