scikit-learn
⚠️ CI failed on Linux_Nightly.pylatest_pip_scipy_dev ⚠️
CI is still failing on Linux_Nightly.pylatest_pip_scipy_dev (Jun 26, 2023)
- test_parallel_train
- test_dict_learning_lassocd_readonly_data
- test_iforest_parallel_regression[61]
- test_ridge_regression[long-61-True-sparse_cg]
- test_ridge_regression[long-61-False-sparse_cg]
- test_ridge_regression[wide-61-True-sparse_cg]
- test_ridge_regression[wide-61-False-sparse_cg]
- test_ridge_regression_hstacked_X[long-61-True-sparse_cg]
- test_ridge_regression_hstacked_X[long-61-False-sparse_cg]
- test_ridge_regression_hstacked_X[wide-61-True-sparse_cg]
- test_model_pipeline_same_dense_and_sparse[Ridge-params6]
- test_ridge_regression_hstacked_X[wide-61-False-sparse_cg]
- test_ridge_regression_vstacked_X[long-61-True-sparse_cg]
- test_ridge_regression_vstacked_X[long-61-False-sparse_cg]
- test_ridge_regression_vstacked_X[wide-61-True-sparse_cg]
- test_ridge_regression_vstacked_X[wide-61-False-sparse_cg]
- test_ridge_regression_unpenalized[long-61-True-sparse_cg]
- test_ridge_regression_unpenalized[long-61-False-sparse_cg]
- test_ridge_regression_unpenalized[wide-61-True-sparse_cg]
- test_ridge_regression_unpenalized[wide-61-False-sparse_cg]
- test_ridge_regression_unpenalized_hstacked_X[long-61-True-sparse_cg]
- test_ridge_regression_unpenalized_hstacked_X[long-61-False-sparse_cg]
- test_ridge_regression_unpenalized_hstacked_X[wide-61-True-sparse_cg]
- test_ridge_regression_unpenalized_hstacked_X[wide-61-False-sparse_cg]
- test_ridge_regression_unpenalized_vstacked_X[long-61-True-sparse_cg]
- test_ridge_regression_unpenalized_vstacked_X[long-61-False-sparse_cg]
- test_ridge_regression_unpenalized_vstacked_X[wide-61-True-sparse_cg]
- test_ridge_regression_unpenalized_vstacked_X[wide-61-False-sparse_cg]
- test_ridge_regression_sample_weights[long-61-1.0-True-True-sparse_cg]
- test_ridge_regression_sample_weights[long-61-1.0-True-True-sag]
- test_ridge_regression_sample_weights[long-61-1.0-True-False-sparse_cg]
- test_ridge_regression_sample_weights[long-61-1.0-False-True-sparse_cg]
- test_ridge_regression_sample_weights[long-61-1.0-False-False-sparse_cg]
- test_ridge_regression_sample_weights[long-61-0.01-True-True-sparse_cg]
- test_ridge_regression_sample_weights[long-61-0.01-True-True-sag]
- test_ridge_regression_sample_weights[long-61-0.01-True-False-sparse_cg]
- test_ridge_regression_sample_weights[long-61-0.01-False-True-sparse_cg]
- test_ridge_regression_sample_weights[long-61-0.01-False-False-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-1.0-True-True-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-1.0-True-True-sag]
- test_ridge_regression_sample_weights[wide-61-1.0-True-False-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-1.0-False-True-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-1.0-False-False-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-0.01-True-True-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-0.01-True-True-sag]
- test_ridge_regression_sample_weights[wide-61-0.01-True-False-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-0.01-False-True-sparse_cg]
- test_ridge_regression_sample_weights[wide-61-0.01-False-False-sparse_cg]
- test_ridge_individual_penalties
- test_solver_consistency[seed0-20-float32-0.1-sparse_cg-False]
- test_solver_consistency[seed0-20-float32-0.1-sparse_cg-True]
- test_solver_consistency[seed0-40-float32-1.0-sparse_cg-False]
- test_solver_consistency[seed0-40-float32-1.0-sparse_cg-True]
- test_solver_consistency[seed0-20-float64-0.2-sparse_cg-False]
- test_solver_consistency[seed0-20-float64-0.2-sparse_cg-True]
- test_solver_consistency[seed1-20-float32-0.1-sparse_cg-False]
- test_solver_consistency[seed1-20-float32-0.1-sparse_cg-True]
- test_solver_consistency[seed1-40-float32-1.0-sparse_cg-False]
- test_solver_consistency[seed1-40-float32-1.0-sparse_cg-True]
- test_solver_consistency[seed1-20-float64-0.2-sparse_cg-False]
- test_solver_consistency[seed1-20-float64-0.2-sparse_cg-True]
- test_solver_consistency[seed2-20-float32-0.1-sparse_cg-False]
- test_solver_consistency[seed2-20-float32-0.1-sparse_cg-True]
- test_solver_consistency[seed2-40-float32-1.0-sparse_cg-False]
- test_solver_consistency[seed2-40-float32-1.0-sparse_cg-True]
- test_solver_consistency[seed2-20-float64-0.2-sparse_cg-False]
- test_solver_consistency[seed2-20-float64-0.2-sparse_cg-True]
- test_cross_validate[True]
- test_cross_val_predict
- test_cross_val_predict_input_types
- test_ridge_classifier_with_scoring[DENSE_FILTER-cv1-None]
- test_ridge_classifier_with_scoring[DENSE_FILTER-cv1-accuracy]
- test_ridge_classifier_with_scoring[DENSE_FILTER-cv1-_accuracy_callable]
- test_ridge_classifier_with_scoring[SPARSE_FILTER-cv1-None]
- test_ridge_classifier_with_scoring[SPARSE_FILTER-cv1-accuracy]
- test_ridge_classifier_with_scoring[SPARSE_FILTER-cv1-_accuracy_callable]
- test_ridge_regression_custom_scoring[DENSE_FILTER-cv1]
- test_ridge_regression_custom_scoring[SPARSE_FILTER-cv1]
- test_dense_sparse[_test_ridge_cv]
- test_dense_sparse[_test_ridge_diabetes]
- test_dense_sparse[_test_multi_ridge_diabetes]
- test_dense_sparse[_test_ridge_classifiers]
- test_dense_sparse[_test_tolerance]
- test_sparse_design_with_sample_weights
- test_sparse_cg_max_iter
- test_ridge_fit_intercept_sparse[61-True-sparse_cg]
- test_ridge_fit_intercept_sparse[61-True-auto]
- test_ridge_fit_intercept_sparse[61-False-sparse_cg]
- test_ridge_fit_intercept_sparse[61-False-auto]
- test_ridge_fit_intercept_sparse_sag[61-True]
- test_ridge_fit_intercept_sparse_sag[61-False]
- test_ridge_regression_check_arguments_validity[auto-csr_matrix-None-False]
- test_ridge_regression_check_arguments_validity[auto-csr_matrix-sample_weight1-False]
- test_ridge_regression_check_arguments_validity[sparse_cg-array-None-False]
- test_ridge_regression_check_arguments_validity[sparse_cg-array-sample_weight1-False]
- test_ridge_regression_check_arguments_validity[sparse_cg-csr_matrix-None-False]
- test_ridge_regression_check_arguments_validity[sparse_cg-csr_matrix-sample_weight1-False]
- test_dtype_match[sparse_cg]
- test_ridge_regression_dtype_stability[0-sparse_cg]
- test_ridge_sample_weight_consistency[61-sparse_cg-tall-False-False]
- test_ridge_sample_weight_consistency[61-sparse_cg-tall-False-True]
- test_ridge_sample_weight_consistency[61-sparse_cg-tall-True-False]
- test_ridge_sample_weight_consistency[61-sparse_cg-tall-True-True]
- test_ridge_sample_weight_consistency[61-sparse_cg-wide-False-False]
- test_ridge_sample_weight_consistency[61-sparse_cg-wide-False-True]
- test_ridge_sample_weight_consistency[61-sparse_cg-wide-True-False]
- test_ridge_sample_weight_consistency[61-sparse_cg-wide-True-True]
- test_ridge_sample_weight_consistency[61-sag-tall-True-True]
- test_ridge_sample_weight_consistency[61-sag-wide-True-True]
- test_sag_regressor_computed_correctly
- test_sag_regressor[0]
- test_sag_regressor[1]
- test_sag_regressor[2]
- test_estimators[RegressorChain(base_estimator=Ridge())-check_estimator_sparse_data]
- test_estimators[Ridge()-check_estimator_sparse_data]
- test_estimators[RidgeClassifier()-check_estimator_sparse_data]
- test_estimators[MultiOutputRegressor(estimator=Ridge())-check_estimator_sparse_data]
- test_estimators[StackingRegressor(estimators=[('est1',Ridge(alpha=0.1)),('est2',Ridge(alpha=1))])-check_estimator_sparse_data]
- test_estimators[VotingRegressor(estimators=[('est1',Ridge(alpha=0.1)),('est2',Ridge(alpha=1))])-check_estimator_sparse_data]
- test_search_cv[HalvingGridSearchCV(cv=2,estimator=Ridge(),min_resources='smallest',param_grid={'alpha':[0.1,1.0]},random_state=0)-check_estimator_sparse_data0]
- test_search_cv[HalvingGridSearchCV(cv=2,estimator=Ridge(),min_resources='smallest',param_grid={'alpha':[0.1,1.0]},random_state=0)-check_estimator_sparse_data1]
- test_meta_estimators_delegate_data_validation[MultiOutputRegressor]
- test_meta_estimators_delegate_data_validation[StackingRegressor]
- test_meta_estimators_delegate_data_validation[TransformedTargetRegressor]
- test_meta_estimators_delegate_data_validation[VotingRegressor]
- test_base_chain_fit_and_predict_with_sparse_data_and_cv
/take
The culprit is pandas dev. I will bisect to find which commit changed the behaviour.
So it comes from this commit: https://github.com/pandas-dev/pandas/pull/52542
It comes from calling `pd.concat(..., ignore_index=True)` with a first dataset containing `None` (thus an `object` dtype) and a second dataset containing `np.nan` and floats (thus a `float64` dtype).
The previous behaviour cast the column to `object` dtype, while the new behaviour casts it to `float64`.
I am trying to craft a minimal reproducer.
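A rough sketch of what such a reproducer might look like, assuming the description above is complete. The frame and column names are hypothetical, and the first dtype printed depends on the installed pandas version, so no expected output is given for it. The second half shows the workaround suggested by the pandas warning text: casting the all-NA column explicitly before concatenating.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the description: an object column holding None
# and a float64 column holding np.nan and a float.
left = pd.DataFrame({"a": pd.Series([None, None], dtype=object)})
right = pd.DataFrame({"a": [np.nan, 1.0]})

# The resulting dtype is what changed between pandas versions.
combined = pd.concat([left, right], ignore_index=True)
print(pd.__version__, combined["a"].dtype)

# Casting the all-NA column first makes the result dtype explicit.
left_cast = left.astype({"a": "float64"})
combined_cast = pd.concat([left_cast, right], ignore_index=True)
print(combined_cast["a"].dtype)
```

Running this on the pandas release and on pandas dev should show whether the first dtype differs between the two.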
There are plenty of recent errors (~280), probably due to a pandas change (and maybe numpy too?). The symptoms look like this:
- FutureWarning: is_sparse is deprecated and will be removed in a future version. Check `isinstance(dtype, pd.SparseDtype)` instead.
- FutureWarning: The behavior of DataFrame concatenation with all-NA entries is deprecated. In a future version, this will no longer exclude all-NA columns when determining the result dtypes. To retain the old behavior, cast the all-NA columns to the desired dtype before the concat operation.
- ValueError: setting an array element with a sequence.
- DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
The number of failures per error looks like this (a quick and dirty analysis, so it may miss a few kinds of errors):
245 FutureWarning: is_sparse is deprecated and will be removed in a future version. Check `isinstance(dtype, pd.SparseDtype)` instead.
19 DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
14 ValueError: setting an array element with a sequence.
5 FutureWarning: The behavior of DataFrame concatenation with all-NA entries is deprecated. In a future version, this will no longer exclude all-NA columns when determining the result dtypes. To retain the old behavior, cast the all-NA columns to the desired dtype before the concat operation.
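For reference, the NumPy 1.25 deprecation in the list above is triggered by converting an array with `ndim > 0` directly to a Python scalar. A minimal sketch of the pattern the warning asks for, using a made-up one-element array rather than anything from the failing tests:

```python
import numpy as np

arr = np.array([3.5])  # hypothetical one-element array, ndim == 1

# Deprecated since NumPy 1.25: float(arr) on an array with ndim > 0.
# Extract the single element explicitly instead.
value_from_item = arr.item()
value_from_index = float(arr[0])

print(value_from_item, value_from_index)
```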
Opened #26287 about pandas is_sparse
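The replacement suggested by the warning message can be sketched as follows. The helper name `is_sparse_dtype` is hypothetical and is not necessarily what scikit-learn uses.

```python
import pandas as pd


def is_sparse_dtype(dtype):
    """Return True if dtype is a pandas sparse dtype (hypothetical helper)."""
    # The warning recommends this isinstance check instead of is_sparse.
    return isinstance(dtype, pd.SparseDtype)


sparse_series = pd.Series(pd.arrays.SparseArray([0, 0, 1]))
dense_series = pd.Series([0, 0, 1])

print(is_sparse_dtype(sparse_series.dtype), is_sparse_dtype(dense_series.dtype))
```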
It seems some `np.find_common_type` DeprecationWarnings are coming from pandas (https://github.com/pandas-dev/pandas/issues/53236) and should hopefully be fixed soon.
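For context, `np.find_common_type` was deprecated in NumPy 1.25 in favour of `np.result_type` and `np.promote_types`. A small sketch of the replacement calls, with dtypes picked only for illustration (this is not the actual pandas fix):

```python
import numpy as np

# Promotion between two dtypes, without the deprecated np.find_common_type.
common = np.promote_types(np.float32, np.int64)

# np.result_type accepts any number of dtypes (or arrays).
common_many = np.result_type(np.float32, np.int64, np.int8)

print(common, common_many)
```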
For your information, SciPy is currently transitioning from the sparse matrix semantics to the sparse array semantics (see https://github.com/scikit-learn/scikit-learn/issues/26418 for a discussion of what this means for scikit-learn).
If tests using sparse data fail on `pylatest_pip_scipy_dev`, feel free to ping me.
These issues have all been fixed. Let's close.