scikit-multilearn

TypeError: '(array([ 2, 4, 6, ..., 3495, 3497, 3498]), slice(None, None, None))' is an invalid key

Open talhaanwarch opened this issue 5 years ago • 3 comments

I am not sure why this error occurs. My first guess is that scikit-multilearn expects the data in a specific format.

from skmultilearn.model_selection import iterative_train_test_split
X_train, y_train, X_test, y_test = iterative_train_test_split(X, y, test_size = 0.5)

The shapes of X and y are (3500, 7) and (3500, 8). y has 8 columns, each representing one class.

C1     C2     C3      C4      C5      C6     C7     C8
0       0      0       0       0      0      0       1
1       0      0       1       0      0      0       0
0       1      0       1       0      1      0       0

Any idea how I can resolve this issue?

talhaanwarch, Feb 03 '20 04:02
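
The key in the error message, a tuple of an index array and slice(None, None, None), is what NumPy-style indexing such as X[rows, :] produces. The sketch below reproduces that failure with pandas and NumPy alone. It assumes X is a pandas DataFrame, which the question does not state outright, and the tiny frame and the name `rows` are made up for illustration. Converting to a NumPy array with .to_numpy() before splitting is one possible way around the error.

```python
import numpy as np
import pandas as pd

# Small stand-in for the data in the question (hypothetical values).
X = pd.DataFrame(np.arange(12).reshape(6, 2), columns=["f1", "f2"])
rows = np.array([0, 2, 4])  # positional row indices, like those in the error message

# A NumPy-style (rows, columns) key on a DataFrame is treated as a column key and rejected.
try:
    X[rows, :]
    failed = False
except Exception as exc:  # TypeError on older pandas; the exact class varies by version
    failed = True
    print(type(exc).__name__, exc)

# The same key works once the data is a NumPy array.
X_np = X.to_numpy()
subset = X_np[rows, :]
print(subset.shape)
```

If the DataFrame structure is not needed after the split, passing X.to_numpy() and y.to_numpy() to the splitting function avoids the problem without reimplementing anything.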

I faced the same issue. You can see that the source code of iterative_train_test_split subsets with X[train_indexes, :], which works for NumPy arrays but not for pandas DataFrames. If you reimplement the function and add .loc, it works fine:

from skmultilearn.model_selection import IterativeStratification
stratifier = IterativeStratification(n_splits=2, order=2, sample_distribution_per_fold=[0.25, 0.75])
train_indexes, test_indexes = next(stratifier.split(X, y))
X_train, y_train = X.loc[train_indexes], y.loc[train_indexes]
X_test, y_test = X.loc[test_indexes], y.loc[test_indexes]

SylwiaOliwia2, Apr 06 '20 19:04
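
One caveat about the .loc workaround: .loc selects by index label, while the indices a splitter yields are, as far as I can tell, row positions. The two coincide only when the DataFrame has the default 0..n-1 index. The sketch below uses a hypothetical frame with a shifted index and a hand-written position array in place of the stratifier output. The helper name `take_rows` is also made up.

```python
import numpy as np
import pandas as pd

def take_rows(frame, positions):
    """Select rows by integer position, whatever labels the index carries."""
    return frame.iloc[positions]

# Hypothetical label frame whose index no longer runs 0..n-1 (e.g. after filtering rows).
y = pd.DataFrame({"C1": [0, 1, 0, 1], "C2": [1, 0, 1, 0]}, index=[10, 11, 12, 13])
train_positions = np.array([0, 2])  # stand-in for indices yielded by a splitter

# Position-based selection picks the first and third rows.
y_train = take_rows(y, train_positions)
print(y_train.index.tolist())

# Label-based selection looks for labels 0 and 2, which do not exist here.
try:
    y.loc[train_positions]
    loc_ok = True
except KeyError:
    loc_ok = False
print(loc_ok)
```

Using .iloc, or calling reset_index(drop=True) on X and y before splitting, keeps the workaround correct for frames that have been filtered or shuffled beforehand.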

What do n_splits=2 and order=2 mean?

pratikchhapolika, Sep 12 '21 10:09

Thank you! This worked for us

zachschillaci27, Jun 12 '23 16:06