AggTarget raises when y is a Series

Open jeromedockes opened this issue 1 year ago • 1 comments

Describe the bug

When `y` is a pandas Series rather than a NumPy array, `AggTarget.fit_transform` raises an exception.

Steps/Code to Reproduce

import pandas as pd
from skrub import AggTarget
import skrub

print(skrub.__version__)

X = pd.DataFrame({
    "flightId": range(1, 7),
    "from_airport": [1, 1, 1, 2, 2, 2],
    "total_passengers": [90, 120, 100, 70, 80, 90],
    "company": ["DL", "AF", "AF", "DL", "DL", "TR"],
})
y = pd.Series([1, 1, 0, 0, 1, 1])
agg_target = AggTarget(
    main_key="from_airport",
    operation=["hist(2)"],
)
agg_target.fit_transform(X, y)

Expected Results

No error is raised.

Actual Results

import pandas as pd
from skrub import AggTarget

X = pd.DataFrame({
    "flightId": range(1, 7),
    "from_airport": [1, 1, 1, 2, 2, 2],
    "total_passengers": [90, 120, 100, 70, 80, 90],
    "company": ["DL", "AF", "AF", "DL", "DL", "TR"],
})
y = pd.Series([1, 1, 0, 0, 1, 1])
agg_target = AggTarget(
    main_key="from_airport",
    operation=["hist(2)"],
)
agg_target.fit_transform(X, y)

Versions

I get the same result with `0.1.0` and on the main branch.

jeromedockes, Jul 29 '24 15:07

Note: this only happens if the `y` Series doesn't have a name.

jeromedockes, Jul 29 '24 15:07
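
Since the error is reported only for an unnamed Series, one possible workaround until this is fixed is to give `y` a name before fitting. The sketch below uses only pandas; `ensure_named` and the default name `"target"` are hypothetical, chosen for illustration, and are not part of skrub.

```python
import pandas as pd


def ensure_named(y, default_name="target"):
    """Return y as a pandas Series that has a name.

    Hypothetical helper for illustration only; not part of skrub.
    """
    if not isinstance(y, pd.Series):
        y = pd.Series(y)
    # Series.rename with a scalar returns a new Series with that name;
    # the original Series is left unchanged.
    return y if y.name is not None else y.rename(default_name)


y = pd.Series([1, 1, 0, 0, 1, 1])  # unnamed, as in the report above
y_named = ensure_named(y)
print(y.name, y_named.name)
```

Going by the note above, passing `y_named` instead of `y` to `agg_target.fit_transform(X, y_named)` should avoid the exception, though that has not been verified here.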