
Pythonw.exe has Stopped Working when setting 'monotone_constraints'.

Open · takeyama0 opened this issue 2 years ago · 1 comment

Description

2022/4/26 On Windows 10, I encountered this error when I called lightgbm.train() with the monotone_constraints parameter set, using params like the following.

{ 'objective': 'regression',
  'verbose': -1,
  'monotone_constraints': [1] * 15 + [0] * 23 + [-1] * 4 } 
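As far as the LightGBM parameter documentation describes it, monotone_constraints takes one entry per feature, in column order, with 1 for increasing, -1 for decreasing, and 0 for no constraint. The list above has 15 + 23 + 4 = 42 entries, so it presumably matches a 42-feature dataset. A minimal sketch of a pre-flight check follows; check_monotone_constraints and the f0..f41 feature names are hypothetical stand-ins, not part of LightGBM or of the original data.

```python
def check_monotone_constraints(constraints, feature_names):
    """Hypothetical pre-flight check; not part of the LightGBM API."""
    allowed = (-1, 0, 1)
    bad = [c for c in constraints if c not in allowed]
    if bad:
        raise ValueError(f"constraints must be -1, 0 or 1, got {bad[:5]}")
    if len(constraints) != len(feature_names):
        raise ValueError(
            f"{len(constraints)} constraints given for {len(feature_names)} features"
        )
    # Pair each feature with its constraint so the mapping can be inspected.
    return dict(zip(feature_names, constraints))


# Placeholder names standing in for the 42 confidential columns (assumption).
feature_names = [f"f{i}" for i in range(42)]
constraints = [1] * 15 + [0] * 23 + [-1] * 4
mapping = check_monotone_constraints(constraints, feature_names)
print(mapping["f0"], mapping["f20"], mapping["f41"])  # prints: 1 0 -1
```

A check like this only rules out a length or value mismatch as the cause; it says nothing about the crash itself.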

Event Viewer reports the error as follows.

Problem signature
P1: python.exe
P2: 3.9.5150.1013
P3: 60903347
P4: ucrtbase.dll
P5: 10.0.19041.789
P6: 2bd748bf
P7: 000000000007286e
P8: c0000409
P9: 0000000000000007
P10: 

2022/5/9 I found that the error occurs when 'monotone_constraints' is used together with columns of pandas 'category' dtype.
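Until the underlying crash is resolved, one option is to detect this combination in Python and raise a readable error instead of letting the interpreter die. The sketch below is only a guard, not a fix: guard_monotone_with_categoricals is a hypothetical helper, and the dtype mapping is hard-coded to mirror the sample data (x and x0..x39 numeric, x40..x49 categorical).

```python
def categorical_columns(dtypes_by_column):
    """Return names of columns whose dtype string is 'category'."""
    return [name for name, dtype in dtypes_by_column.items()
            if str(dtype) == "category"]


def guard_monotone_with_categoricals(params, dtypes_by_column):
    """Hypothetical guard; raises before training is ever started.

    Any non-empty monotone_constraints list counts as "set", because the
    report shows a crash even with constraint 0 on the categorical columns.
    """
    constraints = params.get("monotone_constraints")
    cat_cols = categorical_columns(dtypes_by_column)
    if constraints and cat_cols:
        raise ValueError(
            "monotone_constraints is set and these columns are categorical: "
            + ", ".join(cat_cols)
        )


# Dtypes mirroring the sample data with set_category = True (assumption).
dtypes = {"x": "float64"}
dtypes.update({f"x{i}": "float64" for i in range(40)})
dtypes.update({f"x{i}": "category" for i in range(40, 50)})
params = {"objective": "mse", "monotone_constraints": [1] + [0] * 50}

try:
    guard_monotone_with_categoricals(params, dtypes)
    guard_raised = False
except ValueError as err:
    guard_raised = True
    print(err)
```

With a real DataFrame, the mapping could be built as {col: str(dt) for col, dt in df.dtypes.items()} and the guard called just before lgb.train().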

Reproducible example

2022/4/26 I cannot share the code as it is because I am dealing with confidential business data. I'm trying to create reproducible example code.

2022/5/9 I created the following reproducible sample code.

# import library
import numpy as np
import pandas as pd
import altair as alt
import lightgbm as lgb

# create example dataset
size = 100
df = pd.DataFrame(
    {
        "x": np.linspace(0, 10, size),
        "y": np.linspace(0, 10, size)**2 + 10 - (20 * np.random.random(size))
    } | {f"x{i}": np.random.random(size) if i < 40 else np.random.randint(0, 10, size) for i in range(0, 50)}
)

# With set_category = False, training the LightGBM model works.
# With set_category = True, x40..x49 become 'category' columns and training crashes.
set_category = False

if set_category:
    df[[f'x{i}' for i in range(40, 50)]] = df[[f'x{i}' for i in range(40, 50)]].astype('category')

df.head(5)

print(df.dtypes)

# train LightGBM model with monotone constraints
lgb_train = lgb.Dataset(df.drop('y', axis=1), df["y"])
params = {
    'objective': 'mse',
    'verbose': -1,
    'num_threads':8,
    'min_child_samples': 5,
    # one entry per feature in column order: "x" increasing, x0..x39 and x40..x49 unconstrained
    'monotone_constraints': [1] + [0] * 40 + [0] * 10,
}

monotone_model = lgb.train(
    params,
    lgb_train,
    num_boost_round=100,
)

# plot the x-dependence of the model outputs
df_tmp = df.copy()
df_tmp[[f"x{i}" for i in range(0, 40)]] = 0.5
df_tmp[[f"x{i}" for i in range(40, 50)]] = 0

monotone_output = pd.DataFrame(
    {
        "x": df_tmp["x"],
        "y": df_tmp["y"],
        "y_pred": monotone_model.predict(df_tmp.drop('y', axis=1))
    }
)

alt.Chart(monotone_output).mark_point().encode(
    x="x",
    y="y"
) + alt.Chart(monotone_output).mark_line().encode(
    x="x",
    y="y_pred",
    color=alt.value("red")
)

Environment info

LightGBM version or commit hash: 3.3.2

Command(s) you used to install LightGBM

pip install lightgbm

Runtime environment: x64 Windows 10 Pro 20H2, Microsoft Visual C++ 2015-2022 Redistributable (x64) 14.31.31103

Additional Comments

Because the error pointed at a C runtime module (ucrtbase.dll), I reinstalled the Visual C++ Redistributable from the Microsoft site and tried again, but the same error occurred. I also watched Task Manager while lightgbm.train() was running, and the machine did not appear to be running out of memory.

takeyama0 · Apr 26 '22 10:04

I have added the sample code and updated the comments. Setting 'set_category' to True in the sample code triggers the error, so the cause appears to be using 'monotone_constraints' together with pandas 'category' columns.

takeyama0 · May 09 '22 04:05