[python-package] [dask] Dask estimators raise an unavoidable warning

Open jameslamb opened this issue 1 year ago • 7 comments

Description

When training a model with the lightgbm.dask estimators, this warning is always emitted:

/Users/runner/miniforge/envs/test-env/lib/python3.11/site-packages/lightgbm/dask.py:549: UserWarning: Parameter n_jobs will be ignored.
  _log_warning(f"Parameter {param_alias} will be ignored.")

Nothing in lightgbm's public interface can suppress this, and it shows up even when every parameter is left at its default value. That's a little annoying and a little confusing, so it should be changed.

Specifically, for num_threads and its aliases, the warning should not be raised if the value is -1 or None.
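As a rough illustration of that proposal (not the actual patch), the check could look something like the sketch below. The helper name drop_ignored_thread_params is hypothetical, and NUM_THREADS_ALIASES is a hand-written copy of the alias set quoted further down in this issue; real code would presumably use _ConfigAliases and _log_warning instead.

```python
import warnings

# Hand-copied from the alias set quoted in this issue (an assumption, not an import).
NUM_THREADS_ALIASES = {"n_jobs", "num_threads", "num_thread", "nthread", "nthreads"}


def drop_ignored_thread_params(params):
    """Hypothetical helper: remove thread-count aliases from a copy of params.

    A warning is emitted only when the caller set an explicit value,
    i.e. anything other than None or -1.
    """
    cleaned = dict(params)
    for alias in sorted(NUM_THREADS_ALIASES):
        if alias not in cleaned:
            continue
        value = cleaned.pop(alias)
        if value is None or value == -1:
            # Default-like value: drop it silently.
            continue
        warnings.warn(f"Parameter {alias} will be ignored.", UserWarning, stacklevel=2)
    return cleaned


# Default-like values are removed without any warning.
print(drop_ignored_thread_params({"n_jobs": None, "n_estimators": 10}))
# prints {'n_estimators': 10}
```

With this shape, a call that passes n_jobs=4 would still warn, which keeps the message useful for people who set the value on purpose.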

Reproducible example

import dask.array as da
import lightgbm as lgb
from distributed import Client, LocalCluster
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=1000, n_features=50, centers=2)

cluster = LocalCluster()
client = Client(cluster)

dX = da.from_array(X, chunks=(100, 50))
dy = da.from_array(y, chunks=(100,))
dask_model = lgb.DaskLGBMClassifier(n_estimators=10)
dask_model.fit(dX, dy)
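Until this is fixed, one possible workaround is Python's own warnings filter rather than anything in lightgbm. The sketch below assumes the message is emitted as a UserWarning through the standard warnings module in the process that calls fit(), as the output above suggests; call_without_thread_warning is a hypothetical helper name.

```python
import warnings


def call_without_thread_warning(func, *args, message="Parameter n_jobs will be ignored", **kwargs):
    """Hypothetical helper: call func while hiding one specific UserWarning.

    The filter is scoped to this call, so other warnings still surface
    and the global warning configuration is restored afterwards.
    """
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings("ignore", message=message, category=UserWarning)
        return func(*args, **kwargs)


# Usage with the objects from the example above (needs a running Dask cluster):
# call_without_thread_warning(dask_model.fit, dX, dy)
```

This only hides the symptom on the caller's side; it does not change which parameters are dropped.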

Environment info

LightGBM version or commit hash: https://github.com/microsoft/LightGBM/commit/3654ecaaa1a33628dd2b1cc936c3f1cfd31079a4

Command(s) you used to install LightGBM

cmake -B build -S .
cmake --build build --target _lightgbm
sh build-python.sh install --precompile

Additional Comments

This is coming from here:

https://github.com/microsoft/LightGBM/blob/3654ecaaa1a33628dd2b1cc936c3f1cfd31079a4/python-package/lightgbm/dask.py#L544-L550

I think this is because n_jobs is an alias for num_threads.

from lightgbm.basic import _ConfigAliases
_ConfigAliases.get("num_threads")
# {'n_jobs', 'num_threads', 'num_thread', 'nthread', 'nthreads'}

And n_jobs is guaranteed to be present in params at that point, because it's in the signature of the estimators.

https://github.com/microsoft/LightGBM/blob/3654ecaaa1a33628dd2b1cc936c3f1cfd31079a4/python-package/lightgbm/dask.py#L1135
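To make that reasoning concrete, here is a small self-contained sketch of why a plain membership check fires every time. StandInEstimator and aliases_that_would_warn are hypothetical stand-ins, not lightgbm code, and the n_jobs=None default is an assumption for illustration; the point is only that a scikit-learn-style get_params() reports every constructor argument, whether or not the user set it.

```python
import inspect

# Hand-copied from the alias set quoted above (an assumption, not an import).
NUM_THREADS_ALIASES = {"n_jobs", "num_threads", "num_thread", "nthread", "nthreads"}


class StandInEstimator:
    """Hypothetical estimator whose constructor accepts n_jobs."""

    def __init__(self, n_estimators=100, n_jobs=None):
        self.n_estimators = n_estimators
        self.n_jobs = n_jobs

    def get_params(self):
        # Like scikit-learn estimators, report every constructor argument.
        names = inspect.signature(self.__init__).parameters
        return {name: getattr(self, name) for name in names}


def aliases_that_would_warn(params):
    """Aliases that a plain `alias in params` check would flag."""
    return sorted(alias for alias in NUM_THREADS_ALIASES if alias in params)


params = StandInEstimator(n_estimators=10).get_params()
print(params)                           # prints {'n_estimators': 10, 'n_jobs': None}
print(aliases_that_would_warn(params))  # prints ['n_jobs']
```

Because the key is always there, the check has to look at the value, not just at whether the key exists.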

I noticed this in CI logs:

https://github.com/microsoft/LightGBM/actions/runs/12922031182/job/36037058867#step:3:6718

jameslamb avatar Jan 25 '25 03:01 jameslamb

May I work on this issue?

devesh-2002 avatar Jan 25 '25 06:01 devesh-2002

Sure, thanks!

jameslamb avatar Jan 25 '25 06:01 jameslamb

@jameslamb can I work on this?

zacharyftw avatar Jan 26 '25 06:01 zacharyftw

@KekmaTime thanks for your interest in LightGBM! But no, let's please give @devesh-2002 time to attempt it. If you are looking to help out here, you could consider some of the other "good first issue" items at https://github.com/microsoft/LightGBM/issues?q=state%3Aopen%20label%3A%22good%20first%20issue%22.

jameslamb avatar Jan 26 '25 17:01 jameslamb

okay 👍🏽

zacharyftw avatar Jan 27 '25 01:01 zacharyftw

Is this still an issue? If so, could I work on it?

wohoef avatar Jul 08 '25 17:07 wohoef

Thanks for your interest! Yes, you could work on it. I think it's safe to say @devesh-2002 has abandoned this (no problem, it happens!).

We'd welcome a pull request fixing this.

jameslamb avatar Jul 08 '25 17:07 jameslamb