dask-ml icon indicating copy to clipboard operation
dask-ml copied to clipboard

Whether xgboost_requires needs to be changed

Open t-karatsu opened this issue 4 years ago • 11 comments

I'm sorry if I made a mistake in the place to post.

https://github.com/dask/dask-ml/blob/main/setup.py Currently dask-ml depends on the following OSS when building with xgboost function:

xgboost_requires = ["dask-xgboost", "xgboost"]

https://github.com/dmlc/xgboost/blob/master/python-package/setup.py However currently xgboost owns the dask option. Does xgboost_requires need to be modified?

  • xgboost+dask only or
  • dask-xgboost and xgboost+dask

t-karatsu avatar Feb 17 '21 16:02 t-karatsu

To clarify, according to https://github.com/dask/dask-xgboost/issues/80, dask-xgboost is now deprecated and packages should now depend on xgboost+dask instead.

adamjstewart avatar Feb 17 '21 17:02 adamjstewart

Yep, @adamjstewart is correct. (it'd just be xgboost in the dask-ml[xgboost]` extra, since dask is already in the main requirements).

TomAugspurger avatar Feb 18 '21 02:02 TomAugspurger

@TomAugspurger Thanks for your reply. I tried to correct some sources and build, but one of the xgboost modules was not found when building docs. Is there a problem with my fix?

  • Draft of correction
diff --git a/dask_ml/xgboost.py b/dask_ml/xgboost.py
index 86e841db..70551df7 100644
--- a/dask_ml/xgboost.py
+++ b/dask_ml/xgboost.py
@@ -4,4 +4,4 @@ This may be used for training an XGBoost model on a cluster. XGBoost
 will be setup in distributed mode alongside your existing
 ``dask.distributed`` cluster.
 """
-from dask_xgboost import *  # noqa
+from xgboost import *  # noqa
diff --git a/setup.py b/setup.py
index 857f6911..fb280973 100644
--- a/setup.py
+++ b/setup.py
@@ -35,7 +35,7 @@ test_requires = [
     "pytest-mock",
 ]
 dev_requires = doc_requires + test_requires
-xgboost_requires = ["dask-xgboost", "xgboost"]
+xgboost_requires = ["xgboost[dask]"]
 complete_requires = xgboost_requires

 extras_require = {
  • Occurred warning

Only the dask_ml.xgboost.predict module was not found.

==> [2021-02-19-18:10:56.073922] 'make' '-j16' 'html'
Running Sphinx v3.2.0
making output directory... done
WARNING: html_static_path entry '_static' does not exist
loading intersphinx inventory from https://docs.python.org/3.6/objects.inv...
loading intersphinx inventory from http://scikit-learn.org/stable/objects.inv...
loading intersphinx inventory from https://docs.dask.org/en/latest/objects.inv...
loading intersphinx inventory from https://distributed.dask.org/en/latest/objects.inv...
loading intersphinx inventory from http://dask-glm.readthedocs.io/en/latest/objects.inv...
intersphinx inventory has moved: http://scikit-learn.org/stable/objects.inv -> https://scikit-learn.org/stable/objects.inv
intersphinx inventory has moved: http://dask-glm.readthedocs.io/en/latest/objects.inv -> https://dask-glm.readthedocs.io/
en/latest/objects.inv
[autosummary] generating autosummary for: changelog.rst, clustering.rst, compose.rst, contributing.rst, cross_validation.rst, examples.rst, glm.rst, history.rst, hyper-parameter-search.rst, incremental.rst, ..., keras.rst, meta-estimators.rst, modules/api.rst, modules/generted/dask_ml.compose.ColumnTransformer.rst, modules/generted/dask_ml.compose.make_column_transformer.rst, naive-bayes.rst, preprocessing.rst, pytorch.rst, roadmap.rst, xgboost.rst
WARNING: [autosummary] failed to import 'dask_ml.xgboost.predict': no module named dask_ml.xgboost.predict

This module was not loaded.

reading sources... [ 86%] modules/generated/dask_ml.wrappers.ParallelPostFit
reading sources... [ 87%] modules/generated/dask_ml.xgboost.XGBClassifier
reading sources... [ 89%] modules/generated/dask_ml.xgboost.XGBRegressor
reading sources... [ 90%] modules/generated/dask_ml.xgboost.train
reading sources... [ 91%] modules/generted/dask_ml.compose.ColumnTransformer

t-karatsu avatar Feb 19 '21 12:02 t-karatsu

@TomAugspurger Could you check this? Is it still difficult to be consistent to change dependencies now?

t-karatsu avatar Feb 21 '21 16:02 t-karatsu

I can take a look next week.

TomAugspurger avatar Feb 21 '21 17:02 TomAugspurger

Hello, @TomAugspurger. What’s the situation of this issue after that?

t-karatsu avatar Mar 10 '21 02:03 t-karatsu

I think we can just remove the xgboost stuff in the setup.py and update the docs at https://ml.dask.org/xgboost.html to use it.

People should just use xgboost.dask, and there's really no reason to add it under the dask_ml.xgboost namespace.

TomAugspurger avatar Mar 10 '21 11:03 TomAugspurger

cc @hcho3 @trivialfis @JohnZed (for vis)

jakirkham avatar Mar 23 '21 17:03 jakirkham

What you want to say is

  • Delete dask-ml / xgboost.py
  • To use the xgboost function, call it directly from xgboost.dask

don't you? I understood, thanks.

t-karatsu avatar Mar 26 '21 09:03 t-karatsu

To use the xgboost function, call it directly from xgboost.dask

Correct.

Delete dask-ml / xgboost.py

We'll want to add a deprecation warning to that model that emits when people import it.

TomAugspurger avatar Apr 04 '21 22:04 TomAugspurger

Looks like this is being worked on in #844

mmccarty avatar Sep 29 '21 02:09 mmccarty