
dxgb.train throws ValueError: need more than 1 value to unpack

koolaquarian213 opened this issue 6 years ago • 6 comments

I am running a cluster on AWS with one master and five worker nodes. All my feature variables (X_train) are continuous and have been properly cleaned, with null values filled. The target label (y_train) is 0 or 1 (float64). I get the error below when executing bst = dxgb.train(client, params, X_train, y_train), where X_train and y_train are data_train and labels_train.
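For context, a hypothetical minimal setup along these lines (the scheduler address, data, and parameters here are placeholders, not the reporter's actual values):

import numpy as np
import pandas as pd
import dask.dataframe as dd
import dask_xgboost as dxgb
from dask.distributed import Client

client = Client('tcp://scheduler:8786')  # placeholder address for the AWS cluster

# Toy stand-ins for the cleaned continuous features and 0/1 float64 labels
pdf = pd.DataFrame(np.random.rand(1000, 4), columns=['f0', 'f1', 'f2', 'f3'])
X_train = dd.from_pandas(pdf, npartitions=4)
y_train = dd.from_pandas(
    pd.Series(np.random.randint(0, 2, 1000).astype('float64')), npartitions=4)

params = {'objective': 'binary:logistic', 'max_depth': 4}
bst = dxgb.train(client, params, X_train, y_train)

The call then fails with: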

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-88-ff678a0c4ab4> in <module>()
----> 1 bst = dxgb.train(client, params, X_train, y_train)

/usr/local/lib/python2.7/site-packages/dask_xgboost/core.pyc in train(client, params, data, labels, dmatrix_kwargs, **kwargs)
    167     """
    168     return sync(client.loop, _train, client, params, data,
--> 169                 labels, dmatrix_kwargs, **kwargs)
    170 
    171 

/usr/local/lib/python2.7/site-packages/distributed/utils.pyc in sync(loop, func, *args, **kwargs)
    275             e.wait(10)
    276     if error[0]:
--> 277         six.reraise(*error[0])
    278     else:
    279         return result[0]

/usr/local/lib/python2.7/site-packages/distributed/utils.pyc in f()
    260             if timeout is not None:
    261                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 262             result[0] = yield future
    263         except Exception as exc:
    264             error[0] = sys.exc_info()

/usr/local/lib64/python2.7/site-packages/tornado/gen.pyc in run(self)
   1131 
   1132                     try:
-> 1133                         value = future.result()
   1134                     except Exception:
   1135                         self.had_exception = True

/usr/local/lib64/python2.7/site-packages/tornado/concurrent.pyc in result(self, timeout)
    259         if self._exc_info is not None:
    260             try:
--> 261                 raise_exc_info(self._exc_info)
    262             finally:
    263                 self = None

/usr/local/lib64/python2.7/site-packages/tornado/gen.pyc in run(self)
   1145                             exc_info = None
   1146                     else:
-> 1147                         yielded = self.gen.send(value)
   1148 
   1149                     if stack_context._state.contexts is not orig_stack_contexts:

/usr/local/lib/python2.7/site-packages/dask_xgboost/core.pyc in _train(client, params, data, labels, dmatrix_kwargs, **kwargs)
    119 
    120     # Start the XGBoost tracker on the Dask scheduler
--> 121     host, port = parse_host_port(client.scheduler.address)
    122     env = yield client._run_on_scheduler(start_tracker,
    123                                          host.strip('/:'),

/usr/local/lib/python2.7/site-packages/dask_xgboost/core.pyc in parse_host_port(address)
     22     if '://' in address:
     23         address = address.rsplit('://', 1)[1]
---> 24     host, port = address.split(':')
     25     port = int(port)
     26     return host, port

ValueError: need more than 1 value to unpack

koolaquarian213 · Oct 29 '18 20:10

I get this too when running a Client with processes=False.

gvelchuru · Nov 13 '18 00:11
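That symptom points at the same failing split: with processes=False the scheduler typically listens on an in-process transport, so its address has no ':port' part. A minimal sketch of the failing logic (the inproc:// address below is an assumed example, not taken from the report):

# dask_xgboost.core.parse_host_port, reduced to the relevant lines
def parse_host_port(address):
    if '://' in address:
        address = address.rsplit('://', 1)[1]
    host, port = address.split(':')  # ValueError when no ':' remains
    return host, int(port)

print(parse_host_port('tcp://127.0.0.1:8786'))  # fine: ('127.0.0.1', 8786)
try:
    parse_host_port('inproc://192.168.1.5/1234/1')  # hypothetical inproc form
except ValueError as e:
    print(e)  # "not enough values to unpack" (wording varies by Python version)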

Sorry I missed this @samsonpaturi. How did you create your client? We should probably be using distributed.utils.parse_host_address. What is the value of address in that traceback? We may need to call distributed.utils.resolve_address before splitting it.

@gvelchuru I'm curious: what's your goal using dask-xgboost with a local cluster? Any reason not to use XGBoost directly in that case?

TomAugspurger · Nov 13 '18 17:11
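A sketch of that suggestion, assuming the helpers in distributed.comm.addressing (their exact names and behavior may differ across distributed versions):

from distributed.comm.addressing import get_address_host_port, resolve_address

addr = resolve_address('tcp://localhost:8786')  # e.g. 'tcp://127.0.0.1:8786'
host, port = get_address_host_port(addr)        # ('127.0.0.1', 8786)
# For inproc:// addresses get_address_host_port raises a ValueError with an
# explicit message, rather than failing on a bare tuple unpack.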

@TomAugspurger I believe I have a similar issue.

My use case for dask-xgboost on a local cluster is that my dataset is very large. I was hoping to test the benefits of Dask when training on large datasets.

From the XGBoost docs, under "I have a big dataset":

XGBoost is designed to be memory efficient. Usually it can handle problems as long as the data fit into your memory (this usually means millions of instances). If you are running out of memory, check out the external memory version or the distributed version of XGBoost.

Here is my traceback for this problem:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-3d556431b00b> in <module>
     15           'min_child_weight': 0.5}
     16 
---> 17 bst = dask_xgboost.train(client, params, X_train, y_train, num_boost_round=10)

~/anaconda3/envs/py36/lib/python3.6/site-packages/dask_xgboost/core.py in train(client, params, data, labels, dmatrix_kwargs, **kwargs)
    167     """
    168     return sync(client.loop, _train, client, params, data,
--> 169                 labels, dmatrix_kwargs, **kwargs)
    170 
    171 

~/anaconda3/envs/py36/lib/python3.6/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    329             e.wait(10)
    330     if error[0]:
--> 331         six.reraise(*error[0])
    332     else:
    333         return result[0]

~/anaconda3/envs/py36/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

~/anaconda3/envs/py36/lib/python3.6/site-packages/distributed/utils.py in f()
    314             if timeout is not None:
    315                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 316             result[0] = yield future
    317         except Exception as exc:
    318             error[0] = sys.exc_info()

~/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/gen.py in run(self)
    727 
    728                     try:
--> 729                         value = future.result()
    730                     except Exception:
    731                         exc_info = sys.exc_info()

~/anaconda3/envs/py36/lib/python3.6/site-packages/tornado/gen.py in run(self)
    740                             exc_info = None
    741                     else:
--> 742                         yielded = self.gen.send(value)
    743 
    744                 except (StopIteration, Return) as e:

~/anaconda3/envs/py36/lib/python3.6/site-packages/dask_xgboost/core.py in _train(client, params, data, labels, dmatrix_kwargs, **kwargs)
    119 
    120     # Start the XGBoost tracker on the Dask scheduler
--> 121     host, port = parse_host_port(client.scheduler.address)
    122     env = yield client._run_on_scheduler(start_tracker,
    123                                          host.strip('/:'),

~/anaconda3/envs/py36/lib/python3.6/site-packages/dask_xgboost/core.py in parse_host_port(address)
     22     if '://' in address:
     23         address = address.rsplit('://', 1)[1]
---> 24     host, port = address.split(':')
     25     port = int(port)
     26     return host, port

ValueError: not enough values to unpack (expected 2, got 1)

zdwhite · Jun 25 '19 17:06

@zdwhite what version of dask / distributed? What's your scheduler address?

TomAugspurger · Jun 25 '19 18:06

@TomAugspurger Let me check.

# Dask
import dask.dataframe as dd
from dask.distributed import Client

client = Client()  # start a distributed scheduler locally and launch the dashboard

client  # display the client summary (with the dashboard link) in the notebook

Scheduler address (this changes between sessions, I believe): tcp://127.0.0.1:54582

dask==1.2.2
dask-cuda==0.6.0
dask-glm==0.2.0
dask-kubernetes==0.8.0
dask-ml==1.0.0
dask-xgboost==0.1.5

distributed==1.28.1

zdwhite · Jun 26 '19 03:06

@zdwhite you might want to try inserting a breakpoint in parse_host_port in dask_xgboost/core.py to debug further.
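One way to do that without a debugger is to replicate the parsing on the client's scheduler address directly in the notebook (a sketch; the printed address differs per session):

from dask.distributed import Client

client = Client()
addr = client.scheduler.address
print(addr)  # e.g. tcp://127.0.0.1:54582

# Replicate dask_xgboost.core.parse_host_port by hand
if '://' in addr:
    addr = addr.rsplit('://', 1)[1]
print(addr.split(':'))  # must yield exactly two parts for the unpack to succeed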

TomAugspurger · Jul 08 '19 15:07