[BUG] Bug with tqdm progress bar display when running in Ray client mode with remote cluster

Open • jaychia opened this issue 2 years ago • 0 comments

Describe the bug

When running against a remote cluster via Ray client, the tqdm progress bar fails with an AttributeError in the SchedulerActor thread:

(SchedulerActor pid=180, ip=10.0.66.234) Exception in thread 0d287252-6ae5-445b-9b96-5e412af6ab5d:
(SchedulerActor pid=180, ip=10.0.66.234) Traceback (most recent call last):
(SchedulerActor pid=180, ip=10.0.66.234)   File "/home/ray/anaconda3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
(SchedulerActor pid=180, ip=10.0.66.234)     self.run()
(SchedulerActor pid=180, ip=10.0.66.234)   File "/home/ray/anaconda3/lib/python3.10/threading.py", line 953, in run
(SchedulerActor pid=180, ip=10.0.66.234)     self._target(*self._args, **self._kwargs)
(SchedulerActor pid=180, ip=10.0.66.234)   File "/home/ray/anaconda3/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 460, in _resume_span
(SchedulerActor pid=180, ip=10.0.66.234)     return method(self, *_args, **_kwargs)
(SchedulerActor pid=180, ip=10.0.66.234)   File "/Users/jaychia/code/venv-demo/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 460, in _resume_span
(SchedulerActor pid=180, ip=10.0.66.234)   File "/tmp/ray/session_2023-12-14_02-16-08_035207_7/runtime_resources/pip/aeb7de99a29ab6cec9bf133f8005a8e5d32df3a9/virtualenv/lib/python3.10/site-packages/daft/runners/ray_runner.py", line 578, in _run_plan
(SchedulerActor pid=180, ip=10.0.66.234)     pbar.mark_task_start(task)
(SchedulerActor pid=180, ip=10.0.66.234)   File "/tmp/ray/session_2023-12-14_02-16-08_035207_7/runtime_resources/pip/aeb7de99a29ab6cec9bf133f8005a8e5d32df3a9/virtualenv/lib/python3.10/site-packages/daft/runners/progress_bar.py", line 63, in mark_task_start
(SchedulerActor pid=180, ip=10.0.66.234)     pb.total += 1
(SchedulerActor pid=180, ip=10.0.66.234) AttributeError: 'tqdm' object has no attribute 'total'. Did you mean: '_total'?
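
The last frame shows `pb.total += 1` failing on a bar object that has `_total` but no `total`. As a rough illustration only (not Daft's actual fix), a defensive increment could look like the sketch below. `bump_total`, `PublicTotalBar`, and `PrivateTotalBar` are hypothetical names made up for this example; the `_total` attribute is taken from the hint in the error message, and whether writing to it refreshes the displayed bar on the cluster has not been verified.

```python
# Hypothetical sketch: none of these names come from Daft or Ray.

class PublicTotalBar:
    """Stand-in for a bar that exposes a public `total` (may start as None)."""
    def __init__(self):
        self.total = None


class PrivateTotalBar:
    """Stand-in for a bar that only has `_total`, as the error hint suggests."""
    def __init__(self):
        self._total = 0


def bump_total(pb, n=1):
    """Increase the bar's total by n, using whichever attribute exists."""
    if hasattr(pb, "total"):
        # Treat a missing (None) total as zero before incrementing.
        pb.total = (pb.total or 0) + n
    elif hasattr(pb, "_total"):
        # Assumption: the private attribute holds the same counter.
        pb._total = (pb._total or 0) + n
    else:
        raise AttributeError(
            f"{type(pb).__name__} has neither 'total' nor '_total'"
        )


if __name__ == "__main__":
    for bar in (PublicTotalBar(), PrivateTotalBar()):
        bump_total(bar)
        print(type(bar).__name__, vars(bar))
```

This only sidesteps the missing attribute; it does not explain why the bar class on the cluster differs from the one `progress_bar.py` expects.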

To reproduce, start a remote Ray cluster and run the following from the client:

import daft
import ray

RAY_ADDRESS = "ray://localhost:10001"
ray.init(runtime_env={"pip": ["getdaft==0.2.7"]}, address=RAY_ADDRESS)
daft.context.set_runner_ray(address=RAY_ADDRESS)

import boto3

session = boto3.session.Session()
creds = session.get_credentials()

daft.set_planning_config(default_io_config=daft.io.IOConfig(
    s3=daft.io.S3Config(
        key_id=creds.access_key,
        access_key=creds.secret_key,
        session_token=creds.token,
    )
))

df = daft.read_csv("s3://noaa-global-hourly-pds/2023/**")
df.show()

(Running ray==2.4.0 and getdaft==0.2.7 on both the client and the cluster)

jaychia • Dec 14 '23 10:12