Zyda2 tutorial - TypeError when initializing Dask CPU cluster
Describe the bug
In the Zyda2 tutorial, several scripts, such as process_dclm.py, start a Dask LocalCluster. These scripts read the worker count from an environment variable with
CPU_WORKERS = os.environ.get("CPU_WORKERS") and pass it directly to the cluster constructor: cluster = LocalCluster(n_workers=CPU_WORKERS, processes=True, memory_limit="48GB"). Because os.environ.get returns a string (or None when the variable is unset), a TypeError is raised: n_workers is expected to be an integer.
Steps/Code to reproduce bug
- Follow the steps in the tutorial
- Run python3 0_processing/process_dclm.py
- The script fails with the following error:
Traceback (most recent call last):
  File "...NeMo-Curator/tutorials/zyda2-tutorial/0_processing/process_dclm.py", line 21, in <module>
    cluster = LocalCluster(n_workers=CPU_WORKERS, processes=True, memory_limit="48GB")
  File "/usr/local/lib/python3.10/dist-packages/distributed/deploy/local.py", line 211, in __init__
    threads_per_worker = max(1, int(math.ceil(CPU_COUNT / n_workers)))
TypeError: unsupported operand type(s) for /: 'int' and 'str'
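The failure can be reproduced without Dask. The sketch below mirrors the division from the traceback using only the standard library; cpu_count is a stand-in for distributed's CPU_COUNT, and the value "4" is an arbitrary assumption for illustration.

```python
import math
import os

# Environment variables are always strings, even when they look numeric.
os.environ["CPU_WORKERS"] = "4"  # assumed value, for illustration only
cpu_workers = os.environ.get("CPU_WORKERS")

# Stand-in for distributed's CPU_COUNT (assumption for this sketch).
cpu_count = os.cpu_count() or 1

error_message = None
try:
    # Same expression as in the traceback: int divided by str.
    threads_per_worker = max(1, int(math.ceil(cpu_count / cpu_workers)))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

This suggests the problem is independent of the Slurm or container setup: any string value for CPU_WORKERS triggers it.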
Expected behavior
The Dask cluster is created, the data is processed, and the script completes successfully.
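One possible fix, sketched under assumptions, is to convert the environment variable to an integer before passing it to LocalCluster. The helper names read_worker_count and start_cluster are hypothetical and not part of NeMo-Curator or Dask, and falling back to the machine's CPU count when the variable is unset is a design choice made here, not the tutorial's documented behavior.

```python
import os


def read_worker_count(var_name="CPU_WORKERS", default=None):
    """Return the worker count from the environment as a positive int.

    Hypothetical helper: falls back to `default`, or to the CPU count
    when no default is given, if the variable is unset or empty.
    """
    raw = os.environ.get(var_name)
    if raw is None or raw.strip() == "":
        return default if default is not None else (os.cpu_count() or 1)
    try:
        count = int(raw)
    except ValueError:
        raise ValueError(f"{var_name} must be an integer, got {raw!r}") from None
    if count < 1:
        raise ValueError(f"{var_name} must be at least 1, got {count}")
    return count


def start_cluster():
    """Create the cluster as in the tutorial script, with an int n_workers."""
    # Imported lazily so the helper above can be used without Dask installed.
    from dask.distributed import LocalCluster

    return LocalCluster(
        n_workers=read_worker_count(),
        processes=True,
        memory_limit="48GB",
    )
```

A smaller change would be to wrap the existing lookup in int(...) directly in each script, at the cost of a less helpful error when the variable is missing or malformed.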
Environment overview
- Environment location: Slurm
- Method of NeMo-Curator install: Docker container, dev image from nvcr.io/nvidia/nemo:dev
@ronjer30 Could you share the latest on this?
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.