dask-cuda
Configuration Profiles
We've had several issues come in related to defaults/cluster configurations (https://github.com/rapidsai/dask-cuda/issues/990 / https://github.com/rapidsai/dask-cuda/issues/348 / ...) and a general request to support "Configuration Profiles". As there are many ways to configure a Dask cluster (CLI, environment variables, YAML config, etc.), users would like a higher-level concept, i.e. a profile, which codifies a particular configuration. This might be a configuration related to performance, debugging, benchmarking, environment (cloud vs. HPC vs. local), etc.
Earlier today, it was pointed out to me that Dask does in fact check a couple of locations for YAML config files, and the location itself is also configurable via the DASK_CONFIG env var:
- https://docs.dask.org/en/stable/configuration.html?highlight=DASK_CONFIG#yaml-files
- https://github.com/dask/dask/blob/ebb9c22ca68b39e1757c45cd608cf6ee2db6af62/dask/config.py#L20-L38
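For reference, the search locations are exposed on the config module, so it is easy to check what a given process will pick up:
# Quick way to inspect where this process will look for YAML config files.
# If DASK_CONFIG is set, it becomes the primary directory (dask.config.PATH)
# and is added to the list of searched paths.
import dask.config

print(dask.config.PATH)   # primary config directory
print(dask.config.paths)  # all directories scanned for *.yaml files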
For example, we can create the following RAPIDS-TPCDS/distributed.yaml:
# distributed.yaml
distributed:
  logging:
    distributed: info
  version: 2
  rmm:
    pool-size: 30GB
  ucx:
    cuda_copy: true
    infiniband: true
    nvlink: true
    reuse-endpoints: true
    tcp: true
Then load it like the following:
$ DASK_CONFIG=~/RAPIDS-TPCDS/ python -c "import dask, distributed; print(dask.config.get('distributed.rmm'))"
{'pool-size': '30GB'}
This is quite useful and would unblock, I think, several folks. However, I don't think we are done yet.
1) The above example shows us that the UCX configuration in distributed.yaml is lacking an "all" switch to enable everything at once.
2) We are missing a number of options: device-memory-limit, jit-unspill.
3) We need help building some pre-configured configurations.
For problems 1) and 2), what would folks think about building a config system inside of dask-cuda? This would be similar to what dask-kubernetes and dask-sql do. If we had this, then we could also think about moving/reducing the configuration inside distributed.
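Sketching the idea (the file name and wiring below are made up, not an existing dask-cuda API): dask-cuda could ship a YAML of defaults and register it with dask.config at import time, the way dask-kubernetes and dask-sql do:
# Hypothetical dask_cuda/config.py sketch: register packaged defaults with dask.config.
# The file name "dask_cuda.yaml" and its contents are illustrative only.
import os

import yaml

import dask.config

fn = os.path.join(os.path.dirname(__file__), "dask_cuda.yaml")

with open(fn) as f:
    defaults = yaml.safe_load(f)

# Make the packaged values available as defaults (user config still wins) and
# drop a commented copy into the user's config directory for discoverability.
dask.config.update_defaults(defaults)
dask.config.ensure_file(source=fn)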
For 3) we could ask folks like @VibhuJawa, @ayushdg and @randerzander for their input in building their defaults as highlighted options
Another thing we could think about is surfacing the DASK_CONFIG env var as a CLI option for workers and schedulers:
dask-worker --dask-config-path /foo/bar
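Something like the following could work for the wiring (entirely hypothetical; --dask-config-path is not an existing flag):
# Hypothetical CLI sketch: merge extra YAML config from a user-supplied directory
# before the worker starts. The --dask-config-path option does not exist today.
import click

import dask.config

@click.command()
@click.option("--dask-config-path", default=None, help="Directory of extra Dask YAML config files.")
def main(dask_config_path):
    if dask_config_path is not None:
        # Collect only the YAML files in the given directory (env={} skips
        # re-reading DASK_* environment variables) and merge them on top of
        # the current configuration.
        extra = dask.config.collect(paths=[dask_config_path], env={})
        dask.config.update(dask.config.config, extra, priority="new")
    # ... start the worker here ...
    print(dask.config.get("distributed.rmm.pool-size", None))

if __name__ == "__main__":
    main()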
Thanks a lot for starting this discussion @quasiben. This set of yaml files for pre-baked profiles is something I'm in favor of.
I had a few questions around the usage of configuration files:
- Do the yaml-based configurations only get picked up when creating a cluster from a Python client (from dask_cuda import LocalCUDACluster; cluster = LocalCUDACluster()), or do they also apply to scheduler and worker processes started up from the command line? (which is how we set up most multi-node workflows today)
- Can the yaml file be extended to also set certain UCX environment variables (like UCX_TCP_RX_SEG_SIZE), or would we still have to export those separately on the scheduler, worker and client processes?
- Specifically for things like rmm pool-size it might make sense to allow passing a memory fraction instead of an absolute value. Curious on your thoughts here @VibhuJawa.
Generally I'm happy to help provide input for some of these default options, and I'm also curious to hear others' opinions on what high-level defaults make the most sense.
Specifically for things like rmm pool-size it might make sense to allow passing a memory fraction instead of an absolute value. Curious on your thoughts here @VibhuJawa.
Agree with this. It would make profiles less specific to a particular device type.
- Do the yaml-based configurations only get picked up when creating a cluster from a Python client (from dask_cuda import LocalCUDACluster; cluster = LocalCUDACluster()), or do they also apply to scheduler and worker processes started up from the command line? (which is how we set up most multi-node workflows today)
It seems that it would work for the scheduler as well, but maybe @quasiben can confirm this. This is nevertheless a good question; the client must also pick up the same configs.
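One quick way to check on a scheduler or worker host (DASK_CONFIG must be set when the process starts, since the config is collected at import time):
# Run on the scheduler/worker host with DASK_CONFIG pointing at the profile
# directory; whatever this process sees here is what it will actually use.
import dask.config

print(dask.config.get("distributed.rmm.pool-size", None))
print(dask.config.get("distributed.comm.ucx", {}))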
- Can the yaml file be extended to also set certain UCX environment variables (like UCX_TCP_RX_SEG_SIZE), or would we still have to export those separately on the scheduler, worker and client processes?
Theoretically we could pass this to Dask, but I'm generally -1 on that because we would need to create a new Dask config for every possible UCX configuration. I think we need a more robust way to address this rather than individually adding a new config for every UCX option, but I don't know if we can add in Dask some kind of config translation, e.g., distributed.comm.ucx.some-config -> UCX_SOME_CONFIG. If that's possible then we could support it without any major issues, I believe. Alternatively, maybe Dask could support specifying environment variables directly within the yaml file?
- Specifically for things like rmm pool-size it might make sense to allow passing a memory fraction instead of an absolute value. Curious on your thoughts here @VibhuJawa.
That's a good point, addressing this now in https://github.com/rapidsai/dask-cuda/pull/1021.
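For reference, a fractional pool size essentially means "this fraction of the device's total memory"; roughly the following idea (illustrative only, the real implementation is in that PR):
# Illustrative only: convert a fractional RMM pool size into bytes for one GPU.
# The actual logic lives in dask-cuda (see the PR linked above).
import pynvml

def pool_size_from_fraction(fraction, device_index=0):
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    total_bytes = pynvml.nvmlDeviceGetMemoryInfo(handle).total
    return int(fraction * total_bytes)

print(pool_size_from_fraction(0.95))  # ~95% of GPU 0's memory, in bytes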
Theoretically we could pass this to Dask, but I'm generally -1 on that because we would need to create a new Dask config for every possible UCX configuration. I think we need a more robust way to address this rather than individually adding a new config for every UCX option, but I don't know if we can add in Dask some kind of config translation, e.g., distributed.comm.ucx.some-config -> UCX_SOME_CONFIG. If that's possible then we could support it without any major issues, I believe. Alternatively, maybe Dask could support specifying environment variables directly within the yaml file?
We can, with something like this:
diff --git a/distributed/comm/ucx.py b/distributed/comm/ucx.py
index fc07d048..254918cc 100644
--- a/distributed/comm/ucx.py
+++ b/distributed/comm/ucx.py
@@ -126,7 +126,16 @@ def init_once():
     ucp = _ucp
 
-    ucp.init(options=ucx_config, env_takes_precedence=True)
+    def munge(key):
+        return "_".join(map(str.upper, key.split("-")))
+
+    environment = {
+        munge(k): v
+        for k, v in dask.config.get("distributed.comm.ucx.environment", {}).items()
+    }
+    # Specific ucx_config options should override general environment ones
+    environment.update(ucx_config)
+    ucp.init(options=environment, env_takes_precedence=True)
 
     pool_size_str = dask.config.get("distributed.rmm.pool-size")
diff --git a/distributed/distributed-schema.yaml b/distributed/distributed-schema.yaml
index 4e31840d..d8207e5a 100644
--- a/distributed/distributed-schema.yaml
+++ b/distributed/distributed-schema.yaml
@@ -961,7 +961,14 @@ properties:
                   additional variables for each transport, while ensuring optimal connectivity. When
                   ``True``, a CUDA context will be created on the first device listed in
                   ``CUDA_VISIBLE_DEVICES``.
-
+              environment:
+                type: [object, 'null']
+                description: |
+                  Mapping setting specified UCX environment variables.
+                  A name
+                  ``distributed.comm.ucx.environment.some-option=value``
+                  is equivalent to setting ``UCX_SOME_OPTION=value`` in
+                  the environment.
               tcp:
                 type: object
                 properties:
diff --git a/distributed/distributed.yaml b/distributed/distributed.yaml
index 81033703..821232b1 100644
--- a/distributed/distributed.yaml
+++ b/distributed/distributed.yaml
@@ -229,7 +229,8 @@ distributed:
       infiniband: null  # enable Infiniband
       rdmacm: null  # enable RDMACM
       create-cuda-context: null  # create CUDA context before UCX initialization
-
+      environment:
+        max_copy_reg: 1
     zstd:
       level: 3  # Compression level, between 1 and 22.
       threads: 0  # Threads to use. 0 for single-threaded, -1 to infer from cpu count.
Then I can do:
In [3]: dask.config.get("distributed.comm.ucx.environment")
Out[3]: {'max_copy_reg': 1}
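And to tie back to the UCX_TCP_RX_SEG_SIZE question above, with this patch a key like distributed.comm.ucx.environment.tcp-rx-seg-size would be munged to UCX_TCP_RX_SEG_SIZE when UCX is initialized; the config side looks like:
# Illustration of the config side only; the "environment" key only has an effect
# once the patch above is applied, at which point it becomes UCX_TCP_RX_SEG_SIZE.
import dask

with dask.config.set({"distributed.comm.ucx.environment.tcp-rx-seg-size": "8M"}):
    print(dask.config.get("distributed.comm.ucx.environment"))
    # {'tcp-rx-seg-size': '8M'}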
I'm 100% onboard with this idea @wence-, could you submit a PR to distributed? And thanks for one more neat solution, Lawrence-style!
For 3) we could ask folks like @VibhuJawa, @ayushdg and @randerzander for their input in building their defaults as highlighted options
More than happy to help on this. We can create a couple of these: one for pure ETL, one for RAPIDS+DL, one for RAPIDS+Graph, etc.
Specifically for things like rmm pool-size it might make sense to allow passing a memory fraction instead of an absolute value. Curious on your thoughts here @VibhuJawa.
I think we are all in agreement that allowing a memory fraction is a good idea for defaults.
For example, we can create the following RAPIDS-TPCDS/distributed.yaml
I think we can also look at creating one config in the cugraph repo to help customers and users get to the right configuration. CC: @rlratzel, @jnke2016
https://github.com/rapidsai/dask-cuda/pull/1021 is now in and should allow specifying rmm_pool_size=0.95 / --rmm-pool-size 0.95, for example. Please let me know if you find any issues with it.
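For example, from Python (0.95 here is just an illustrative value):
# Usage sketch: reserve roughly 95% of each GPU's memory for the RMM pool,
# using the fraction support from the PR above.
from dask_cuda import LocalCUDACluster
from distributed import Client

if __name__ == "__main__":
    cluster = LocalCUDACluster(rmm_pool_size=0.95)
    client = Client(cluster)
    print(client)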
Generic UCX environment setting via dask config is dask/distributed#7164