softlearning

tune.sample_from parallelized

Open kapsl opened this issue 5 years ago • 1 comment

Hi, is there a possibility to run the different trials in parallel (locally) or in the cloud?

kapsl avatar Mar 17 '20 10:03 kapsl

Hey @kapsl,

Yeah, the trials should automatically be parallelized locally if you run softlearning run_example_local and set the resources correctly. By "correctly" I mean that the per-trial allocation determines how many trials run at once: for example, if your computer has 16 CPUs and you allocate 4 CPUs per trial, then 4 trials (= 16 CPUs / (4 CPUs / trial)) can run in parallel. You set this with --trial-cpus=4 in your softlearning command (e.g. softlearning run_example_local ... --trial-cpus=4). If you have GPUs available, you can set those similarly with --trial-gpu=... (GPUs support fractional resources).
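To make the arithmetic concrete, here is a rough sketch of how the number of concurrent trials follows from the machine's resources and the per-trial allocation. The function name max_parallel_trials and its parameters are made up for illustration and are not part of softlearning or Ray; the actual scheduler may also account for other constraints (memory, custom resources), so treat this as an estimate only.

```python
from fractions import Fraction


def max_parallel_trials(total_cpus, cpus_per_trial, total_gpus=0, gpus_per_trial=0):
    """Estimate how many trials fit on one machine at the same time.

    Hypothetical helper for illustration; not a softlearning or Ray API.
    Fractions are used so that fractional GPU shares such as 0.1 divide exactly.
    """
    if cpus_per_trial <= 0:
        raise ValueError("cpus_per_trial must be positive")

    # Number of trials the CPUs alone can host.
    limit = Fraction(str(total_cpus)) // Fraction(str(cpus_per_trial))

    # GPUs only constrain the count if each trial requests some GPU share.
    if gpus_per_trial > 0:
        gpu_limit = Fraction(str(total_gpus)) // Fraction(str(gpus_per_trial))
        limit = min(limit, gpu_limit)

    return int(limit)


if __name__ == "__main__":
    # The example from above: 16 CPUs, 4 CPUs per trial.
    print(max_parallel_trials(16, 4))
    # Same machine with one GPU shared at half a GPU per trial.
    print(max_parallel_trials(16, 4, total_gpus=1, gpus_per_trial=0.5))
```

In the second call the GPU is the bottleneck: the CPUs would allow 4 trials, but one GPU split into halves only allows 2.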

For running things in the cloud, you can do something very similar with softlearning run_example_{ec2,gce} .... However, this requires a bit more manual setup: you need to configure the ray autoscaler for the cluster (e.g. the ray-autoscaler-gce.yaml) and create a VM image with all the dependencies to be used on the cloud. If at some point you want to try this option, I'm happy to write clearer step-by-step instructions for it.

An action item for myself would be to document these features a bit better.

hartikainen avatar Mar 17 '20 11:03 hartikainen