
Consider setting repo to use Linux binaries from PPM

Open slopp opened this issue 4 years ago • 2 comments

Note that the very first time you submit a job to CloudML the various packages required to run your script will be compiled from source. This will make the execution time of the job considerably longer than you might expect. It’s only the first job that incurs this overhead though (since the package installations are cached), and subsequent jobs will run more quickly.

We could significantly reduce the first job's run time, and avoid compilation errors, by using the public package manager (PPM) to provide binary packages instead of compiling from source, potentially as an opt-out option.
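As a rough sketch of what this could look like on the R side, the snippet below points the CRAN repo at a PPM Linux binary URL. The function name `use_ppm_binaries`, its `enabled` opt-out flag, and the `bionic` distro codename are all assumptions for illustration; none of them is an existing cloudml option, and the codename would have to match whatever image the training job actually runs on.

```r
# Hypothetical helper, not part of cloudml: switch the CRAN repo to PPM Linux binaries.
use_ppm_binaries <- function(enabled = TRUE, distro = "bionic") {
  # Opt-out: leave the user's existing repos untouched.
  if (!enabled) {
    return(invisible(getOption("repos")))
  }

  # Assumption: `distro` matches the OS of the training image (e.g. "bionic", "focal").
  ppm_url <- sprintf(
    "https://packagemanager.rstudio.com/all/__linux__/%s/latest",
    distro
  )

  # Package Manager uses the R version in the user agent to decide
  # whether it can serve a prebuilt binary.
  options(
    repos = c(CRAN = ppm_url),
    HTTPUserAgent = sprintf(
      "R/%s R (%s)",
      getRversion(),
      paste(getRversion(), R.version$platform, R.version$arch, R.version$os)
    )
  )

  invisible(getOption("repos"))
}

# Call before install.packages() so dependencies resolve to binaries where available.
use_ppm_binaries()
```

Packages without a binary for the given distro and R version would still fall back to a source install, so this shortens the first job rather than guaranteeing no compilation.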

slopp avatar Oct 07 '20 22:10 slopp

This sounds pretty great, honestly! Our hesitation here is that we need to reconsider how one trains torch jobs in the cloud. If the answer is cloudml, which I think it might be, then we should totally do this work.

javierluraschi avatar Oct 07 '20 23:10 javierluraschi

I'd add that currently, cloudml does not have a dependency on Python/reticulate, so this could be a great way to train models. However, it is also worth considering whether we could come up with a multi-cloud approach that supports more than just Google Cloud, maybe even RStudio Connect or the Job Launcher.

javierluraschi avatar Oct 07 '20 23:10 javierluraschi