automlbenchmark
Reuse AWS instances
This is a suggestion I found in the TODO:
We can reuse an AWS instance instead of shutting them down after each job. At least during a single benchmark, we could limit #instances = #parallel jobs.
I am not sure how this would actually work, but in principle I see a potential speed-up:
- only have to create the instance once
- only have to download a dataset once (if subsequent jobs use the same dataset)
- only have to install the automl framework once (if subsequent jobs use the same framework)
I also see some pitfalls:
- When re-using an instance for multiple frameworks, the second framework might be affected by leftovers of the installation of the first.
- To provide the same disk space, does this mean we have to make sure to transfer/delete all (cache) files of the previous run?
- Does this require additional communication? The most naive approach I can imagine is to run all folds of a (framework, task) tuple on a single instance, which should not require additional communication (though it does need some extra clean-up, see above). This wouldn't allow perfect reuse, but it's probably pretty good.
- (for the future:) Is there an increased risk of interruption when using spot instances, or is the impact of an interruption greater? I assume the answer to both is no: afaik interruption is based only on the bid price, and if results are transferred after each part of the job (e.g. each fold), no more than one part is lost. But I am not sure.
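To make the naive scheme concrete, here is a minimal sketch of grouping jobs so that all folds of one (framework, task) tuple land on the same instance. The names `Job` and `group_jobs_by_instance` are hypothetical and not taken from the automlbenchmark code base; the sketch only covers the grouping, not launching instances or cleaning up between folds.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Job(NamedTuple):
    """Hypothetical job description: one fold of one task for one framework."""
    framework: str
    task: str
    fold: int


def group_jobs_by_instance(jobs: List[Job]) -> Dict[Tuple[str, str], List[Job]]:
    """Group jobs so that every (framework, task) pair maps to one instance.

    Each group would run sequentially on a single instance, so the framework
    is installed once and the dataset is downloaded once per group.
    """
    groups: Dict[Tuple[str, str], List[Job]] = defaultdict(list)
    for job in jobs:
        groups[(job.framework, job.task)].append(job)
    # Run folds in a deterministic order within each group.
    return {key: sorted(folds, key=lambda j: j.fold) for key, folds in groups.items()}


# Example: 2 frameworks x 2 tasks x 3 folds.
jobs = [
    Job(framework, task, fold)
    for framework in ("frameworkA", "frameworkB")
    for task in ("task1", "task2")
    for fold in range(3)
]
batches = group_jobs_by_instance(jobs)
print(f"{len(jobs)} jobs would need {len(batches)} instances instead of {len(jobs)}")
```

Because each group only ever contains one framework, this grouping sidesteps the leftover-installation pitfall, at the cost of not reusing an instance across frameworks or tasks.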
@PGijsbers yes, this is an old suggestion, and I agree with the pitfalls you list, so I would not consider this a priority. It is good to keep for reference, though.