workflow restart logic not needed for non-spot

Open casperdcl opened this issue 3 years ago • 6 comments

casperdcl avatar Aug 23 '22 17:08 casperdcl

...unless your workflow runs take more than 35 days[^1] to finish?

[^1]: On GitHub, as per https://github.com/iterative/cml/pull/1067.

0x2b3bfa0 avatar Sep 07 '22 11:09 0x2b3bfa0

I think we should strongly consider dropping that.

Reasons I would use cml runner:

  • Short-lived instance life-cycle management, because I don't want to pay for that cloud GPU any longer than is required.
  • Super simple setup of the CI agent on a system.

I think that something that runs longer than 35 days doesn't really fall into either of the above.

/opinions?

dacbd avatar Sep 09 '22 00:09 dacbd

I wholeheartedly agree: the 35-day limit is enough for all the considered use cases unless proven otherwise, and the maintenance overhead of this feature probably outweighs the dubious edge cases where it becomes useful.

0x2b3bfa0 avatar Sep 09 '22 02:09 0x2b3bfa0

>35 days: use LEO

casperdcl avatar Sep 09 '22 20:09 casperdcl

> short-lived instance life-cycle management because I don't want to pay for that cloud GPU any longer than is required.

Fine-tuning Stable Diffusion, GPT-2, or GPT-3 alternatives might take more than 35 days. Just mentioning 😁

DavidGOrtega avatar Oct 25 '22 14:10 DavidGOrtega

>35 days: use LEO

Or for anything, use LEO

DavidGOrtega avatar Oct 25 '22 14:10 DavidGOrtega