terraform-provider-iterative

Bitbucket GPU demo

Open casperdcl opened this issue 1 year ago • 0 comments

Self-hosted Bitbucket Pipelines don't support GPU runners (https://github.com/iterative/cml/issues/1015). A workaround is to use a TPI task directly in BB Pipelines.

Proposal

Put together an example repo/tutorial:

BB Pipelines -> TPI task (exporting env vars) -> AWS ec2 -> DVC repro && CML report

  • "export env vars" means^1 environment = { "BITBUCKET_*" = "", CI = "" }
    • this would mean CI commands (e.g. cml send-comment) would "just work" out-of-the-box
  • BB equivalent of easy GH ssh debugging: https://registry.terraform.io/providers/iterative/iterative/latest/docs/guides/getting-started#debugging
  • and/or maybe uploading tfstate as an artifact?
  • probably fixing TPI#630
  • image = "nvidia"
  • tags = { Name = "Foo" }?
  • basic readme (explaining how to use)
  • post link on docs (TPI & CML?)
  • blog (devrel)
    • part 1: python train.py (using https://github.com/pytorch/examples/tree/main/mnist)
    • part 2: docker run
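
A rough sketch of what the `main.tf` in such an example repo might look like, assuming AWS credentials are exposed to the pipeline as repository variables. The resource name `bb_gpu_demo`, the `m+t4` machine size, the `workdir` value, and the `report.md` file are placeholders picked for illustration, not part of the proposal, and the script assumes `dvc` and `cml` are already installed on the instance:

```hcl
terraform {
  required_providers {
    iterative = { source = "iterative/iterative" }
  }
}

provider "iterative" {}

# Hypothetical task name; launched from a BB Pipelines step.
resource "iterative_task" "bb_gpu_demo" {
  cloud   = "aws"
  machine = "m+t4"   # assumption: any GPU machine size would do here
  image   = "nvidia" # GPU-enabled image, as listed in the proposal

  # Empty values are meant to be filled from the calling (BB Pipelines)
  # environment, so CI-aware commands see the same variables on the instance.
  environment = { "BITBUCKET_*" = "", CI = "" }

  storage {
    workdir = "." # assumption: upload the repo checkout as the working directory
  }

  script = <<-END
    #!/bin/bash
    # Assumption: dvc and cml are already available on the instance.
    dvc repro
    cml send-comment report.md # report.md is a placeholder produced by the pipeline
  END
}
```

The pipeline step itself would then presumably just run `terraform init` and `terraform apply`, followed by `terraform destroy` once the task has finished.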

Alternatives

  • deconstruct, patch & rebuild BB java runner
  • monitor BB java runner releases & repeat the above whenever there's a new release

casperdcl, Aug 02 '22 14:08