tests: GCP

Open casperdcl opened this issue 4 years ago • 6 comments

Due to https://github.com/iterative/terraform-provider-iterative/pull/156, GCP should be supported.

  • [x] Fix bugs (#678?)
  • [ ] #921
  • [x] Update README (#717)
  • [x] Update https://cml.dev/doc (https://github.com/iterative/cml.dev/pull/105)

casperdcl avatar Jul 28 '21 20:07 casperdcl

start of "docs": https://github.com/iterative/terraform-provider-iterative/pull/156 and https://github.com/iterative/terraform-provider-iterative/pull/166

casperdcl avatar Aug 17 '21 17:08 casperdcl

Stub, from Notion

Prerequisites

  1. Create a new Google Cloud project (official documentation)
  2. Create a new service account for the newly created project (official documentation)
  3. Create a new service account key for the newly created service account (official documentation)
  4. Store the contents of the downloaded JSON key as a GitHub repository secret named GOOGLE_APPLICATION_CREDENTIALS_DATA

My failed blog post has some extra guidance for GitHub and GitLab and best practices for secret handling in CI/CD environments:

GitLab

  1. Add these two masked variables to your project:

    🔒 You can also store these values as external secrets instead of variables if your server is configured to support this feature

GitHub

  1. Add these two secrets to your repository:
    • REPO_TOKEN with a Personal Access Token with enough permissions for registering the self-hosted runner and publishing a comment with the results
    • ···

Usage

on: workflow_dispatch
jobs:
  # Launches the cloud runner from a GitHub-hosted machine
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: iterative/cml@v1
      - run: >-
          cml-runner
          --cloud=gcp
          --cloud-region=us-west1-b
          --cloud-type=custom-8-65536-ext
          --cloud-gpu=v100
        env:
          REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
          GOOGLE_APPLICATION_CREDENTIALS_DATA: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_DATA }}
  # Runs on the newly registered self-hosted runner; job keys must be unique
  test:
    needs: deploy
    runs-on: self-hosted
    steps:
      - run: nvidia-smi

The --cloud-region option takes what Google Cloud calls a zone (for example, us-west1-b), not a region as with other cloud vendors.
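To illustrate the zone versus region distinction: Google Cloud zone names are a region name followed by a single-letter suffix, so the zone us-west1-b belongs to the region us-west1. A minimal Python sketch, where `region_of` is a hypothetical helper and not part of CML:

```python
def region_of(zone: str) -> str:
    """Drop the trailing zone letter: 'us-west1-b' -> 'us-west1'."""
    region, sep, suffix = zone.rpartition("-")
    # A zone must end in "-<letter>"; a bare region such as "us-west1" is rejected.
    if not sep or not region or len(suffix) != 1 or not suffix.isalpha():
        raise ValueError(f"not a Google Cloud zone name: {zone!r}")
    return region


print(region_of("us-west1-b"))  # us-west1
```

Passing a bare region name raises `ValueError`, which mirrors the point above: the option expects the full zone.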

You can check this list to determine which zones provide GPU accelerators and which models are available. Not every zone has availability for every GPU model.

Custom machine types can be specified as custom-{cores}-{memory}, where {cores} is the number of CPU cores and {memory} is the amount of RAM in megabytes; appending the -ext suffix also enables extended memory.
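As a rough illustration of that naming scheme, the following Python sketch builds and parses custom machine type names. The helpers `build_custom_type` and `parse_custom_type` are hypothetical names, not part of CML, and the sketch only checks the format; it does not validate Google Cloud's limits on cores or memory.

```python
import re

# Naming scheme from the text above: custom-{cores}-{memory}[-ext]
_CUSTOM_TYPE = re.compile(r"custom-(\d+)-(\d+)(-ext)?")


def build_custom_type(cores: int, memory_mb: int, extended: bool = False) -> str:
    """Format a custom machine type name; memory is given in megabytes."""
    if cores < 1 or memory_mb < 1:
        raise ValueError("cores and memory_mb must be positive")
    suffix = "-ext" if extended else ""
    return f"custom-{cores}-{memory_mb}{suffix}"


def parse_custom_type(name: str) -> dict:
    """Split a custom machine type name back into its parts."""
    match = _CUSTOM_TYPE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a custom machine type: {name!r}")
    return {
        "cores": int(match.group(1)),
        "memory_mb": int(match.group(2)),
        "extended": match.group(3) is not None,
    }


# 8 cores with 64 * 1024 MB of extended memory, as in the workflow above
print(build_custom_type(8, 64 * 1024, extended=True))  # custom-8-65536-ext
```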

GPU accelerators are only available on N1 and A2 machines. Requesting accelerators on any other machine type produces an error.
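A pre-flight check for this restriction could look like the Python sketch below. It is an illustration, not CML's behaviour: the helper names are hypothetical, the set of GPU-capable families is taken from the note above, and the sketch assumes native Google Cloud type names, where an unprefixed custom-... type belongs to the N1 family.

```python
# Families that accept accelerators, per the note above.
GPU_CAPABLE_FAMILIES = frozenset({"n1", "a2"})


def machine_family(machine_type: str) -> str:
    """Return the family prefix of a native Google Cloud machine type name."""
    prefix = machine_type.split("-", 1)[0].lower()
    # Assumption: unprefixed custom types (custom-{cores}-{memory}) are N1.
    return "n1" if prefix == "custom" else prefix


def check_gpu_request(machine_type: str, gpu: str = "") -> None:
    """Raise early if a GPU is requested on a family that cannot attach one."""
    if gpu and machine_family(machine_type) not in GPU_CAPABLE_FAMILIES:
        raise ValueError(
            f"{machine_type!r} cannot use GPU {gpu!r}: only N1 and A2 machines accept accelerators"
        )


check_gpu_request("custom-8-65536-ext", gpu="v100")  # passes silently
```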

Authentication

You can set either the GOOGLE_APPLICATION_CREDENTIALS_DATA environment variable to the contents of a service account JSON file, or the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of that file.

The former is more convenient for CI/CD scenarios, where secrets are (usually) provisioned through environment variables instead of files.
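The two options can be sketched in Python as follows. This is only an illustration of the lookup, not the provider's actual implementation; `load_service_account` is a hypothetical helper, and giving GOOGLE_APPLICATION_CREDENTIALS_DATA precedence when both variables are set is an assumption.

```python
import json
import os


def load_service_account(env=None) -> dict:
    """Return the parsed service account key described by the environment."""
    env = os.environ if env is None else env

    # Option 1: the variable holds the JSON document itself (handy in CI/CD).
    data = env.get("GOOGLE_APPLICATION_CREDENTIALS_DATA")
    if data:
        return json.loads(data)

    # Option 2: the variable holds the path of the JSON key file.
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)

    raise RuntimeError(
        "set GOOGLE_APPLICATION_CREDENTIALS_DATA or GOOGLE_APPLICATION_CREDENTIALS"
    )


# Usage with a placeholder key; a real key contains more fields than this.
fake_env = {"GOOGLE_APPLICATION_CREDENTIALS_DATA": '{"type": "service_account"}'}
print(load_service_account(fake_env)["type"])  # service_account
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try without touching real credentials.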

0x2b3bfa0 avatar Aug 17 '21 20:08 0x2b3bfa0

/tests

casperdcl avatar Oct 01 '21 10:10 casperdcl

/tests

Hwat? [sic]

0x2b3bfa0 avatar Oct 06 '21 13:10 0x2b3bfa0

DavidGOrtega avatar Oct 06 '21 15:10 DavidGOrtega

https://github.com/iterative/cml/issues/680#issuecomment-900616914 contains some valuable bits and pieces we don't have anywhere else. 🤔 Should they be promoted to cml.dev/doc?

0x2b3bfa0 avatar Mar 27 '22 03:03 0x2b3bfa0