terraform-provider-iterative
`task` CML 🍬 Tear down by grepping
From user feedback:
To tear down the created resources, one relies on a brittle condition based on grepping some text in the logs.
It's far from ideal, especially with status not fully working or not working as expected (#388).
This probably belongs to a separate epic/milestone titled “using task from CI/CD systems”; not saying it's not equally important, though.
While it's still far from ideal, have you tried using terraform console or terraform show and jq instead?
Example
resource "iterative_task" "example" {
...
}
terraform console
terraform console <<< 'iterative_task.example.status["succeeded"]'
terraform show --json and jq
terraform show --json | jq --exit-status '
.values.root_module.resources[] |
select(.address == "iterative_task.example") |
.values.status.succeeded
'
I also tried jq, but it is not very friendly. However, terraform console looks very interesting.
Both console and jq would solve the "brittle" problem. Should this be solved with some docs?
@iterative/cml it's definitely worth using console, and we could just convert this into a doc issue.
Tips:
- note the shell: bash due to the redirection
- the resource "iterative_task" "train" resource name is the one used in the check: if terraform console <<< 'iterative_task.train.status["succeeded"]'; then
Here is a proven workflow. In my workflow, the output models folder and report.md are created by train.py, using cml to get something like live metrics.
name: train-my-model
on: [push]
jobs:
  train-tpi:
    runs-on: [ubuntu-latest]
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - uses: iterative/setup-cml@v1
      - name: tpi
        env:
          REPO_TOKEN: ${{ secrets.REPO_TOKEN }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          AZURE_CLIENT_SECRET: ${{ secrets.AZURE_CLIENT_SECRET }}
          AZURE_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          AZURE_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
        shell: bash
        run: |
          cat <<EOF > main.tf
          terraform {
            required_providers {
              iterative = {
                source = "iterative/iterative",
              }
            }
          }
          provider "iterative" {}
          resource "iterative_task" "train" {
            cloud = "az"
            machine = "Standard_D2S_v3"
            region = "us-west"
            spot = 0
            workdir {
              input = "."
              output = "."
            }
            environment = {
              EPOCHS = 1
            }
            script = <<-END
              #!/bin/bash
              sudo apt update
              sudo apt-get install -y software-properties-common build-essential python3-pip
              pip3 install -r requirements.txt
              python3 train.py
            END
          }
          EOF
          terraform init
          terraform apply --auto-approve
          if terraform console <<< 'iterative_task.train.status["succeeded"]'; then
            echo 'Destroying...'
            terraform destroy --auto-approve
            cml send-github-check --token=$GITHUB_TOKEN --conclusion=success --title='CML report' report.md
            cml pr --md output/* >> report.md
            cml send-comment --update report.md
            cml send-comment --update --pr --commit-sha HEAD report.md
          else
            echo 'Creating report...'
            echo 'In progress...' > report.md
            cml send-github-check --token=$GITHUB_TOKEN --conclusion=neutral --title='CML report' report.md
          fi
note the shell: bash due to the redirection
GNU Bash version
if terraform console <<< 'iterative_task.example.status["succeeded"]'; then
...
fi
POSIX compliant version ™
if echo 'iterative_task.example.status["succeeded"]' | terraform console; then
...
fi
Here you have a tentative solution to the potential XY problem in https://github.com/iterative/terraform-provider-iterative/issues/357.
It would be nice to have an easier way of checking if a task has either failed or succeeded, but I'm afraid it would either involve exposing a redundant attribute (e.g. completed: boolean) or finding another way of invoking task from CI/CD systems.
Succeeded + failed check
if terraform console <<< 'try(iterative_task.example.status["succeeded"], 0) + try(iterative_task.example.status["failed"], 0)' | grep --quiet --invert-match 0; then
echo the task has either succeeded or failed, destroying the resources...
fi
Loop–based alternative (8 bytes less)
if terraform console <<< 'sum([for status in ["succeeded", "failed"] : try(iterative_task.example.status[status], 0)])' | grep --quiet --invert-match 0; then
echo the task has either succeeded or failed, destroying the resources...
fi
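One caveat with the two checks above: grep --invert-match 0 rejects any line containing the digit 0, so a combined count such as 10 would be treated as unfinished. Below is a minimal sketch of a numeric comparison instead, assuming terraform console prints the bare number on a single line for this expression. The function names is_positive_count and task_finished are hypothetical and are not part of Terraform or the provider.

```shell
# Hypothetical helper: succeed only if $1 is a whole number greater than zero.
is_positive_count() {
  case $1 in
    '' | *[!0-9]*) return 1 ;; # empty or non-numeric output
  esac
  [ "$1" -gt 0 ]
}

# Hypothetical wrapper around the check from the thread (POSIX pipe form).
# Assumption: terraform console prints just the number for this expression.
task_finished() {
  count=$(echo 'try(iterative_task.example.status["succeeded"], 0) + try(iterative_task.example.status["failed"], 0)' | terraform console) || return 1
  is_positive_count "$count"
}

# Example (not executed here):
#   if task_finished; then
#     echo the task has either succeeded or failed, destroying the resources...
#   fi
```

Keeping the numeric test in its own function makes it possible to exercise it without a Terraform state at hand.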
I don't quite grok `if succeeded; then destroy; report pass; else report running; fi`
Surely it should be `report running; while ! succeeded; do sleep; done; destroy; report pass`?
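The ordering suggested above (report running, wait, destroy, report pass) can be sketched as a polling loop. This is only an illustration: wait_until and task_succeeded are hypothetical names, the attempt count and delay in the example are arbitrary, and it assumes the state has to be refreshed (here with terraform refresh) before each check so that terraform console sees the latest status.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local max_attempts delay attempt
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Hypothetical check built from the thread's condition.
# Assumption: refreshing first is needed to pick up the current task status.
task_succeeded() {
  terraform refresh > /dev/null 2>&1 || return 1
  echo 'iterative_task.example.status["succeeded"]' | terraform console > /dev/null 2>&1
}

# Example (not executed here):
#   echo 'In progress...' > report.md
#   if wait_until 60 30 task_succeeded; then
#     terraform destroy --auto-approve
#   fi
```

The loop gives up after a fixed number of attempts, so a CI job using it would fail rather than wait forever if the task never reports success.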
Are you referring to https://github.com/iterative/terraform-provider-iterative/issues/389#issuecomment-1047102568? 🤔 If so, you're probably missing a detail: task was modified (https://github.com/iterative/terraform-provider-iterative/pull/339) to restart workflows when machines shut down. Intuitive as it gets.
I just realised 🤢
🦄 🤢
No longer a valid use case; use leo read --status or propose new output formats for the standalone command-line tool.