Support additional metadata in Taskfile YAML schema

Open timrulebosch opened this issue 1 year ago • 6 comments

We are using Taskfiles as input for an automation system (a VSCode extension). For that to operate, we need to know additional domain-specific information about the tasks in the Taskfile. Currently, the Taskfile schema prevents adding such metadata.

The suggestion is to add a metadata object to the schema, both for the Taskfile (global) and for individual tasks (per task). Something like this:

---
version: '3'

vars:
  foo: bar

metadata:
  annotations:
    generator:
      repo: https://github.com/go-foo/foo
      version: v1.2.3
  labels:
    internal-only: true

tasks:
  my-task:
    desc: This task is generated by foo for project fubar.
    metadata:
      annotations:
        generator: foo
      labels:
        project: fubar
    cmds:
      - echo "generated by foo"

If there is interest we can provide a PR as the basis for implementation.

timrulebosch avatar Nov 11 '24 11:11 timrulebosch

We decided to augment our Taskfile.yml with an additional file "Metadata.yml" (out of necessity, not because we wanted to). With that we can describe the tasks in a way that integrates with the VSCode plugin.

I think that describing this metadata within the Taskfile schema would be more practical, and would also allow some level of automated documentation generation.

As an example, here is a fragment of the metadata, and the associated Taskfile:

metadata:
  models:
    fmimcl:
      name: fmimcl
      displayName: dse.fmi.mcl
      path: dse/fmimcl
      workflows:
        - generate-fmimcl
        - patch-signalgroup
  tasks:
    generate-fmimcl:
      vars:
        FMU_DIR:
          required: true
          hint: URI identifying an FMU (select from 'uses' or manually enter a path).
        OUT_DIR:
          required: true
          hint: Directory where the model should be created (sim relative).
          default: out/model
        MCL_PATH:
          required: true
          hint: Path where the FMI MCL shared library is located.
        FMU_MODELDESC:
          required: false
          hint: Path where the FMI modelDescription.xml should be created.
          default: '{{.FMU_DIR}}/modelDescription.xml'
        SIGNAL_GROUP:
          required: false
          hint: Path where the associated Signal Group should be created.
          default: '{{.OUT_DIR}}/signalgroup.yaml'

# Taskfile.yml
tasks:
  generate-fmimcl:
    desc: Generate an FMI MCL from an existing FMU.
    run: always
    dir: '{{.USER_WORKING_DIR}}'
    label: dse:fmi:generate-fmimcl
    vars:
      FMU_DIR: '{{.FMU_DIR | default "FMU_DIR_NOT_SPECIFIED"}}'
      OUT_DIR: '{{.OUT_DIR | default "out/model"}}'
      MCL_PATH: '{{.MCL_PATH | default "MCL_PATH_NOT_SPECIFIED"}}'
      FMU_MODELDESC: '{{.FMU_DIR}}/modelDescription.xml'
      SIGNAL_GROUP: "{{if .SIGNAL_GROUP}}'{{.SIGNAL_GROUP}}'{{else}}'{{.OUT_DIR}}/signalgroup.yaml'{{end}}"
    cmds:
      - task: gen-mcl
        vars:
          FMU_DIR: '{{.FMU_DIR}}'
          OUT_DIR: '{{.OUT_DIR}}'
          MCL_PATH: '{{.MCL_PATH}}'
      - task: gen-signalgroup
        vars:
          INPUT: '{{.FMU_MODELDESC}}'
          OUTPUT: '{{.SIGNAL_GROUP}}'
    requires:
      vars: [FMU_DIR, MCL_PATH]
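
As a rough illustration of the automated documentation idea, here is a minimal Python sketch. It assumes the metadata fragment has already been loaded into a plain dict by some YAML parser (only three of the variables are reproduced), and the helper name describe_task is hypothetical, not part of Task or of the VSCode plugin.

```python
# Part of the metadata fragment above, as a YAML parser might load it.
METADATA = {
    "tasks": {
        "generate-fmimcl": {
            "vars": {
                "FMU_DIR": {
                    "required": True,
                    "hint": "URI identifying an FMU (select from 'uses' or manually enter a path).",
                },
                "OUT_DIR": {
                    "required": True,
                    "hint": "Directory where the model should be created (sim relative).",
                    "default": "out/model",
                },
                "FMU_MODELDESC": {
                    "required": False,
                    "hint": "Path where the FMI modelDescription.xml should be created.",
                    "default": "{{.FMU_DIR}}/modelDescription.xml",
                },
            }
        }
    }
}


def describe_task(metadata: dict, task_name: str) -> list[str]:
    """Return one documentation line per variable of the given task (hypothetical helper)."""
    lines = []
    for name, spec in metadata["tasks"][task_name]["vars"].items():
        kind = "required" if spec.get("required") else "optional"
        line = f"{name} ({kind}): {spec.get('hint', '')}"
        if "default" in spec:
            line += f" [default: {spec['default']}]"
        lines.append(line)
    return lines


if __name__ == "__main__":
    print("\n".join(describe_task(METADATA, "generate-fmimcl")))
```

The same lines could feed a UI prompt or a generated README; the point is only that the information is structured rather than buried in desc text.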

timrulebosch avatar Nov 14 '24 07:11 timrulebosch

Hi @timrulebosch,

I understand you have a valid use case, but it seems to be kinda rare to need this.

I'll close this for now, but if more people find this interesting feel free to upvote or comment.

andreynering avatar Dec 30 '24 18:12 andreynering

@andreynering sure, but this method is a fairly common way of making a schema extensible for various things, so you might want to consider if it could be helpful generally.

For example, things like Feature Flags need a mechanism for "out of band" configuration: use an annotation, or a blob of metadata to hold a more complex object. The separation of concerns achieves the goal of additional configuration, keeps the schema stable, and is a more contemporary approach.

For Task, where the YAML has a certain style, you don't need the complexity I presented. A simple annotation and metadata node at the root of the file, and perhaps on each task, would help a lot of people do more with Task.

---
version: '3'

annotations:
  TASK_X_REMOTE_TASKFILES: true

metadata:
  doc:
    title: DSE SDP
    desc: Simulation Development Platform

vars:
  PLATFORM_ARCH: linux-amd64

tasks:
  build:
    annotations:
      internal: true    # don't document or expose via VS Code integration.
    cmds:
      - mkdir -p '{{.SIMDIR}}/data'
      - cp {{.PWD}}/simulation.yaml '{{.SIMDIR}}/data/simulation.yaml'
    sources:
      - '{{.PWD}}/simulation.yaml'
    generates:
      - '{{.SIMDIR}}/data/simulation.yaml'
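
A consumer of such annotations (a doc generator or editor integration) would only need a small filter. The following is a minimal Python sketch under assumptions: the tasks are already parsed into a dict, the "run" task is invented for contrast, and public_tasks is a hypothetical helper name.

```python
# Task definitions as a YAML parser might load them; only the annotations matter here.
TASKS = {
    "build": {"annotations": {"internal": True}, "cmds": ["mkdir -p data"]},
    "run": {"cmds": ["echo run"]},  # hypothetical task without annotations
}


def public_tasks(tasks: dict) -> list[str]:
    """Names of tasks that are not annotated with internal: true."""
    return [
        name
        for name, task in tasks.items()
        if not task.get("annotations", {}).get("internal", False)
    ]


if __name__ == "__main__":
    print(public_tasks(TASKS))
```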

I can go through the issues (actually, I have) and find many instances of this kind of problem. Just now, #1695: basically the same problem ... integration with a doc system. Or all the issues related to experimental features, not my issue with that(!), but all the others which have also been raised ... when all that is needed is a simple annotation or environment variable.

If you put yourself in my position, integrating Task into an operational system (operational as in: it can be operated by others without domain experience in Task) is more difficult than it should be, and it can be solved without too much effort from the Task project.

It's not a case of asking the Task project to do these things, but would it be possible to enable them?

trulede avatar Jan 04 '25 09:01 trulede

Turns out you can add additional metadata to the task schema in most cases (not all), and it just works. There are some limits: Task only reads the first YAML document of a multi-document YAML file, so keep the task schema document first.

---
version: '3'

vars:
  foo: '{{.FOO}}'

annotations:
  foo: bar

tasks:
  default:
    annotations:
      foo: bar
    cmds:
      - echo "FOO x y "
---
foo: bar

That solves one aspect of the workaround (needing a second configuration file alongside the Taskfile) and may allow some structural elements to be annotated. I think there will still be a problem with documenting variables; this is a problem of the Task schema itself (which could be solved without breaking compatibility).

Variables need documentation, because it's extremely helpful to provide examples of how specialised parameters should be formatted. And with all the imports, documenting imported variables is difficult (one thing documents the other thing, if the other thing changes, now the documentation for the one thing is broken ... add versioning of taskfiles/repos ... impossible to resolve).
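
For the multi-document workaround above, a companion tool has to pick out the documents that Task ignores. Here is a deliberately naive Python sketch: it splits on lines that consist solely of '---' and would be confused by such a line inside a block scalar, so a real tool should use a proper YAML parser (for example PyYAML's yaml.safe_load_all). The function name split_documents is hypothetical.

```python
# A shortened version of the multi-document Taskfile shown above.
TASKFILE = """\
---
version: '3'

tasks:
  default:
    cmds:
      - echo "FOO x y "
---
foo: bar
"""


def split_documents(text: str) -> list[str]:
    """Naively split a YAML stream on lines that consist solely of '---'."""
    docs, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            if current:
                docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        docs.append("\n".join(current))
    return docs


docs = split_documents(TASKFILE)
# Task reads only the first document; the rest is free for other tooling.
taskfile_doc, extra_docs = docs[0], docs[1:]

if __name__ == "__main__":
    print(extra_docs)
```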

trulede avatar Jan 05 '25 14:01 trulede

We do use this kind of "metadata" extension with other YAML-based config quite extensively. We already pull out a lot from the desc/summary sections, but having more structured metadata will be a big +

It's very neat 1) to document/annotate inside the config and 2) to have it processable using jq/yq. For us the main thing would be that a flexible parent key is present, so we are free to use any structure below it. That gives the most flexibility for the various needs.

hans-d avatar Jan 27 '25 09:01 hans-d

For integration with workflow tools, or other UI, we have two concerns:

  1. Domain-specific metadata, not related to task operation - propose a root-level metadata extension to the schema.
  2. Task operation related metadata, specifically for vars, including -
     • documentation - propose task-level metadata OR schema extension variable:desc
     • required indicator - propose task-level metadata OR infer from task:requires:vars
     • default value - propose task-level metadata OR schema extension variable:default OR extract from the templating string (i.e. extract the default part from the template string).

Since duplicating information is already part of this problem, those last two points can be solved with "a bit of code". However the documentation question remains.

version: '3'

metadata:  # New schema item.
  package:
    download: '{{.REPO}}/releases/download/v{{.TAG}}/Fmi-{{.TAG}}-{{.PLATFORM_ARCH}}.zip'
  container:
    repository: ghcr.io/boschglobal/dse-fmi

tasks:
  generate-fmimcl:
    metadata: # New schema item.
      vars:
        FMU_DIR:
          required: true
          hint: URI identifying an FMU (select from 'uses' or manually enter a path).
    vars:
      # Using existing schema, additional info in task:metadata
      FMU_DIR: '{{.FMU_DIR | fail "FMU URI not provided!"}}'
      # Extend schema, additional info inline.
      FMU_DIR_ALT: 
        desc: URI identifying an FMU (select from 'uses' or manually enter a path).
        default: '{{.FMU_DIR | fail "FMU URI not provided!"}}'
    cmds:
      - task: gen-mcl
        vars:
          FMU_DIR: '{{.FMU_DIR}}'
    requires:
      vars: [FMU_DIR]
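
The "bit of code" for the required indicator and the default value could look roughly like the following Python sketch. It assumes the task has already been loaded into a plain dict by a YAML parser, and that defaults are written in the 'default "..."' template form used in the earlier generate-fmimcl Taskfile; the names DEFAULT_RE and infer_var_info are made up for this illustration.

```python
import re

# Matches the argument of a `| default "..."` step inside a template string.
DEFAULT_RE = re.compile(r'\|\s*default\s+"([^"]*)"')


def infer_var_info(task: dict) -> dict:
    """Infer required/default per variable from requires.vars and the templates."""
    required = set(task.get("requires", {}).get("vars", []))
    info = {}
    for name, template in task.get("vars", {}).items():
        match = DEFAULT_RE.search(template) if isinstance(template, str) else None
        info[name] = {
            "required": name in required,
            "default": match.group(1) if match else None,
        }
    return info


# Subset of the generate-fmimcl task from the earlier comment.
TASK = {
    "vars": {
        "FMU_DIR": '{{.FMU_DIR | default "FMU_DIR_NOT_SPECIFIED"}}',
        "OUT_DIR": '{{.OUT_DIR | default "out/model"}}',
        "MCL_PATH": '{{.MCL_PATH | default "MCL_PATH_NOT_SPECIFIED"}}',
        "FMU_MODELDESC": "{{.FMU_DIR}}/modelDescription.xml",
    },
    "requires": {"vars": ["FMU_DIR", "MCL_PATH"]},
}

if __name__ == "__main__":
    for name, details in infer_var_info(TASK).items():
        print(name, details)
```

Nothing comparable can be inferred for the documentation text, which is why that part still needs a place in the schema.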

@andreynering I would tend towards taskfile:metadata and task:metadata nodes; however, you might still want to consider richer schema extensions for other interactive use cases. If you have a preference one way or the other, I will open a PR.

timrulebosch avatar May 05 '25 16:05 timrulebosch