
.gitlab-ci.yml fails to validate because of "unknown tag !reference"

Open 749 opened this issue 1 year ago • 3 comments

A gitlab-ci file may contain the following construct to include parts of another job in the current job.

example.gitlab-ci.yml

.build-shell-options:
  before_script:
    - set -euo pipefail

.alpine-install-gitlab-essentials:
  before_script:
    - !reference [.build-shell-options, before_script]
    - 'echo "BEGIN: .alpine-install-gitlab-essentials:before_script"'

See also the GitLab docs: https://docs.gitlab.com/ee/ci/yaml/yaml_optimization.html#reference-tags

Unfortunately this causes an error similar to:

ℹ Validating ./example.gitlab-ci.yml against schema from https://gitlab.com/gitlab-org/gitlab/-/raw/master/app/assets/javascripts/editor/schema/ci.json ...
✖ unknown tag !<!reference> (30:55)

 29 |  ... 
 30 |  ... -shell-options, before_script]
-----------------------------------------^
 31 |  ... ine-install-gitlab-essentials:before_script"'

Thanks for looking into the problem! 👍

749 avatar Jun 24 '24 06:06 749

I've had a quick look at this. js-yaml refuses to parse a custom tag like this unless it is given a schema that defines the tag. Basically this is the same core issue as https://github.com/nodeca/js-yaml/issues/289

I don't want to do something GitLab-specific here. If I'm going to address this, it needs to be a generic solution. After a bit of digging, I think I basically want to do what is described in https://github.com/nodeca/js-yaml/issues/566 and create a "permissive schema", but I have not tried it out yet.

Also refs https://github.com/SchemaStore/schemastore/issues/2926

chris48s avatar Jun 24 '24 09:06 chris48s

I had a look into this.

As far as I can tell, this is going to be basically impossible to do in a completely generic way. Fundamentally, I think in order to correctly write the schema extension that allows you to parse the file, you really have to already know what input file you're expecting and how to interpret the custom tags. That means every file format that uses them would have to be a special case.

It also doesn't seem possible to just ignore or pass over unknown tags.

Switching to a different yaml parser like https://www.npmjs.com/package/yaml doesn't seem to provide any additional options.

Not really sure if there is anywhere to go with this one.

chris48s avatar Jun 25 '24 20:06 chris48s

That is unfortunate. A generic solution would have been the best option.

At least for gitlab-ci files, it is a pretty good bet that their filenames end in .gitlab-ci.yml, as most IDEs use this to identify the file type. Maybe this is a viable abstraction: selecting the schema based on the filename suffix?
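A rough sketch of that idea, just to make it concrete. The registry, the dialect names, and the detectDialect function are all hypothetical and not part of v8r; the only thing taken from this thread is the .gitlab-ci.yml suffix.

```javascript
// Hypothetical registry mapping filename suffixes to YAML dialect names.
// Neither the names nor the structure come from v8r; they are illustrative.
const dialectsBySuffix = [
  { suffix: ".gitlab-ci.yml", dialect: "gitlab-ci" },
];

// Return the dialect of the first matching suffix, or "plain-yaml" if none match.
function detectDialect(fileLocation) {
  const match = dialectsBySuffix.find(({ suffix }) =>
    fileLocation.endsWith(suffix),
  );
  return match ? match.dialect : "plain-yaml";
}

console.log(detectDialect("./example.gitlab-ci.yml")); // gitlab-ci
console.log(detectDialect("./docker-compose.yml")); // plain-yaml
```

Each dialect would still need its own tag handling behind it, so this only decides which special case applies; it does not remove the need for one.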

749 avatar Jun 27 '24 12:06 749

Hello.

I've just released v8r 4.0.0.

This release introduces a feature relevant to this issue: It is now possible to extend the functionality of v8r by installing or writing plugins.

This is relevant here because it lets you write a plugin that parses files which extend the YAML specification with custom tags.

Here's a first stab at a plugin which would allow validating .gitlab-ci.yml files that include a !reference tag. You might need to tweak the reference resolver a bit. I haven't tested it super extensively, but I have at least tested it against your example.

Note that using the referenceType schema extension is enough to get the file to parse, but you do need to actually resolve the references to put the document into a format where it makes sense to validate it against https://gitlab.com/gitlab-org/gitlab/-/raw/master/app/assets/javascripts/editor/schema/ci.json

import yaml from "js-yaml";
import { BasePlugin, Document } from "v8r";

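// Marker class for parsed !reference tags. A dedicated class avoids clashing
// with ordinary mappings that have a "ref" key (e.g. include: entries).
class Reference {
  constructor(path) {
    this.path = path;
  }
}

const referenceType = new yaml.Type("!reference", {
  kind: "sequence",
  construct: function (data) {
    return new Reference(data);
  },
});

// Replace every Reference in the parsed document with the value it points to.
const resolveReferences = function (doc) {
  const resolve = function (obj) {
    if (obj instanceof Reference) {
      // Walk the path from the document root; paths may be more than two keys deep.
      const refValue = obj.path.reduce((node, key) => node[key], doc);
      return resolve(refValue);
    } else if (Array.isArray(obj)) {
      return obj.map(resolve);
    } else if (obj && typeof obj === "object") {
      return Object.fromEntries(
        Object.entries(obj).map(([k, v]) => [k, resolve(v)]),
      );
    } else {
      return obj;
    }
  };

  return resolve(doc);
};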

class GitLabCIParser extends BasePlugin {
  static name = "v8r-plugin-gitlab-ci-parser";

  parseInputFile(contents, fileLocation, parser) {
    if (fileLocation.endsWith(".gitlab-ci.yml")) {
      const doc = yaml.load(contents, {
        schema: yaml.DEFAULT_SCHEMA.extend(referenceType),
      });
      const resolvedDoc = resolveReferences(doc);
      return new Document(resolvedDoc);
    }
  }
}

export default GitLabCIParser;

In this example, I have not attempted to implement any GitLab CI tags other than !reference. If their YAML dialect implements other custom tags, you'd need to implement support for each of them individually in your plugin.
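To make the difference between parsing and resolving a bit more concrete, here is a standalone sketch that needs neither js-yaml nor v8r. It starts from the example file at the top of this issue as it might look after parsing and shows the shape after resolving. The { $reference: [...] } placeholder and the resolveAll name are assumptions for illustration only and are not what the plugin uses.

```javascript
// The example file after parsing but before resolving. The tag is modelled
// with a hypothetical { $reference: [jobName, key] } placeholder (assumption).
const parsed = {
  ".build-shell-options": {
    before_script: ["set -euo pipefail"],
  },
  ".alpine-install-gitlab-essentials": {
    before_script: [
      { $reference: [".build-shell-options", "before_script"] },
      'echo "BEGIN: .alpine-install-gitlab-essentials:before_script"',
    ],
  },
};

// Swap each placeholder for the value it points to, looked up from the root.
function resolveAll(node, root) {
  if (Array.isArray(node)) {
    return node.map((item) => resolveAll(item, root));
  }
  if (node && typeof node === "object") {
    if (Array.isArray(node.$reference)) {
      const [jobName, key] = node.$reference;
      return resolveAll(root[jobName][key], root);
    }
    return Object.fromEntries(
      Object.entries(node).map(([k, v]) => [k, resolveAll(v, root)]),
    );
  }
  return node;
}

const resolved = resolveAll(parsed, parsed);
console.log(JSON.stringify(resolved[".alpine-install-gitlab-essentials"]));
```

Note that the referenced before_script ends up as a nested list inside the outer list rather than being spliced into it. That is the kind of detail you may want to tweak in the resolver.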

As I've said, it will not be possible to solve this class of problem in a completely generic way. The code I have posted above is very specific to GitLab CI, and each custom tag in each format will have its own semantics. I don't plan to add compatibility with this (or other custom YAML extensions) to v8r core, but I think a plugin system that lets users solve this category of problem in userspace and share solutions is a good tradeoff.

There is more documentation on plugins at https://chris48s.github.io/v8r/category/plugins/ including how to load plugins in your config file. You could take the above and save it as a local plugin in your project, or you could publish it to NPM for others to consume. Up to you.

I'm going to close this issue as you now have a way to solve this problem.

chris48s avatar Aug 19 '24 15:08 chris48s