
checkconfig: allow more advanced configuration with config file

Open Ressetkk opened this issue 3 years ago • 3 comments

What would you like to be added: As a follow-up to the discussion in https://github.com/kubernetes/test-infra/pull/26656, I would like to add support for more advanced configuration of the static checks that checkconfig runs. My suggestion is to allow a YAML file that defines the severity of each check along with per-check configuration.

Teams that use Prow would gain more control over the configuration of more advanced checks, should any show up in the future, and would be able to define strict checks, warnings, and excluded checks. Right now you can't define severity on a per-check basis.

There would be a static set of default checks that always runs, as is the case today; I believe we have a specific set of warnings that run explicitly regardless of the configuration. Every Prow maintainer could define a file, whose path is provided by a flag, with contents like the following:

```yaml
warnings:
  strict: # list of warnings that will make the check fail if they trigger
  - warning1
  - warning2
  optional: # list of warnings that will not fail the check, but will be reported
  - warning3
  - warning4
  excluded: # list of warnings that are excluded
  - warning5
  - warning6
config: # map that includes per-warning configuration
  required-annotation: # name of the warning
    required:
    - annotation1
    - annotation2
```

The config file contains only the optional per-warning configuration that a check needs.

Based on the configuration file, the program merges the default, strict, and optional warnings, removes the excluded ones, and runs the resulting checks. If any strict check fails, the program exits with an error, failing the ProwJob.

The current checkconfig flags are here to stay for backwards compatibility, but if the config file flag is used, the old flags are ignored and a message saying so is shown.

I will try to work on this ASAP, as I see it as a huge usability upgrade for checkconfig, for both simple and advanced static checks on Prow config and ProwJobs.

cc @stevekuznetsov

/area prow /sig testing

Ressetkk avatar Aug 04 '22 11:08 Ressetkk

We might want to decide whether or not new features are enabled by default - what's the current situation with flags? Perhaps the config could list disabled checks and those that are demoted to informational only, and you get new ones for free?

/cc @cjwagner @alvaroaleman

stevekuznetsov avatar Aug 04 '22 16:08 stevekuznetsov

> We might want to decide whether or not new features are enabled by default - what's the current situation with flags? Perhaps the config could list disabled checks and those that are demoted to informational only, and you get new ones for free?
>
> /cc @cjwagner @alvaroaleman

That sounds good for generally applicable checks. I think we may still want some new checks to be opt-in so that we can include checks for optional policies that may not apply to all instances. Hopefully most new checks can be enabled by default though.

cjwagner avatar Aug 05 '22 22:08 cjwagner

From what I see in the code, there is a list of default checks that is enabled only if no --warnings flag is provided, or when --include-default-warnings is passed along with --warnings. I would want to keep the current behavior - we define the list of generally enabled checks in the code, and they would always run. Via the file we could opt in to additional checks, make some of the default checks optional or strict, or disable those that don't apply to a specific instance.

The best example of an opt-in check would be required-job-annotations. Most people wouldn't want this check enabled, but I would really benefit from having it enabled in optional mode while keeping the old checks in strict mode. Additionally, some policies could require more personalized per-instance configuration. Right now that's not possible to achieve: either any warning fails the program, or none does.

My idea is to provide a more organized way of enabling and disabling checks, with the default ones running unless explicitly disabled, and additionally a way to define a severity for each check.

Ressetkk avatar Aug 05 '22 22:08 Ressetkk

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 26 '23 08:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 25 '23 08:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 27 '23 09:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
>   • After 90d of inactivity, lifecycle/stale is applied
>   • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
>   • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
>
> You can:
>
>   • Reopen this issue with /reopen
>   • Mark this issue as fresh with /remove-lifecycle rotten
>   • Offer to help out with Issue Triage
>
> Please send feedback to sig-contributor-experience at kubernetes/community.
>
> /close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 27 '23 09:03 k8s-ci-robot

@Ressetkk: are any further actions needed from our side?

tobiscr avatar Apr 17 '23 09:04 tobiscr

Nope, we didn't plan to do anything there, so the topic died out due to lack of time.

Ressetkk avatar Apr 17 '23 10:04 Ressetkk