
Add public log vendor matcher periodic job

Open BobyMCbobs opened this issue 4 years ago • 15 comments

Adds a Prow periodic job for running the public-log-vendor-matcher data generation pipeline

Depends on https://github.com/kubernetes/test-infra/pull/23664 and https://github.com/kubernetes/k8s.io/pull/2710

BobyMCbobs avatar Sep 20 '21 00:09 BobyMCbobs

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BobyMCbobs

To complete the pull request process, please assign spiffxp after the PR has been reviewed. You can assign the PR to them by writing /assign @spiffxp in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Sep 20 '21 00:09 k8s-ci-robot

The pull-test-infra-bazel job says that there needs to be a dashboard made, I think?

  • configuration error for (TestGroup) public-log-vendor-matcher: Each Test Group must be referenced by at least 1 Dashboard Tab.

BobyMCbobs avatar Sep 20 '21 04:09 BobyMCbobs


Yes! We have code that configures a TestGrid view for periodic jobs automatically, so you can see historical test results. New jobs should have annotations on them that say which dashboard the results should be on.
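For context, such annotations on a periodic job might look something like the sketch below. The annotation keys are the standard TestGrid ones; the dashboard, tab, and email values are illustrative placeholders, not the actual values for this PR:

```yaml
periodics:
- name: public-log-vendor-matcher
  interval: 168h
  annotations:
    # Hypothetical dashboard name -- substitute the real target dashboard.
    testgrid-dashboards: sig-k8s-infra-example
    # Tab name shown on that dashboard for this job's results.
    testgrid-tab-name: public-log-vendor-matcher
    # Placeholder contact for failure alerts.
    testgrid-alert-email: example@kubernetes.io
```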

chases2 avatar Sep 20 '21 22:09 chases2


Thank you @chases2, that's helpful! I've just taken a look at a few more existing jobs to add those annotations.

BobyMCbobs avatar Sep 20 '21 23:09 BobyMCbobs

If you want to use a dashboard that already exists, add the annotations and delete config/testgrids/kubernetes/k8s-infra-public-pii/config.yaml. It's trying to build the dashboard twice.

If you're trying to build your own dashboard, config/testgrids/kubernetes/k8s-infra-public-pii/config.yaml is the right place to put that config, but your dashboard and dashboard group need different names from each other.
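A minimal sketch of what that config.yaml could contain, assuming the group/dashboard names below (they are placeholders; the key point is that the dashboard name differs from the group name):

```yaml
# Hypothetical config/testgrids/.../config.yaml sketch.
dashboard_groups:
- name: k8s-infra-public-pii            # the dashboard group
  dashboard_names:
  - k8s-infra-public-pii-periodics      # must differ from the group name
dashboards:
- name: k8s-infra-public-pii-periodics  # referenced by the job's annotations
```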

chases2 avatar Sep 21 '21 00:09 chases2


I think I'll go for a new dashboard, though I'm unsure whether I should be putting the dashboard with wg-k8s-infra or not. Do you have any thoughts @spiffxp?

BobyMCbobs avatar Sep 21 '21 00:09 BobyMCbobs

There was discussion a few weeks back about where this should run. From what I remember, the options were:

  • cloudbuild
  • cloudrun
  • a completely isolated PII cluster

I would appreciate some clarification on this once more.

I'm unsure how to configure jobs in cloudbuild and cloudrun, especially how to tell it a ServiceAccount.

/hold

cc @spiffxp

BobyMCbobs avatar Oct 19 '21 21:10 BobyMCbobs

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 17 '22 21:01 k8s-triage-robot

@BobyMCbobs: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-test-infra-bazel | 0c50d8e0c9330de003635678c1686eee5fb63445 | link | true | /test pull-test-infra-bazel |
| pull-test-infra-unit-test | 0c50d8e0c9330de003635678c1686eee5fb63445 | link | true | /test pull-test-infra-unit-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Mar 04 '22 22:03 k8s-ci-robot

/remove-lifecycle stale

riaankleinhans avatar Mar 15 '22 20:03 riaankleinhans

I want to help, but I don't really have context to review this - what input are you looking for from me?

thockin avatar May 23 '22 20:05 thockin

I just did a bulk scan of spiffxp-assigned PRs after nudging him about setting his GitHub account status to busy (so the robot won't assign or suggest him for now) while he is still extended OOO. I don't have context on this either; I just assigned each of these to someone else on the thread / in OWNERS.

BenTheElder avatar May 26 '22 23:05 BenTheElder

The aim of this periodic job is to automatically update the data used in BQ & Data Studio to show the costs of distributing K8s artifacts. At the moment the updates are done manually by @BobyMCbobs, but he would like to automate it. He will pick this up again when he is back in the office in 2 weeks.

riaankleinhans avatar May 30 '22 21:05 riaankleinhans

@thockin @BenTheElder, the purpose of this PR is to propose and implement automation of the public-log-asn-matcher job. There were discussions with @spiffxp about how this should run. The job of course involves access to PII in the Kubernetes public bucket logs. I would like to push the automation of this job forward. It would likely need to run with a particular GCP service account so that it can access the buckets and BigQuery datasets and produce Data Studio dashboards. The direction is likely one of the following:

  • Cloud Run
  • Cloud Build
  • Prow (somewhere); or
  • An alternate method

A goal is that it is well isolated and that only a few people on sig-k8s-infra have access to it, given the PII NDA. The job should run perhaps once a week or so.

Something I may have been stuck on when implementing this job is the service accounts: creating them, granting access, and configuring them in the job.

Which way of running it would be considered best? And what are your thoughts?
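If the Prow option were chosen, one way this is sometimes wired up is a periodic job whose pod runs under a dedicated Kubernetes ServiceAccount bound (via GKE Workload Identity, configured separately on the cluster) to a GCP service account with access to the buckets and BigQuery. This is only a sketch under that assumption; every name below (project, image, ServiceAccount, script) is a placeholder, not the actual setup:

```yaml
periodics:
- name: public-log-asn-matcher   # hypothetical job name
  interval: 168h                 # roughly once a week
  decorate: true
  spec:
    # KSA assumed to be bound to a GCP service account with
    # bucket + BigQuery access via Workload Identity.
    serviceAccountName: public-log-asn-matcher
    containers:
    - image: gcr.io/example-project/public-log-asn-matcher:latest
      command:
      - /run-pipeline.sh          # hypothetical entrypoint
```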

BobyMCbobs avatar Jun 13 '22 14:06 BobyMCbobs


@thockin, might you have any suggestion and/or would like to Pair on this?

BobyMCbobs avatar Jul 27 '22 23:07 BobyMCbobs

It's been a year. Do we want to talk about this in the next SIG call?

dims avatar Aug 23 '22 12:08 dims

/close

From the K8s-infra meeting on 31 Aug: having a periodic job does not make sense at the moment. The main purpose of the data was to find out usage by vendors, which has been accomplished. If an update to the data is required, the job can be run manually.

riaankleinhans avatar Aug 31 '22 20:08 riaankleinhans

@Riaankl: Closed this PR.

In response to this:

/close From the K8s-infra meeting on 31 Aug: having a periodic job does not make sense at the moment. The main purpose of the data was to find out usage by vendors, which has been accomplished. If an update to the data is required, the job can be run manually.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 31 '22 20:08 k8s-ci-robot