Marcel Hild

47 comments by Marcel Hild

@chrisarcand I'm glad I'm not the only one who didn't spot the difference right away. If we added Gemfile.lock to git, then changes in external libraries would not strike...

cc @bdunne @bzwei @syncrou

> I would expect i.credential to be an AnsibleTowerCredential instance, not "/api/v1/credentials/6"

The above hash is the `related` key from the API response. `i.credential` would be `6` if it was...
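The distinction between a URL under `related` and a plain numeric attribute can be illustrated with a small sketch. The payload below is hypothetical and only mirrors the shape described in the comment; `id_from_related` is an illustrative helper, not part of any Ansible Tower client library.

```python
from typing import Optional


def id_from_related(url: str) -> Optional[int]:
    """Return the trailing numeric id of a related-resource URL, if any."""
    last = url.rstrip("/").rsplit("/", 1)[-1]
    return int(last) if last.isdecimal() else None


# Hypothetical response fragment: the attribute holds the id,
# while the `related` hash holds the URL of the same resource.
payload = {
    "credential": 6,
    "related": {"credential": "/api/v1/credentials/6"},
}

related_id = id_from_related(payload["related"]["credential"])
print(related_id == payload["credential"])
```

A client that wants an object rather than a URL would still need to fetch the resource by that id; the sketch only shows how the two values relate.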

https://www.kubermatic.com/blog/monitoring-prow-resources-with-prometheus-and-grafana/ is another writeup to give some more context

* https://console-openshift-console.apps.smaug.na.operate-first.cloud/k8s/ns/opf-ci-prow/routes
* https://deck-metrics-opf-ci-prow.apps.smaug.na.operate-first.cloud/metrics
* http://ghproxy-metrics-opf-ci-prow.apps.smaug.na.operate-first.cloud/metrics
* https://hook-metrics-opf-ci-prow.apps.smaug.na.operate-first.cloud/metrics
* [grafana query up for UWM](https://grafana.operate-first.cloud/explore?orgId=1&left=%7B%22datasource%22:%22moc-smaug%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22instant%22:true,%22range%22:true,%22exemplar%22:false,%22expr%22:%22up%7Bnamespace%3D%27opf-ci-prow%27%7D%22%7D%5D,%22range%22:%7B%22from%22:%22now-1h%22,%22to%22:%22now%22%7D%7D), but I can't figure out how to query Prometheus for the actual Prow component metrics
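To see exactly which PromQL expression the Grafana Explore link above runs, its `left` parameter can be decoded; the same expression could then be sent to the instant-query endpoint (`/api/v1/query`) of a Prometheus-compatible API. This is a minimal sketch: `PROM_BASE` is a hypothetical placeholder, not an endpoint named in this thread, and the code only builds the request URL without contacting anything.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# The Explore link from the list above, split only for readability.
GRAFANA_URL = (
    "https://grafana.operate-first.cloud/explore?orgId=1&left="
    "%7B%22datasource%22:%22moc-smaug%22,%22queries%22:%5B%7B%22refId%22:%22A%22,"
    "%22instant%22:true,%22range%22:true,%22exemplar%22:false,"
    "%22expr%22:%22up%7Bnamespace%3D%27opf-ci-prow%27%7D%22%7D%5D,"
    "%22range%22:%7B%22from%22:%22now-1h%22,%22to%22:%22now%22%7D%7D"
)

# Hypothetical base URL of a Prometheus-compatible API (assumption).
PROM_BASE = "https://prometheus.example.invalid"


def explore_expr(url: str) -> str:
    """Extract the first PromQL expression from a Grafana Explore URL."""
    left = parse_qs(urlsplit(url).query)["left"][0]  # percent-decoded JSON
    return json.loads(left)["queries"][0]["expr"]


def instant_query_url(base: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


expr = explore_expr(GRAFANA_URL)
print(expr)
print(instant_query_url(PROM_BASE, expr))
```

Replacing `up` with a component metric name in the same selector would be the natural next step, assuming those metrics are actually being scraped in that namespace.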

https://github.com/operate-first/apps/pull/2393 fixes user workload monitoring but also removes the above public routes

I imported all dashboards from https://github.com/kubernetes/test-infra/tree/d917e4a1e773ccf701ac58424a0b4ee3490e8dd1/config/prow/cluster/monitoring/mixins/grafana_dashboards to https://grafana.operate-first.cloud/dashboards/f/XfwHK1M4k/tmp
```
~/src/misc/test-infra/config/prow/cluster/monitoring/mixins |   master  |  mhild ❯ make grafana-dashboards
~/src/misc/test-infra/config/prow/cluster/monitoring/mixins |   master  | ↵ SIGINT(2)...
```

Either increasing the backup frequency, as in https://github.com/CrunchyData/postgres-operator/issues/2531#issuecomment-922349084, or setting `archive_mode` to `off`, as in https://github.com/CrunchyData/postgres-operator/issues/2531#issuecomment-1022070211, might help.

Extend the PVC for psql by +2Gb and wait until the DB recovers. Then trigger a one-off backup either by:
```
oc annotate postgrescluster db --overwrite postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
```
or...