crash-diagnostics
Crash-Diagnostics (Crashd) is a tool to help investigate, analyze, and troubleshoot unresponsive or crashed Kubernetes clusters.
To get better visibility into what failed, we should split those up.
When I run crashd with a long script, it takes a long time, and I cannot see any progress information without the --debug flag.
I integrated crashd into my pipeline; when a test case fails, this is run to collect some information:
```
capture_local(cmd=collectionCmd, file_name="collection.log")
```
Due to a bug in the backend, the...
It would be useful to be able to use `crashd` to scrape AWS and vCenter directly via APIs instead of relying on data from CAPI. Information we would like to...
When trying to diagnose issues with etcd, it is sometimes necessary to review metrics provided for Prometheus and other monitoring tools that utilise open metrics. If the cluster does not...
In the current design, the crashd programmatic API relies completely on built-in functions defined in the project. However, as a user of the crashd API, it would be nice to...
It'd be nice if we could have defaulting for args, allowing them to become optional instead of hard erroring. e.g. in many cases for vsphere, the ssh user is `capv`,...
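The defaulting behaviour requested above can be sketched in plain Python. This is only an illustration of the idea, not crashd's API: the helper name `arg_or_default`, the `script_args` dictionary, and the kubeconfig path are all hypothetical, and only the `capv` default comes from the request itself.

```python
def arg_or_default(args, name, default=None):
    """Return args[name] if it is set and non-empty, else fall back to default.

    Hypothetical helper: raises only when no default was supplied,
    so arguments with a sensible default become optional.
    """
    value = args.get(name)
    if value is None or value == "":
        if default is None:
            raise KeyError(f"missing required argument: {name}")
        return default
    return value


# Hypothetical script arguments; ssh_user is deliberately omitted.
script_args = {"mgmt_kubeconfig": "/path/to/kubeconfig"}

# Falls back to "capv" instead of failing hard.
ssh_user = arg_or_default(script_args, "ssh_user", default="capv")
```

With this shape, an argument without a default still fails loudly, while one with a common value (such as the vSphere SSH user) can simply be left out.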
CAPA provides a way to deploy clusters on AWS without a jumpbox. Instead, SSM Session Manager (https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html) is used to access nodes. For such cases having an additional `ssm()` Starlark function...
Would it be possible to combine `capa_provider` and `capv_provider` into a more generic `cluster_api_provider`? Having specific providers for specific Cluster API infrastructure providers means you have to add new providers...