Make CRD parse failures in tidb-controller-manager more friendly
Feature Request
Is your feature request related to a problem? Please describe:
If I have applied a malformed TidbMonitor CRD to the cluster, tidb-controller-manager does not handle it very well:
E0522 17:51:18.048163 1 runtime.go:78] Observed a panic: &errors.errorString{s:"cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'"} (cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$')
goroutine 439 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x18d0620, 0xc0003bec30)
k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x18d0620, 0xc0003bec30)
runtime/panic.go:679 +0x1b2
k8s.io/apimachinery/pkg/api/resource.MustParse(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
k8s.io/[email protected]/pkg/api/resource/quantity.go:127 +0x1c8
github.com/pingcap/tidb-operator/pkg/monitor/monitor.getMonitorPVC(0xc00093e600, 0x44985f)
github.com/pingcap/tidb-operator@/pkg/monitor/monitor/util.go:769 +0x261
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).syncTidbMonitorPVC(0xc00088ac80, 0xc00093e600, 0xc0003bec10, 0x0, 0x8)
github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:140 +0x40
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).Sync(0xc00088ac80, 0xc00093e600, 0xc0008a5080, 0xc0006f14e0)
github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:91 +0x938
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).reconcileTidbMonitor(...)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:51
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).ReconcileTidbMonitor(0xc0008f5590, 0xc00093e600, 0x0, 0x0)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:43 +0x42
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).sync(0xc0007d9600, 0xc0006f14e0, 0x1f, 0x0, 0x0)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:146 +0x20a
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).processNextWorkItem(0xc0007d9600, 0x0)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:114 +0x100
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).worker(0xc0007d9600)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:101 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000676a60)
k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000676a60, 0x3b9aca00, 0x0, 0x1, 0xc000158420)
k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000676a60, 0x3b9aca00, 0xc000158420)
k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).Run
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:94 +0x1d9
panic: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$' [recovered]
panic: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'
goroutine 439 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x18d0620, 0xc0003bec30)
runtime/panic.go:679 +0x1b2
k8s.io/apimachinery/pkg/api/resource.MustParse(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
k8s.io/[email protected]/pkg/api/resource/quantity.go:127 +0x1c8
github.com/pingcap/tidb-operator/pkg/monitor/monitor.getMonitorPVC(0xc00093e600, 0x44985f)
github.com/pingcap/tidb-operator@/pkg/monitor/monitor/util.go:769 +0x261
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).syncTidbMonitorPVC(0xc00088ac80, 0xc00093e600, 0xc0003bec10, 0x0, 0x8)
github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:140 +0x40
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).Sync(0xc00088ac80, 0xc00093e600, 0xc0008a5080, 0xc0006f14e0)
github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:91 +0x938
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).reconcileTidbMonitor(...)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:51
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).ReconcileTidbMonitor(0xc0008f5590, 0xc00093e600, 0x0, 0x0)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:43 +0x42
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).sync(0xc0007d9600, 0xc0006f14e0, 0x1f, 0x0, 0x0)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:146 +0x20a
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).processNextWorkItem(0xc0007d9600, 0x0)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:114 +0x100
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).worker(0xc0007d9600)
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:101 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000676a60)
k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000676a60, 0x3b9aca00, 0x0, 0x1, 0xc000158420)
k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000676a60, 0x3b9aca00, 0xc000158420)
k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).Run
github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:94 +0x1d9
Describe the feature you'd like:
I'd like more information about where in the CRD the problem is, and I'd like the problem to be handled more cleanly rather than causing a crash!
Describe alternatives you've considered:
Better validate CRD before applying it to the cluster.
Teachability, Documentation, Adoption, Migration Strategy:
Better validate CRD before applying it to the cluster.
That's a good idea.
@Yisaer
we can do semantic validation in the controller or ValidatingAdmission. The validation code can be reused, and it is better to do it in the ValidatingAdmission, because cr will not be saved to etcd, if we do it in the controller, we need to report warnning event. wdyt?
ValidationAdmission is a good idea. ValidationAdmission do help us to reduce the panic before saving in the etcd, we still need to handle the situation when we meet panic in controller-manager. @zjj2wry
In fact, we do have a validation webhook for pingcap resources (tidbcluster), but now it covered little range for them.
we still need to handle the situation when we meet panic in controller-manager
agreed, because ValidationAdmission is an optional component, I think it is also needed in the controller manager
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 15 days
/assign