tidb-operator icon indicating copy to clipboard operation
tidb-operator copied to clipboard

Make CRD parse failures in tidb-controller-manager more friendly

Open kolbe opened this issue 5 years ago • 6 comments

Feature Request

Is your feature request related to a problem? Please describe:

If I have applied a malformed TidbMonitor CRD to the cluster, tidb-controller-manager does not handle it very well:

E0522 17:51:18.048163       1 runtime.go:78] Observed a panic: &errors.errorString{s:"cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'"} (cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$')
goroutine 439 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x18d0620, 0xc0003bec30)
        k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x18d0620, 0xc0003bec30)
        runtime/panic.go:679 +0x1b2
k8s.io/apimachinery/pkg/api/resource.MustParse(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        k8s.io/[email protected]/pkg/api/resource/quantity.go:127 +0x1c8
github.com/pingcap/tidb-operator/pkg/monitor/monitor.getMonitorPVC(0xc00093e600, 0x44985f)
        github.com/pingcap/tidb-operator@/pkg/monitor/monitor/util.go:769 +0x261
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).syncTidbMonitorPVC(0xc00088ac80, 0xc00093e600, 0xc0003bec10, 0x0, 0x8)
        github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:140 +0x40
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).Sync(0xc00088ac80, 0xc00093e600, 0xc0008a5080, 0xc0006f14e0)
        github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:91 +0x938
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).reconcileTidbMonitor(...)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:51
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).ReconcileTidbMonitor(0xc0008f5590, 0xc00093e600, 0x0, 0x0)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:43 +0x42
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).sync(0xc0007d9600, 0xc0006f14e0, 0x1f, 0x0, 0x0)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:146 +0x20a
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).processNextWorkItem(0xc0007d9600, 0x0)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:114 +0x100
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).worker(0xc0007d9600)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:101 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000676a60)
        k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000676a60, 0x3b9aca00, 0x0, 0x1, 0xc000158420)
        k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000676a60, 0x3b9aca00, 0xc000158420)
        k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).Run
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:94 +0x1d9
panic: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$' [recovered]
        panic: cannot parse '': quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'

goroutine 439 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x18d0620, 0xc0003bec30)
        runtime/panic.go:679 +0x1b2
k8s.io/apimachinery/pkg/api/resource.MustParse(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        k8s.io/[email protected]/pkg/api/resource/quantity.go:127 +0x1c8
github.com/pingcap/tidb-operator/pkg/monitor/monitor.getMonitorPVC(0xc00093e600, 0x44985f)
        github.com/pingcap/tidb-operator@/pkg/monitor/monitor/util.go:769 +0x261
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).syncTidbMonitorPVC(0xc00088ac80, 0xc00093e600, 0xc0003bec10, 0x0, 0x8)
        github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:140 +0x40
github.com/pingcap/tidb-operator/pkg/monitor/monitor.(*MonitorManager).Sync(0xc00088ac80, 0xc00093e600, 0xc0008a5080, 0xc0006f14e0)
        github.com/pingcap/tidb-operator@/pkg/monitor/monitor/monitor_manager.go:91 +0x938
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).reconcileTidbMonitor(...)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:51
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*defaultTidbMonitorControl).ReconcileTidbMonitor(0xc0008f5590, 0xc00093e600, 0x0, 0x0)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_control.go:43 +0x42
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).sync(0xc0007d9600, 0xc0006f14e0, 0x1f, 0x0, 0x0)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:146 +0x20a
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).processNextWorkItem(0xc0007d9600, 0x0)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:114 +0x100
github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).worker(0xc0007d9600)
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:101 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000676a60)
        k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000676a60, 0x3b9aca00, 0x0, 0x1, 0xc000158420)
        k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000676a60, 0x3b9aca00, 0xc000158420)
        k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by github.com/pingcap/tidb-operator/pkg/controller/tidbmonitor.(*Controller).Run
        github.com/pingcap/tidb-operator@/pkg/controller/tidbmonitor/tidb_monitor_controller.go:94 +0x1d9

Describe the feature you'd like:

I'd like more information about where in the CRD the problem is, and I'd like the problem to be handled more cleanly rather than causing a crash!

Describe alternatives you've considered:

Better validate CRD before applying it to the cluster.

Teachability, Documentation, Adoption, Migration Strategy:

kolbe avatar May 22 '20 17:05 kolbe

Better validate CRD before applying it to the cluster.

That's a good idea.

Yisaer avatar May 28 '20 08:05 Yisaer

@Yisaer

we can do semantic validation in the controller or ValidatingAdmission. The validation code can be reused, and it is better to do it in the ValidatingAdmission, because cr will not be saved to etcd, if we do it in the controller, we need to report warnning event. wdyt?

zjj2wry avatar Jun 01 '20 10:06 zjj2wry

ValidationAdmission is a good idea. ValidationAdmission do help us to reduce the panic before saving in the etcd, we still need to handle the situation when we meet panic in controller-manager. @zjj2wry

In fact, we do have a validation webhook for pingcap resources (tidbcluster), but now it covered little range for them.

Yisaer avatar Jun 02 '20 03:06 Yisaer

we still need to handle the situation when we meet panic in controller-manager

agreed, because ValidationAdmission is an optional component, I think it is also needed in the controller manager

zjj2wry avatar Jun 03 '20 02:06 zjj2wry

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 15 days

github-actions[bot] avatar Aug 08 '20 00:08 github-actions[bot]

/assign

mianhk avatar Dec 29 '21 07:12 mianhk