mondoo-operator
The operator fails to scan GKE autopilot clusters
Describe the bug When the operator is deployed in a GKE autopilot cluster, it does not report any assets.
https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview
This needs to be fixed because the new default for GKE clusters is autopilot: https://cloud.google.com/blog/products/containers-kubernetes/gke-autopilot-is-now-default-mode-of-cluster-operation
To Reproduce Steps to reproduce the behavior:
- Create a GKE Autopilot cluster
- Deploy the mondoo-operator with a MondooAuditConfig into it
- Note that no assets are reported
Expected behavior The operator should scan the same workloads as in other clusters.
It seems the problem is that we have this volume:

    Volumes: []corev1.Volume{
        {
            Name: "root",
            VolumeSource: corev1.VolumeSource{
                HostPath: &corev1.HostPathVolumeSource{Path: "/", Type: &unsetHostPath},
            },
        },
    },
Which gets mounted here:
    VolumeMounts: []corev1.VolumeMount{
        {
            Name:      "root",
            ReadOnly:  true,
            MountPath: "/mnt/host/",
        },
    },
Which causes the following error:
hostPath volume root used in container cnspec uses path / which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].
I see that we are currently relying on this volume for scanning:

    Spec: &v1.InventorySpec{
        Assets: []*asset.Asset{
            {
                Id:   "host",
                Name: node.Name,
                Connections: []*providers.Config{
                    {
                        Host:       "/mnt/host",
                        Backend:    providers.ProviderType_FS,
                        PlatformId: fmt.Sprintf("//platformid.api.mondoo.app/runtime/k8s/uid/%s/node/%s", clusterUID, node.UID),
                    },
                },
                Labels: map[string]string{
                    "k8s.mondoo.com/kind": "node",
                },
                ManagedBy: "mondoo-operator-" + clusterUID,
            },
        },
    },
    }
I don't have a solution currently; I just wanted to share my insights from taking a first look at this problem.
At the very least we should be able to build better handling around this case, as it currently prevents the cronjob from being created at all.
Seeing that if node scanning fails we just stop and don't even attempt to scan Kubernetes resources, I wonder whether, for Autopilot clusters, we could accept being unable to scan the nodes (currently I don't see a way to make that work) but still scan the resources in the cluster:
    nodes := nodes.DeploymentHandler{
        Mondoo:                 mondooAuditConfig,
        KubeClient:             r.Client,
        MondooOperatorConfig:   config,
        ContainerImageResolver: r.ContainerImageResolver,
        IsOpenshift:            r.RunningOnOpenShift,
    }
    result, reconcileError = nodes.Reconcile(ctx)
    if reconcileError != nil {
        log.Error(reconcileError, "Failed to set up nodes scanning")
    }
    if reconcileError != nil || result.Requeue {
        return result, reconcileError
    }

    workloads := k8s_scan.DeploymentHandler{
        Mondoo:                 mondooAuditConfig,
        KubeClient:             r.Client,
        MondooOperatorConfig:   config,
        ContainerImageResolver: r.ContainerImageResolver,
        ScanApiStore:           r.ScanApiStore,
    }
    result, reconcileError = workloads.Reconcile(ctx)
    if reconcileError != nil {
        log.Error(reconcileError, "Failed to set up Kubernetes resources scanning")
    }
    if reconcileError != nil || result.Requeue {
        return result, reconcileError
    }
Turning off node scanning already allows the resources to show up.
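For reference, node scanning is toggled through the MondooAuditConfig. Assuming I'm reading the CRD correctly, the relevant field is spec.nodes.enable (please verify the field names against your operator version):

```yaml
apiVersion: k8s.mondoo.com/v1alpha2
kind: MondooAuditConfig
metadata:
  name: mondoo-client
  namespace: mondoo-operator
spec:
  mondooCredsSecretRef:
    name: mondoo-client
  nodes:
    enable: false        # skip node scanning on Autopilot
  kubernetesResources:
    enable: true
```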
Thanks @mariuskimmina, for digging deeper into this.
It's good to know what to look for. Perhaps other ways to scan a node now work in GKE Autopilot.