clamav
Couldn't resolve HEALTH CHECK on GCP Ingress in GKE
Description
Hi, I've been working with the ClamAV Docker image (tag 1.2) in GKE for a while. GKE is fairly new to me: I've done some basic deploying and maintaining, but I may be missing some details.
This is also my first issue on GitHub, so please excuse any formatting mistakes.
Recently, I deployed the Docker image to GKE using YAML manifests (config below). I created a simple Deployment and Service.
Deployment file:
kind: Deployment
metadata:
  name: clam-av
spec:
  replicas: 1
  selector:
    matchLabels:
      run: clam-av
  template:
    metadata:
      labels:
        run: clam-av
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: XXX-XXX-pool
      terminationGracePeriodSeconds: 60
      containers:
      - name: clamav-container
        image: clamav/clamav:1.2
        resources:
          requests:
            cpu: 200m
            memory: 1Gi
        imagePullPolicy: Always
        ports:
        - containerPort: 3310
        # - containerPort: 7357
Service file:
kind: Service
metadata:
  name: clam-av-service
  annotations:
    cloud.google.com/backend-config: '{"default": "backend-for-clamAV"}'
spec:
  selector:
    run: clam-av
  ports:
  - name: http3310
    protocol: TCP
    port: 80
    targetPort: 3310
  # - name: http7357
  #   protocol: TCP
  #   port: 80
  #   targetPort: 7357
  type: ClusterIP
I created a route on the GCP Ingress that maps to this service, and a simple BackendConfig for it:
kind: BackendConfig
metadata:
  name: backend-for-clamAV
spec:
  timeoutSec: 150
  connectionDraining:
    drainingTimeoutSec: 150
  healthCheck:
    checkIntervalSec: 15
    port: 80
    type: HTTP
    requestPath: /
    healthyThreshold: 1
    unhealthyThreshold: 3
    timeoutSec: 15
Whenever I try to access the page via the domain I've used in the Ingress, I get the following page (screenshot below):
I also see this warning message about the HEALTH CHECK on the GKE Ingress page.
I've tested this Docker image locally in Docker and it works fine. I also created a simple C# web app that acts as a client and runs virus scans on files as they are uploaded. IT WORKS FINE!
Note: When I expose the service as type LoadBalancer, I can access the page and also connect from my C# application. But when I map it to the Ingress with a route, I don't get the working page I see with local Docker, and my application can't connect either.
I think the issue is in resolving the HEALTH CHECK in the GKE Ingress.
If the issue is something else that I've missed, please let me know.
Please share your suggestions on how to resolve this. Thank you!
@Vikram-Raghu As your ClamAV application works fine locally and through load balancers, I can only suggest two things here.
First, check the clamav-app's connectivity to the internet, which might not be available with the ingress settings.
Second, if starting up the clamav-app service takes time because it downloads databases, you need some readiness/health check on it.
Not sure what else could be the issue from the ClamAV point of view. This seems to be more about how you have created the wrapper (C#) around ClamAV and the Kubernetes service config.
@rsundriyal Thank you for your time and suggestions.
For your first suggestion: I've created a route with the domain added in the Ingress. Is this what you mean by ingress settings?
For the second: I did try adding readiness and liveness probes to the Deployment file:
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /
    port: 3310
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 15
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /
    port: 3310
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 15
But after adding these probes the pod keeps failing and can't start again. It kept crashing and had restarted more than 16 times when I last checked, so I removed the probes from the Deployment file. Should I increase the thresholds to give the pod time to fully download the databases before the health check runs?
Regarding the C# wrapper around ClamAV: I think the problem is in the GKE health check or on the ClamAV hosting side. I haven't hosted this application in GKE in any pod. I only use it for testing from my local machine, to connect to the ClamAV server for file scanning.
This is the page that comes up when running ClamAV in local Docker:
At first I thought it was not working. After multiple attempts to connect, I came across this wrapper method for connecting to the clamd daemon from my application, and with it, it WORKS FINE!
The same page appears when I set my ClamAV Service to type LoadBalancer, and I can connect with my local test application, which also works fine in this case.
But when I add it to the Ingress with a domain mapped to the route, I get the "no healthy upstream" page mentioned earlier in this post.
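For reference, the LoadBalancer setup described as working might look roughly like the sketch below. It reuses the names from the Service manifest earlier in this post. Exposing port 3310 directly and the port name `clamd` are assumptions, since the manifest that was actually tested is not shown in this thread. One possible reason this works while the Ingress does not: GKE Ingress provisions an HTTP(S) load balancer, whereas clamd speaks its own TCP protocol on 3310 rather than HTTP.

```yaml
# Sketch only: a LoadBalancer variant of clam-av-service.
apiVersion: v1
kind: Service
metadata:
  name: clam-av-service
spec:
  type: LoadBalancer        # passes raw TCP through, no HTTP health check involved
  selector:
    run: clam-av
  ports:
  - name: clamd             # hypothetical port name
    protocol: TCP
    port: 3310              # assumed; the original Service exposed port 80
    targetPort: 3310
```

A Service like this accepts raw TCP connections from outside the cluster, so access would likely need to be restricted before using it beyond testing.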
Hi there,
I had a similar issue when I tried to set up ClamAV as a container app. I solved it by just waiting for clamd's TCP socket to be up and running. Maybe this approach helps you too. I don't know much about GCP, but I think it supports TCP probes as well as HTTP probes.
...
readinessProbe:
  tcpSocket:
    port: 3310
  initialDelaySeconds: 20
  periodSeconds: 10
livenessProbe:
  tcpSocket:
    port: 3310
  periodSeconds: 20
...
I think this should help you: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-tcp-liveness-probe
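The earlier question about giving the pod time to download its databases could also be handled with a Kubernetes startup probe, which holds off the liveness and readiness probes until it first succeeds. Below is a minimal sketch, assuming clamd listens on 3310 as in the Deployment from this thread. The `periodSeconds` and `failureThreshold` values are illustrative guesses, not measured startup times.

```yaml
# Fragment for the clamav-container entry in the Deployment (sketch only).
startupProbe:
  tcpSocket:
    port: 3310
  periodSeconds: 10
  failureThreshold: 30      # assumed budget: up to 30 * 10s = 300s to start
readinessProbe:
  tcpSocket:
    port: 3310
  periodSeconds: 10
livenessProbe:
  tcpSocket:
    port: 3310
  periodSeconds: 20
```

With a startup probe in place, `initialDelaySeconds` on the other probes is usually not needed. An `httpGet` probe on 3310 is unlikely to pass at any threshold, because clamd answers with its own protocol rather than an HTTP 2xx response, which would be consistent with the restart loop described above.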
@snailcatcher I did the same on Azure Container Apps and enabled a log search alert, but I'm not sure whether an alert fires if the ClamAV container gets stuck in the database loading phase. See https://github.com/Cisco-Talos/clamav/issues/1282 for more.