[🐛 Bug]: Nodes Disconnecting from Hub after AKS Deployment with Helm Chart
What happened?
Our team has deployed Selenium Grid to AKS using the Helm templates in this repository. The problem is that the nodes connect to the hub only briefly: they show up in the UI, then disappear and never return. In the logs below we can see that the registration event between the node and the hub never succeeds. We are using a basic hub/node architecture with isolateComponents=false. We have disabled ingress and basic auth and are using Istio. We can access the Selenium Grid UI on the hub and we can queue tests, but they time out because no nodes are available for processing. Thanks in advance for any help resolving this.
Command used to start Selenium Grid with Docker (or Kubernetes)
global:
  seleniumGrid:
    # Image tag for all selenium components
    imageTag: 4.14.1-20231025
    #imageTag: latest
    # Image tag for browser's nodes
    nodesImageTag: 4.14.1-20231025
    #nodesImageTag: latest
    # Pull secret for all components, can be overridden individually
    imagePullSecret: secret

# Basic auth settings for Selenium Grid
basicAuth:
  # Enable or disable basic auth
  enabled: false
  # Username for basic auth
  username: admin
  # Password for basic auth
  password: admin

# Deploy Router, Distributor, EventBus, SessionMap and Nodes separately
isolateComponents: false

# Service Account for all components
serviceAccount:
  create: true
  name: ""
  annotations: {}
    # eks.amazonaws.com/role-arn: "arn:aws:iam::12345678:role/video-bucket-permissions"
istio:
  # enable flags can be used to turn on or off specific istio features
  flags:
    virtualServiceEnabled: true
  # istioVirtualService
  virtualService:
    namespace: seleniumgridpoc
    gateways:
      - seleniumgridpoc-ig
    match:
      - appEndpoints:
          - /
    destinations:
      - portNumber: 4444
        host: selenium-hub
    appTopLevelDomains:
      - seleniumgrid-sbx.company.com
# Configure the ingress resource to access the Grid installation.
ingress:
  # Enable or disable ingress resource
  enabled: false
  # Name of ingress class to select which controller will implement ingress resource
  className: ""
  # Custom annotations for ingress resource
  annotations: {}
  # Default host for the ingress resource
  #hostname: selenium-grid.local
  #hostname: seleniumgrid-sbx.company.com
  hostname: seleniumgrid-sbx.company.com
  # Default host path for the ingress resource
  path: /
  # TLS backend configuration for ingress resource
  tls: []

# ConfigMap that contains SE_EVENT_BUS_HOST, SE_EVENT_BUS_PUBLISH_PORT and SE_EVENT_BUS_SUBSCRIBE_PORT variables
busConfigMap:
  # Name of the configmap
  name: selenium-event-bus-config
  # Custom annotations for configmap
  annotations: {}

# ConfigMap that contains common environment variables for browser nodes
nodeConfigMap:
  name: selenium-node-config
  # Custom annotations for configmap
  annotations: {}
# Configuration for isolated components (applied only if `isolateComponents: true`)
components:

  # Configuration for router component
  router:
    # Router image name
    imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/router
    # Router image tag (this overwrites global.seleniumGrid.imageTag parameter)
    # imageTag: 4.14.1-20231025
    # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
    imagePullPolicy: IfNotPresent
    # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
    imagePullSecret: ""
    # Custom annotations for router pods
    annotations: {}
    # Router port
    port: 4444
    # Liveness probe settings
    livenessProbe:
      enabled: true
      path: /readyz
      initialDelaySeconds: 10
      failureThreshold: 10
      timeoutSeconds: 10
      periodSeconds: 10
      successThreshold: 1
    # Readiness probe settings
    readinessProbe:
      enabled: true
      path: /readyz
      initialDelaySeconds: 12
      failureThreshold: 10
      timeoutSeconds: 10
      periodSeconds: 10
      successThreshold: 1
    # Resources for router container
    resources: {}
    # SecurityContext for router container
    securityContext: {}
    # Kubernetes service type (see https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types)
    serviceType: ClusterIP
    # Set specific loadBalancerIP when serviceType is LoadBalancer (see https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer)
    loadBalancerIP: ""
    # Custom annotations for router service
    serviceAnnotations: {}
    # Tolerations for router pods
    tolerations: []
    # Node selector for router pods
    nodeSelector: {}
    # Priority class name for router pods
    priorityClassName: ""

  # Configuration for distributor component
  distributor:
    # Distributor image name
    imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/distributor
    # Distributor image tag (this overwrites global.seleniumGrid.imageTag parameter)
    # imageTag: 4.14.1-20231025
    # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
    imagePullPolicy: IfNotPresent
    # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
    imagePullSecret: ""
    # Custom annotations for Distributor pods
    annotations: {}
    # Distributor port
    port: 5553
    # Resources for Distributor container
    resources: {}
    # SecurityContext for Distributor container
    securityContext: {}
    # Kubernetes service type (see https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types)
    serviceType: ClusterIP
    # Custom annotations for Distributor service
    serviceAnnotations: {}
    # Tolerations for Distributor pods
    tolerations: []
    # Node selector for Distributor pods
    nodeSelector: {}
    # Priority class name for Distributor pods
    priorityClassName: ""

  # Configuration for Event Bus component
  eventBus:
    # Event Bus image name
    imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/event-bus
    # Event Bus image tag (this overwrites global.seleniumGrid.imageTag parameter)
    # imageTag: 4.14.1-20231025
    # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
    imagePullPolicy: IfNotPresent
    # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
    imagePullSecret: ""
    # Custom annotations for Event Bus pods
    annotations: {}
    # Event Bus port
    port: 5557
    # Port where events are published
    publishPort: 4442
    # Port where to subscribe for events
    subscribePort: 4443
    # Resources for event-bus container
    resources: {}
    # SecurityContext for event-bus container
    securityContext: {}
    # Kubernetes service type (see https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types)
    serviceType: ClusterIP
    # Custom annotations for Event Bus service
    serviceAnnotations: {}
    # Tolerations for Event Bus pods
    tolerations: []
    # Node selector for Event Bus pods
    nodeSelector: {}
    # Priority class name for Event Bus pods
    priorityClassName: ""

  # Configuration for Session Map component
  sessionMap:
    # Session Map image name
    imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/sessions
    # Session Map image tag (this overwrites global.seleniumGrid.imageTag parameter)
    # imageTag: 4.14.1-20231025
    # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
    imagePullPolicy: IfNotPresent
    # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
    imagePullSecret: ""
    # Custom annotations for Session Map pods
    annotations: {}
    port: 5556
    # Resources for Session Map container
    resources: {}
    # SecurityContext for Session Map container
    securityContext: {}
    # Kubernetes service type (see https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types)
    serviceType: ClusterIP
    # Custom annotations for Session Map service
    serviceAnnotations: {}
    # Tolerations for Session Map pods
    tolerations: []
    # Node selector for Session Map pods
    nodeSelector: {}
    # Priority class name for Session Map pods
    priorityClassName: ""

  # Configuration for Session Queue component
  sessionQueue:
    # Session Queue image name
    imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/session-queue
    # Session Queue image tag (this overwrites global.seleniumGrid.imageTag parameter)
    # imageTag: 4.14.1-20231025
    # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
    imagePullPolicy: IfNotPresent
    # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
    imagePullSecret: ""
    # Custom annotations for Session Queue pods
    annotations: {}
    port: 5559
    # Resources for Session Queue container
    resources: {}
    # SecurityContext for Session Queue container
    securityContext: {}
    # Kubernetes service type (see https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types)
    serviceType: ClusterIP
    # Custom annotations for Session Queue service
    serviceAnnotations: {}
    # Tolerations for Session Queue pods
    tolerations: []
    # Node selector for Session Queue pods
    nodeSelector: {}
    # Priority class name for Session Queue pods
    priorityClassName: ""

  # Custom sub path for all components
  subPath: /

  # Custom environment variables for all components
  extraEnvironmentVariables:
    # - name: SE_JAVA_OPTS
    #   value: "-Xmx512m"
    # - name:
    #   valueFrom:
    #     secretKeyRef:
    #       name: secret-name
    #       key: secret-key

  # Custom environment variables by sourcing entire configMap, Secret, etc. for all components
  extraEnvFrom:
    # - configMapRef:
    #     name: proxy-settings
    # - secretRef:
    #     name: mysecret
# Configuration for selenium hub deployment (applied only if `isolateComponents: false`)
hub:
  # Selenium Hub image name
  imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/hub
  # Selenium Hub image tag (this overwrites global.seleniumGrid.imageTag parameter)
  # imageTag: 4.14.1-20231025
  # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
  imagePullPolicy: IfNotPresent
  # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
  imagePullSecret: ""
  # Custom annotations for Selenium Hub pods
  annotations: {}
  # Custom labels for Selenium Hub pods
  labels: {}
  # Port where events are published
  publishPort: 4442
  # Port where to subscribe for events
  subscribePort: 4443
  # Selenium Hub port
  port: 4444
  # Liveness probe settings
  livenessProbe:
    enabled: true
    path: /readyz
    initialDelaySeconds: 10
    failureThreshold: 10
    timeoutSeconds: 10
    periodSeconds: 10
    successThreshold: 1
  # Readiness probe settings
  readinessProbe:
    enabled: true
    path: /readyz
    initialDelaySeconds: 12
    failureThreshold: 10
    timeoutSeconds: 10
    periodSeconds: 10
    successThreshold: 1
  # Custom sub path for the hub deployment
  subPath: /
  # Custom environment variables for selenium-hub
  extraEnvironmentVariables:
    # - name: SE_JAVA_OPTS
    #   value: "-Xmx512m"
    # - name: SECRET_VARIABLE
    #   valueFrom:
    #     secretKeyRef:
    #       name: secret-name
    #       key: secret-key
  # Custom environment variables by sourcing entire configMap, Secret, etc. for selenium-hub
  extraEnvFrom:
    # - configMapRef:
    #     name: proxy-settings
    # - secretRef:
    #     name: mysecret
  extraVolumeMounts: []
  # - name: my-extra-volume
  #   mountPath: /home/seluser/Downloads
  extraVolumes: []
  # - name: my-extra-volume
  #   emptyDir: {}
  # - name: my-extra-volume-from-pvc
  #   persistentVolumeClaim:
  #     claimName: my-pv-claim
  # Resources for selenium-hub container
  resources: {}
  # SecurityContext for selenium-hub container
  securityContext: {}
  # Kubernetes service type (see https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types)
  serviceType: ClusterIP
  # Set specific loadBalancerIP when serviceType is LoadBalancer (see https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer)
  loadBalancerIP: ""
  # Custom annotations for Selenium Hub service
  serviceAnnotations: {}
  # Tolerations for selenium-hub pods
  tolerations: []
  # Node selector for selenium-hub pods
  nodeSelector: {}
  # Priority class name for selenium-hub pods
  priorityClassName: ""
# Keda scaled object configuration
autoscaling:
  # Enable autoscaling. Implies installing KEDA
  enabled: false
  # Enable autoscaling without automatically installing KEDA
  enableWithExistingKEDA: false
  # Which type of KEDA scaling to use: job or deployment
  scalingType: job
  # Annotations for KEDA resources: ScaledObject and ScaledJob
  annotations:
    helm.sh/hook: post-install,post-upgrade
  # Options for KEDA ScaledJobs
  scaledJobOptions:
    pollingInterval: 10
    scalingStrategy:
      strategy: accurate
  deregisterLifecycle:
    preStop:
      exec:
        command:
          - bash
          - -c
          - |
            curl -X POST 127.0.0.1:5555/se/grid/node/drain --header 'X-REGISTRATION-SECRET;' && \
            while curl 127.0.0.1:5555/status; do sleep 1; done;
# Configuration for chrome nodes
chromeNode:
  # Enable chrome nodes
  enabled: true
  # NOTE: Only used when autoscaling.enabled is false
  # Enable creation of Deployment
  # true (default) - if you want long living pods
  # false - for provisioning your own custom type such as Jobs
  deploymentEnabled: true
  # Number of chrome nodes
  replicas: 1
  # Image of chrome nodes
  imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/node-chrome
  # Image of chrome nodes (this overwrites global.seleniumGrid.nodesImageTag)
  # imageTag: 4.14.1-20231025
  # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
  imagePullPolicy: IfNotPresent
  # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
  imagePullSecret: ""
  # Port list to enable on container
  ports:
    - 4442
    - 4443
  # Selenium port (spec.ports[0].targetPort in kubernetes service)
  seleniumPort: 5900
  # Selenium port exposed in service (spec.ports[0].port in kubernetes service)
  seleniumServicePort: 6900
  # Annotations for chrome-node pods
  annotations: {}
  # Labels for chrome-node pods
  labels: {}
  # Resources for chrome-node container
  resources:
    requests:
      memory: "1Gi"
      cpu: "1"
    limits:
      memory: "1Gi"
      cpu: "1"
  # SecurityContext for chrome-node container
  securityContext: {}
  # Tolerations for chrome-node pods
  tolerations: []
  # Node selector for chrome-node pods
  nodeSelector: {}
  # Custom host aliases for chrome nodes
  hostAliases:
    # - ip: "198.51.100.0"
    #   hostnames:
    #     - "example.com"
    #     - "example.net"
    # - ip: "203.0.113.0"
    #   hostnames:
    #     - "example.org"
  # Custom environment variables for chrome nodes
  extraEnvironmentVariables:
    # - name: SE_JAVA_OPTS
    #   value: "-Xmx512m"
    # - name:
    #   valueFrom:
    #     secretKeyRef:
    #       name: secret-name
    #       key: secret-key
  # Custom environment variables by sourcing entire configMap, Secret, etc. for chrome nodes
  extraEnvFrom:
    # - configMapRef:
    #     name: proxy-settings
    # - secretRef:
    #     name: mysecret
  # Service configuration
  service:
    # Create a service for node
    enabled: true
    # Service type
    type: ClusterIP
    # Custom annotations for service
    annotations: {}
  # Size limit for DSH volume mounted in container (if not set, default is "1Gi")
  dshmVolumeSizeLimit: 1Gi
  # Priority class name for chrome-node pods
  priorityClassName: ""
  # Wait for pod startup
  startupProbe: {}
  # httpGet:
  #   path: /status
  #   port: 5555
  # failureThreshold: 120
  # periodSeconds: 5
  # Liveness probe settings
  livenessProbe: {}
  # Time to wait for pod termination
  terminationGracePeriodSeconds: 30
  lifecycle: {}
  extraVolumeMounts: []
  # - name: my-extra-volume
  #   mountPath: /home/seluser/Downloads
  extraVolumes: []
  # - name: my-extra-volume
  #   emptyDir: {}
  # - name: my-extra-volume-from-pvc
  #   persistentVolumeClaim:
  #     claimName: my-pv-claim
  maxReplicaCount: 8
  minReplicaCount: 1
  hpa:
    url: '{{ include "seleniumGrid.graphqlURL" . }}'
    browserName: chrome
    # browserVersion: '91.0' # Optional. Only required when supporting multiple versions of browser in your Selenium Grid.
    unsafeSsl: 'true' # Optional
  # It is used to add a sidecars proxy in the same pod of the browser node.
  # It means it will add a new container to the deployment itself.
  # It should be set using the --set-json option
  sidecars: []
# Configuration for firefox nodes
firefoxNode:
  # Enable firefox nodes
  enabled: true
  # NOTE: Only used when autoscaling.enabled is false
  # Enable creation of Deployment
  # true (default) - if you want long living pods
  # false - for provisioning your own custom type such as Jobs
  deploymentEnabled: true
  # Number of firefox nodes
  replicas: 1
  # Image of firefox nodes
  imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/node-firefox
  # Image of firefox nodes (this overwrites global.seleniumGrid.nodesImageTag)
  # imageTag: 4.14.1-20231025
  # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
  imagePullPolicy: IfNotPresent
  # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
  imagePullSecret: ""
  # Port list to enable on container
  ports:
    - 5555
  # Selenium port (spec.ports[0].targetPort in kubernetes service)
  seleniumPort: 5900
  # Selenium port exposed in service (spec.ports[0].port in kubernetes service)
  seleniumServicePort: 6900
  # Annotations for firefox-node pods
  annotations: {}
  # Labels for firefox-node pods
  labels: {}
  # Tolerations for firefox-node pods
  tolerations: []
  # Node selector for firefox-node pods
  nodeSelector: {}
  # Resources for firefox-node container
  resources:
    requests:
      memory: "1Gi"
      cpu: "1"
    limits:
      memory: "1Gi"
      cpu: "1"
  # SecurityContext for firefox-node container
  securityContext: {}
  # Custom host aliases for firefox nodes
  hostAliases:
    # - ip: "198.51.100.0"
    #   hostnames:
    #     - "example.com"
    #     - "example.net"
    # - ip: "203.0.113.0"
    #   hostnames:
    #     - "example.org"
  # Custom environment variables for firefox nodes
  extraEnvironmentVariables:
    # - name: SE_JAVA_OPTS
    #   value: "-Xmx512m"
    # - name:
    #   valueFrom:
    #     secretKeyRef:
    #       name: secret-name
    #       key: secret-key
  # Custom environment variables by sourcing entire configMap, Secret, etc. for firefox nodes
  extraEnvFrom:
    # - configMapRef:
    #     name: proxy-settings
    # - secretRef:
    #     name: mysecret
  # Service configuration
  service:
    # Create a service for node
    enabled: true
    # Service type
    type: ClusterIP
    # Custom annotations for service
    annotations: {}
  # Size limit for DSH volume mounted in container (if not set, default is "1Gi")
  dshmVolumeSizeLimit: 1Gi
  # Priority class name for firefox-node pods
  priorityClassName: ""
  # Wait for pod startup
  startupProbe: {}
  # httpGet:
  #   path: /status
  #   port: 5555
  # failureThreshold: 120
  # periodSeconds: 5
  # Liveness probe settings
  livenessProbe: {}
  # Time to wait for pod termination
  terminationGracePeriodSeconds: 30
  lifecycle: {}
  extraVolumeMounts: []
  # - name: my-extra-volume
  #   mountPath: /home/seluser/Downloads
  extraVolumes: []
  # - name: my-extra-volume
  #   emptyDir: {}
  # - name: my-extra-volume-from-pvc
  #   persistentVolumeClaim:
  #     claimName: my-pv-claim
  maxReplicaCount: 8
  minReplicaCount: 1
  hpa:
    url: '{{ include "seleniumGrid.graphqlURL" . }}'
    browserName: firefox
  # It is used to add a sidecars proxy in the same pod of the browser node.
  # It means it will add a new container to the deployment itself.
  # It should be set using the --set-json option
  sidecars: []
# Configuration for edge nodes
edgeNode:
  # Enable edge nodes
  enabled: true
  # NOTE: Only used when autoscaling.enabled is false
  # Enable creation of Deployment
  # true (default) - if you want long living pods
  # false - for provisioning your own custom type such as Jobs
  deploymentEnabled: true
  # Number of edge nodes
  replicas: 1
  # Image of edge nodes
  imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/node-edge
  # Image of edge nodes (this overwrites global.seleniumGrid.nodesImageTag)
  # imageTag: 4.14.1-20231025
  # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
  imagePullPolicy: IfNotPresent
  # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
  imagePullSecret: ""
  ports:
    - 5555
  # Selenium port (spec.ports[0].targetPort in kubernetes service)
  seleniumPort: 5900
  # Selenium port exposed in service (spec.ports[0].port in kubernetes service)
  seleniumServicePort: 6900
  # Annotations for edge-node pods
  annotations: {}
  # Labels for edge-node pods
  labels: {}
  # Tolerations for edge-node pods
  tolerations: []
  # Node selector for edge-node pods
  nodeSelector: {}
  # Resources for edge-node container
  resources:
    requests:
      memory: "1Gi"
      cpu: "1"
    limits:
      memory: "1Gi"
      cpu: "1"
  # SecurityContext for edge-node container
  securityContext: {}
  # Custom host aliases for edge nodes
  hostAliases:
    # - ip: "198.51.100.0"
    #   hostnames:
    #     - "example.com"
    #     - "example.net"
    # - ip: "203.0.113.0"
    #   hostnames:
    #     - "example.org"
  # Custom environment variables for edge nodes
  extraEnvironmentVariables:
    # - name: SE_JAVA_OPTS
    #   value: "-Xmx512m"
    # - name:
    #   valueFrom:
    #     secretKeyRef:
    #       name: secret-name
    #       key: secret-key
  # Custom environment variables by sourcing entire configMap, Secret, etc. for edge nodes
  extraEnvFrom:
    # - configMapRef:
    #     name: proxy-settings
    # - secretRef:
    #     name: mysecret
  # Service configuration
  service:
    # Create a service for node
    enabled: true
    # Service type
    type: ClusterIP
    # Custom annotations for service
    annotations:
      hello: world
  # Size limit for DSH volume mounted in container (if not set, default is "1Gi")
  dshmVolumeSizeLimit: 1Gi
  # Priority class name for edge-node pods
  priorityClassName: ""
  # Wait for pod startup
  startupProbe: {}
  # httpGet:
  #   path: /status
  #   port: 5555
  # failureThreshold: 120
  # periodSeconds: 5
  # Liveness probe settings
  livenessProbe: {}
  # Time to wait for pod termination
  terminationGracePeriodSeconds: 30
  lifecycle: {}
  extraVolumeMounts: []
  # - name: my-extra-volume
  #   mountPath: /home/seluser/Downloads
  extraVolumes: []
  # - name: my-extra-volume
  #   emptyDir: {}
  # - name: my-extra-volume-from-pvc
  #   persistentVolumeClaim:
  #     claimName: my-pv-claim
  maxReplicaCount: 8
  minReplicaCount: 1
  hpa:
    url: '{{ include "seleniumGrid.graphqlURL" . }}'
    browserName: MicrosoftEdge
    sessionBrowserName: 'msedge'
  # It is used to add a sidecars proxy in the same pod of the browser node.
  # It means it will add a new container to the deployment itself.
  # It should be set using the --set-json option
  sidecars: []
videoRecorder:
  enabled: false
  # Image of video recorder
  imageName: company-seleniumgrid-docker-virtual.jfrog.io/selenium/video
  # Image tag of video recorder
  imageTag: ffmpeg-6.0-20231025
  # Image pull policy (see https://kubernetes.io/docs/concepts/containers/images/#updating-images)
  imagePullPolicy: IfNotPresent
  # Image pull secret (see https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/)
  # What uploader to use. See .videoRecorder.s3 for how to create a new one.
  # uploader: s3
  uploader: false
  # Where to upload the video file. Should be set to something like 's3://myvideobucket/'
  uploadDestinationPrefix: false
  ports:
    - 9000
  resources:
    requests:
      memory: "1Gi"
      cpu: "1"
    limits:
      memory: "1Gi"
      cpu: "1"
  extraEnvironmentVariables:
    # - name: SE_VIDEO_FOLDER
    #   value: /videos
  # Custom environment variables by sourcing entire configMap, Secret, etc. for video recorder.
  extraEnvFrom:
    # - configMapRef:
    #     name: proxy-settings
    # - secretRef:
    #     name: mysecret
  # Time to wait for pod termination
  terminationGracePeriodSeconds: 30
  # Wait for pod startup
  startupProbe: {}
  # httpGet:
  #   path: /
  #   port: 9000
  # failureThreshold: 120
  # periodSeconds: 5
  # Liveness probe settings
  livenessProbe: {}
  volume:
    # name:
    #   folder: video
    #   scripts: video-scripts
  # Custom video recorder back-end scripts (video.sh, video_ready.py, etc.) further by ConfigMap.
  # NOTE: For the mount point with the name "video", or "video-scripts", it will override the default. For other names, it will be appended.
  extraVolumeMounts: []
  # - name: video-scripts
  #   mountPath: /opt/bin/video.sh
  #   subPath: custom_video.sh
  # - name: video-scripts
  #   mountPath: /opt/bin/video_ready.py
  #   subPath: video_ready.py
  extraVolumes: []
  # - name: video-scripts
  #   configMap:
  #     name: my-video-scripts-cm
  #     defaultMode: 0500
  # - name: video
  #   persistentVolumeClaim:
  #     claimName: video-pv-claim
  # Container spec for the uploader if above it is defined as "uploader: s3"
  s3:
    imageName: public.ecr.aws/bitnami/aws-cli
    imageTag: "2"
    imagePullPolicy: IfNotPresent
    securityContext:
      runAsUser: 0
    command:
      - /bin/sh
    args:
      - -c
      - |
        while ! [ -p /videos/uploadpipe ]
        do
          echo Waiting for /videos/uploadpipe to be created
          sleep 1
        done
        echo Waiting for files to upload
        while read FILE DESTINATION < /videos/uploadpipe
        do
          if [ "$FILE" = "exit" ]
          then
            break
          else
            aws s3 cp --no-progress $FILE $DESTINATION
          fi
        done
    extraEnvironmentVariables:
      # - name: AWS_ACCESS_KEY_ID
      #   value: aws_access_key_id
      # - name: AWS_SECRET_ACCESS_KEY
      #   value: aws_secret_access_key
      # - name:
      #   valueFrom:
      #     secretKeyRef:
      #       name: secret-name
      #       key: secret-key

# Custom labels for k8s resources
customLabels: {}
Relevant log output
Logs from the chrome node that is not able to register with the hub:
2023-12-14 10:25:15,457 INFO Included extra file "/etc/supervisor/conf.d/selenium.conf" during parsing
2023-12-14 10:25:15,460 INFO RPC interface 'supervisor' initialized
2023-12-14 10:25:15,460 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2023-12-14 10:25:15,460 INFO supervisord started with pid 8
2023-12-14 10:25:16,462 INFO spawned: 'xvfb' with pid 10
2023-12-14 10:25:16,464 INFO spawned: 'vnc' with pid 11
2023-12-14 10:25:16,465 INFO spawned: 'novnc' with pid 12
2023-12-14 10:25:16,467 INFO spawned: 'selenium-node' with pid 13
2023-12-14 10:25:16,484 INFO success: selenium-node entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
Generating Selenium Config
Configuring server...
Setting up SE_NODE_HOST...
Setting up SE_NODE_PORT...
2023-12-14 10:25:17,538 INFO success: xvfb entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-12-14 10:25:17,538 INFO success: vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-12-14 10:25:17,538 INFO success: novnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Tracing is disabled
Selenium Grid Node configuration:
[events]
publish = "tcp://selenium-hub:4442"
subscribe = "tcp://selenium-hub:4443"
[node]
grid-url = http://selenium-hub.seleniumgridpoc:4444
session-timeout = "300"
override-max-sessions = false
detect-drivers = false
drain-after-session-count = 0
max-sessions = 1
[[node.driver-configuration]]
display-name = "chrome"
stereotype = '{"browserName": "chrome", "browserVersion": "118.0", "platformName": "Linux"}'
max-sessions = 1
Starting Selenium Grid Node...
Dec 14, 2023 10:25:17 AM org.openqa.selenium.grid.Bootstrap createExtendedClassLoader
WARNING: Extension file or directory does not exist: /opt/selenium/selenium-http-jdk-client.jar
10:25:18.527 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
10:25:18.538 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing
10:25:18.935 INFO [UnboundZmqEventBus.<init>] - Connecting to tcp://selenium-hub:4442 and tcp://selenium-hub:4443
10:25:19.133 INFO [UnboundZmqEventBus.<init>] - Sockets created
10:25:20.136 INFO [UnboundZmqEventBus.<init>] - Event bus ready
10:25:20.320 INFO [NodeServer.createHandlers] - Reporting self as: http://10.244.4.8:5555
10:25:20.339 INFO [NodeOptions.getSessionFactories] - Detected 1 available processors
10:25:20.437 INFO [NodeOptions.report] - Adding chrome for {"browserName": "chrome","browserVersion": "118.0","platformName": "linux","se:noVncPort": 7900,"se:vncEnabled": true} 1 times
10:25:20.452 INFO [Node.<init>] - Binding additional locator mechanisms: relative
10:25:20.756 INFO [NodeServer$1.start] - Starting registration process for Node http://10.244.4.8:5555
10:25:20.758 INFO [NodeServer.execute] - Started Selenium node 4.14.1 (revision 03f8ede370): http://10.244.4.8:5555
10:25:20.777 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:25:30.781 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:25:40.782 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:25:50.785 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:26:00.787 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:26:10.789 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:26:20.794 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:26:30.796 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:26:40.798 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:26:50.800 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:27:00.802 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:27:10.804 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:27:20.761 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
Operating System
AKS
Docker Selenium version (image tag)
4.14.1-20231025
Selenium Grid chart version (chart version)
0.23
@michaelmowry, thank you for creating this issue. We will troubleshoot it as soon as we can.
Info for maintainers
Triage this issue by using labels.
- If information is missing, add a helpful comment and then I-issue-template label.
- If the issue is a question, add the I-question label.
- If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.
- If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.
- After troubleshooting the issue, please add the R-awaiting answer label.
Thank you!
Hi @michaelmowry, can you try kubectl describe configmap selenium-node-config to see what SE_NODE_GRID_URL is set to there?
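For reference, a minimal way to check (a sketch; the seleniumgridpoc namespace is assumed from the values above):
# Show the ConfigMap, including the SE_NODE_GRID_URL entry
kubectl -n seleniumgridpoc describe configmap selenium-node-config
# Or dump it as YAML to see all data keys
kubectl -n seleniumgridpoc get configmap selenium-node-config -o yaml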
Thanks for the reply. SE_NODE_GRID_URL = http://selenium-hub.seleniumgridpoc:4444
@michaelmowry, can you try enabling FINE logs in the Node to see what is going wrong behind "Sending registration event..."?
chromeNode:
  extraEnvironmentVariables:
    - name: SE_OPTS
      value: "--log-level FINE"
If there is no dependency constraint, can you try the latest chart, 0.26.3, with this config passed when installing the chart:
--set global.seleniumGrid.logLevel=FINE
It simply enables FINE logs for all components.
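For example, a sketch of the full install command (the chart repo alias, release name, and namespace here are assumptions; adjust to your setup):
helm upgrade --install selenium-grid docker-selenium/selenium-grid \
  --namespace seleniumgridpoc \
  --set global.seleniumGrid.logLevel=FINE \
  -f values.yaml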
@vietnd96,
We upgraded to 0.26.3 and still get the same issue with the chrome node not connecting. The only items to note are:
- We disable basic auth.
- We use Istio for traffic control (see line 39 in values.yaml).
- isolateComponents = false, and we disable Edge, Firefox, video, and scaling just to work on connectivity with chrome nodes.
- We set a hostname on line 75 but disable ingress because we are using Istio. I don't think this is an issue, because we are able to access the Selenium Grid web console at http://seleniumgrid-sbx.company.com.
- We updated logging to FINEST.
The updated values files and logs are attached. We also validated connectivity between the chrome node and the hub via curl and have attached the logs with the failed registration for the chrome node. We still get a timeout on "Sending registration event...". We can queue tests for execution, but they also time out due to no available chrome nodes. We have tried quite a few things but haven't been able to solve this; we would appreciate any ideas.
values.yaml.txt
values-istio.yaml.txt
chrome-node-argocd-logs.txt
Honestly, I don't have much experience with Istio. Let me look around to see if I can find any clue. How about other kinds of service deployment, e.g., without Istio, via NodePort, or via Ingress?
@michaelmowry, there is another ticket that mentions the same problem when the Node registers: https://github.com/SeleniumHQ/docker-selenium/issues/1645#issuecomment-1851895016. A comment there mentioned that it can be resolved by disabling the Java OpenTelemetry feature on the Selenium process.
Can you try adding the below configs under chromeNode:
chromeNode:
  extraEnvironmentVariables:
    - name: SE_JAVA_OPTS
      value: "-Dotel.javaagent.enabled=false -Dotel.metrics.exporter=none -Dotel.sdk.disabled=true"
@vietnd96 thank you for your continued support. I tried adding the SE_JAVA_OPTS above and there is still no change in the connectivity issue. I will also look for a response from @eowoyn in the comment linked above.
@michaelmowry What role does Istio play in your Kubernetes cluster? Can it block traffic among pods within a Kubernetes namespace? I faced a different issue of a similar nature due to a Calico networking policy: Calico is zero trust by default in my setup, and I had to apply the appropriate network policy so that the Node and Hub could talk to each other.
Istio is a traffic manager within our cluster. It can block traffic within the namespace; however, we have it configured to allow all traffic within the namespace.
Calico is disabled in our namespace.
The chrome node and hub run in separate pods and have different IPs. From the chrome node log snippet below, it appears that selenium-hub is reachable on 4442 and 4443, as the sockets are created. Can anyone tell us more about how the registration event works? What port does it occur on, and what endpoint does it use to register with the hub? It is strange that the 4442/4443 connection works but the registration does not, right?
10:15:27.108 INFO [UnboundZmqEventBus.<init>] - Connecting to tcp://selenium-hub:4442 and tcp://selenium-hub:4443
10:15:27.293 INFO [UnboundZmqEventBus.<init>] - Sockets created
10:15:28.303 INFO [UnboundZmqEventBus.<init>] - Event bus ready
10:15:28.514 INFO [NodeServer.createHandlers] - Reporting self as: http://10.244.3.8:5555/
10:15:28.585 INFO [NodeOptions.getSessionFactories] - Detected 1 available processors
10:15:28.710 INFO [NodeOptions.report] - Adding chrome for {"browserName": "chrome","browserVersion": "120.0","goog:chromeOptions": {"binary": "\u002fusr\u002fbin\u002fgoogle-chrome"},"platformName": "linux","se:noVncPort": 7900,"se:vncEnabled": true} 1 times
10:15:28.796 INFO [Node.<init>] - Binding additional locator mechanisms: relative
10:15:29.214 INFO [NodeServer$1.start] - Starting registration process for Node http://10.244.3.8:5555/
10:15:29.216 INFO [NodeServer.execute] - Started Selenium node 4.16.1 (revision 9b4c83354e): http://10.244.3.8:5555/
10:15:29.280 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:15:39.283 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
10:15:49.289 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
Not specific to Kubernetes, but this link may be helpful on the ports used in registration: https://www.selenium.dev/documentation/grid/getting_started/#node-and-hub-on-different-machines. A few things I would try in your situation, assuming you are using hub mode:
- Try enabling DEBUG mode via the helm chart to see if that prints more details around registration.
- Exec into the hub/node containers and check whether the pods can connect on the desired ports via the Kubernetes services (see the sketch below).
- Check the Istio logs/UI. Does Istio offer some interactive UI where you can see the traffic?
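A minimal sketch of the second check, assuming the chart's default workload names and the seleniumgridpoc namespace (replace <node-pod-ip> with the IP the node reports, e.g. 10.244.3.8 from the logs above):
# From the node: can it reach the hub's HTTP API?
kubectl -n seleniumgridpoc exec deploy/selenium-chrome-node -- curl -sf http://selenium-hub:4444/status
# From the hub: can the distributor call the node back on its reported URL?
kubectl -n seleniumgridpoc exec deploy/selenium-hub -- curl -sf http://<node-pod-ip>:5555/status
As far as I understand, registration is two-way: the node announces itself over the event bus, and the distributor then verifies the node over HTTP on the URL the node reported (http://10.244.3.8:5555 in your logs), so both directions need to be open.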
Hi, I am continuing Michael's effort from our team. The issue is still not resolved. I tried disabling the OpenTelemetry feature as mentioned in the comment https://github.com/SeleniumHQ/docker-selenium/issues/1645#issuecomment-1851895016, but it didn't work out.
I am also attaching the response from the hub and nodes when doing curl from one another. Please let me know if it rings a bell on a possible cause.
Hub to Node:
Node to Hub:
The Node also needs to reach the EventBus (ports 4442 and 4443) inside the Hub; that communication is done via TCP. Can you check if that is enabled?
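A quick TCP-level check from the node pod, since the event bus speaks raw TCP (ZeroMQ) rather than HTTP (a sketch using bash's built-in /dev/tcp; the deployment name and namespace are assumptions):
kubectl -n seleniumgridpoc exec deploy/selenium-chrome-node -- \
  bash -c 'for p in 4442 4443; do (echo > /dev/tcp/selenium-hub/$p) 2>/dev/null && echo "port $p open" || echo "port $p blocked"; done'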
Hi everyone, I am able to register the nodes by passing the Pod names in environment variables.
I have another question, on https:// calls inside nodes.
When I trigger a test using my Selenium Grid on AKS, by default the web pages under test are routed to http:// instead of https://.
Can you please help me understand the root cause of this issue?
Hi @Thomas-Personal, may I know the details of passing the Pod names in environment variables? Which env vars, and which component do they belong to? With Istio (service mesh), does it not work when using Service names?
I just tried to understand Istio and service meshes; it looks like there is one proxy sidecar per pod, so I guess that's the reason Pod names are needed for component communication. Currently, by default, the chart uses Service names only. So I am thinking about how to extend the support, then we can simplify this kind of deployment.
Hi @VietND96, we have updated the service names in the node env. By default, it was using the Pod IP to register the nodes; when we passed the service names, they got registered.
@VietND96, can you please let me know the release from which the service names are used by default? Passing the service names in the extra env variables is causing some issues during autoscaled jobs. I am using 0.26.3, but it seems to have taken the Pod IP for registration.
Hi @Thomas-Personal, you can check chart version 0.28.0 onwards.
Thank you @VietND96. I have issues with autoscaling. When the queue size is 2, two scaled jobs are triggered for the chrome node, but only one node was successful: one test case was picked up and run, and the other test case failed. I could see only one node in the UI, yet the other node also says its registration was successful.
But I am not sure what the error was. Is it because both scaled jobs use the same port? Do we need to change any configuration to see both queued test cases picked up successfully?
Hi @VietND96, in the Istio mesh, the Pod-IP-based node registration seems to be causing the problem. So I added the below in the _helpers.tpl:
- name: SE_NODE_HOST
  value: {{ .name | quote }}
Node registration is successful after including this part, but I couldn't get more than one node registered. Could you please help me with this issue?
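For reference, an alternative I am considering, which avoids patching _helpers.tpl, is injecting the pod name via the Kubernetes Downward API through the chart's extraEnvironmentVariables (a sketch; whether the bare pod name is then resolvable depends on the DNS setup, e.g. a headless service):
chromeNode:
  extraEnvironmentVariables:
    - name: SE_NODE_HOST
      valueFrom:
        fieldRef:
          fieldPath: metadata.name   # the pod's own name, resolved at runtime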
@Thomas-Personal, I have not tried this way yet, let me try to see any clue and get back to you.
Thank you so much. Please let me know the results once you have tried it. I am trying to implement it with an Istio mesh for the organization that I work for.
Hi @VietND96, I set clusterIP: None in the node service, which made it a headless service without a cluster IP, and the nodes started registering without issues.
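For context, a headless service keeps the selector but allocates no virtual IP, so DNS resolves straight to the pod IPs and there is no service VIP for the mesh to intercept. In plain Kubernetes terms, the change amounts to something like this (a sketch; the name, label, and ports follow the values above and may differ in the rendered chart):
apiVersion: v1
kind: Service
metadata:
  name: selenium-chrome-node
spec:
  clusterIP: None   # headless: no VIP, DNS returns pod IPs directly
  selector:
    app: selenium-chrome-node
  ports:
    - port: 6900
      targetPort: 5900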
I have tried with the KEDA autoscaler. I am facing two issues:
- After completion, the sidecar proxy (istio-proxy) is not terminated, because of which the pod continues to exist.
- If test cases time out before the pod spins up, the jobs are not terminating the container.
Please help me with the above two issues.
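For the first issue, one workaround I am evaluating is to ask the Envoy sidecar to exit once the main container finishes, via pilot-agent's admin endpoint (a sketch; port 15020 and the /quitquitquit path are Istio defaults and may differ per mesh, and the entry_point.sh path is taken from the docker-selenium node images):
# wrap the node's entrypoint so the sidecar is shut down when the job's work is done
/opt/bin/entry_point.sh
curl -fsX POST http://127.0.0.1:15020/quitquitquit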
@VietND96 any updates on this?