
MSSQL_AGENT_ENABLED=true doesn't work with SQL 2019 CTP3.2

Open shrankit opened this issue 5 years ago • 21 comments

I'm trying to set up replication from my Linux container running SQL 2019 CTP3.2. The two ways I know to enable SQL Agent are:

  1. use MSSQL_AGENT_ENABLED environment variable while creating container
  2. /opt/mssql/bin/mssql-conf set sqlagent.enabled true

I've tried both, and sadly neither works. Is this a known issue, or am I doing something wrong or missing something?

shrankit avatar Sep 17 '19 13:09 shrankit

I've spent the whole day trying to run a SQL Server 2019 Linux container on AKS in the same way the OP describes, and failed as well. I wanted to post an issue, but I was not the first one :)

The short version

It seems that SQL Agent, when deployed to AKS, fails to start. The likely reason, as indicated in sqlagentstartup.log, is that the Agent fails to connect to the SQL Server service itself.

SQL Agent was not able to connect to SQL Server instance [(local),1433]. Attempt 1 of 15. Will retry again in 1000ms.

I'm not very proficient with Kubernetes administration, but it seems that the connection from the pod to itself on port 1433 is somehow restricted by Kubernetes.

The long version

Now I'll describe my scenario in detail. The requirements are:

  • Run SQL Server 2019 in AKS (specifically AKS)
  • Run the SQL Server Agent, as the whole purpose is to use the CDC feature

Running the container in Docker Desktop on Windows (still a Linux container) was successful: SQL Agent is running and CDC is working. The command for this was:

docker run -e MSSQL_PID=Developer -e ACCEPT_EULA=Y -e SA_PASSWORD=StrongP@ssword! -e MSSQL_AGENT_ENABLED=true -p 1433:1433 -d mcr.microsoft.com/mssql/server:2019-CTP3.2-ubuntu

The real problem is running the same container in AKS. Here is the YAML file I use for the deployment:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql
        image: mcr.microsoft.com/mssql/server:2019-CTP3.2-ubuntu
        ports:
        - containerPort: 1433
          protocol: TCP
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
          limits:
            cpu: 2000m
            memory: 2048Mi
        env:
        - name: MSSQL_PID
          value: "Developer"
        - name: ACCEPT_EULA
          value: "Y"
        - name: MSSQL_SA_PASSWORD
          value: "StrongP@ssword!"
        - name: MSSQL_AGENT_ENABLED
          value: "true"
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
spec:
  selector:
    app: mssql
  ports:
    - protocol: TCP
      port: 1433
      targetPort: 1433
  type: LoadBalancer

The result: SQL Server is working, but the Agent is not. Here is how it looks in SQL Server Management Studio: (image)

Trying to open the properties yields this error:

SQL Server blocked access to procedure 'dbo.sp_get_sqlagent_properties' of component 'Agent XPs' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Agent XPs' by using sp_configure. For more information about enabling 'Agent XPs', search for 'Agent XPs' in SQL Server Books Online. (Microsoft SQL Server, Error: 15281)

For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&ProdVer=15.00.1800&EvtSrc=MSSQLServer&EvtID=15281&LinkId=20476

While checking the logs from within the container, I found these entries:

  • From sqlagent.out
2019-09-17 12:15:25 - ? [508] Logging SQL Server Agent messages in file '/var/opt/mssql/log/sqlagent.out'.
2019-09-17 12:15:25 - ? [101] SQLServerAgent service successfully started
2019-09-17 12:15:25 - ? [350] Waiting for SQL Server to start...
2019-09-17 12:15:28 - ? [000] Event Global\sqlserverRecComplete opened
2019-09-17 12:15:28 - ? [500] Waiting for SQL Server to recover all databases...
2019-09-17 12:15:36 - ? [100] Microsoft SQLServerAgent version 15.0.1800.32 (X64 unicode retail build) : Process ID 24
2019-09-17 12:15:36 - ? [495] The SQL Server Agent startup service account is \mssql-deploymen$.
2019-09-17 12:15:36 - ? [151] Running SQL Server Agent cross-platform
2019-09-17 12:16:06 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:16:37 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:17:08 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:17:39 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:18:10 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:18:41 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:19:12 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:19:43 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:20:14 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:20:46 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:21:17 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:21:48 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:22:19 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:22:50 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:23:21 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:23:21 - ! [000] Unable to connect to server '(local),1433'; SQLServerAgent cannot start
2019-09-17 12:23:21 - ! [103] SQLServerAgent could not be started (reason: Unable to connect to server '(local),1433'; SQLServerAgent cannot start)
2019-09-17 12:23:26 - ! [298] SQLServer Error: 11001, TCP Provider: No such host is known. [SQLSTATE 08001] 
2019-09-17 12:23:26 - ! [165] ODBC Error: 0, Login timeout expired [SQLSTATE HYT00] 
2019-09-17 12:23:26 - ! [298] SQLServer Error: 11001, A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. [SQLSTATE 08001] 
2019-09-17 12:23:26 - ! [382] Logon to server '(local),1433' failed (DisableAgentXPs)
2019-09-17 12:23:26 - ? [102] SQLServerAgent service successfully stopped
2019-09-17 12:23:26 - ? [098] SQLServerAgent terminated (normally)
  • From sqlagentstartup.log
SQL Agent was not able to connect to SQL Server instance [(local),1433]. Attempt 1 of 15. Will retry again in 1000ms.
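
To see at a glance which error dominates a long sqlagent.out like the one above, the retry lines can be tallied by error code. This is only a minimal sketch: the function name and the regular expression are illustrative assumptions based on the "(error: NNNNN)" fragment visible in the log lines quoted in this thread.

```python
import re
from collections import Counter

# Matches the "(error: NNNNN)" fragment seen in the [150] retry lines above.
ERROR_RE = re.compile(r"\(error: (\d+)\)")


def count_agent_errors(log_text: str) -> Counter:
    """Count how often each error code appears in sqlagent.out text."""
    return Counter(int(code) for code in ERROR_RE.findall(log_text))


# Three lines copied from the log excerpt in this post.
sample = """\
2019-09-17 12:15:36 - ? [151] Running SQL Server Agent cross-platform
2019-09-17 12:16:06 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-09-17 12:16:37 - ! [150] SQL Server does not accept the connection (error: 11001). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
"""

print(count_agent_errors(sample))
```

In the log above, every retry carries error 11001 ("No such host is known"), which points at name resolution rather than a blocked port.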

lancerdima avatar Sep 18 '19 09:09 lancerdima

@lancerdima hey I tried mcr.microsoft.com/mssql/server:2019-latest today and it worked for me. You can give it a try. Wanted to post an update but you beat me to it

shrankit avatar Sep 18 '19 10:09 shrankit

Just an FYI: SSMS cannot show the status of the Agent. Full description here: https://github.com/microsoft/mssql-docker/issues/412

shrankit avatar Sep 18 '19 10:09 shrankit

Does anyone know what the issue was? Is there some network config missing? I get the same error message when trying to start the agent on the latest version, though I'm not running inside a container.

niroa avatar Nov 10 '19 13:11 niroa

same issue here... This is really disappointing...

mrvladis avatar Nov 14 '19 09:11 mrvladis

I've also raised it here, FYI: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/ef641b2a-da32-44ff-81b9-8afa0c861371/sql-server-2019-linux-issue-with-sql-agent-startup?forum=sqlsetupandupgrade

niroa avatar Nov 14 '19 09:11 niroa

Yeah, I have seen it. The answers there are even more disappointing... no one has actually read the description of the problem.

mrvladis avatar Nov 14 '19 09:11 mrvladis

@mrvladis I just tried the latest GDR repo for 2019 on Ubuntu 16.04 and the agent starts up fine!

niroa avatar Nov 14 '19 10:11 niroa

Nope. I have been using latest and it does not work, but I've managed to find what the problem is!

It works fine only if you don't tune your containers! I always set the "hostname" for the container, and it seems that this host name is not resolvable even within the container itself. Once I removed the host name for the container and it fell back to the actual host name, the agent started successfully.

Thanks.

mrvladis avatar Nov 14 '19 10:11 mrvladis

How and/or where are you setting the hostname? I am experiencing the same issue and finding the exact same errors in the logs. I don't think I'm setting any hostname configuration values. I'm running this on Kubernetes, using the exact same Deployment.yml as shown above. I'm curious what needs to be changed in that YAML to get it to work.

kellymenzel avatar Nov 16 '19 20:11 kellymenzel

Hi @kellymenzel ,

I have written a few words about my experience here: https://mrvlab.co.uk/2019/11/17/sql-server-for-iot/

However, I am not entirely sure it will help you on Kubernetes because, if I remember correctly, you will have pod name resolution configured there rather than container name resolution.

In any case, what you need to check is that the container / SQL Server name (you can get it by running SELECT @@SERVERNAME) is resolvable to an IP address within the container running your MSSQL.
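
The resolvability check described here can be sketched in Python, assuming a Python interpreter is available wherever you run it (the SQL Server image itself may not ship one). The function name and the injectable `resolver` parameter are illustrative choices, not part of any SQL Server tooling.

```python
import socket


def is_resolvable(name: str, resolver=socket.gethostbyname) -> bool:
    """Return True if `name` resolves to an IP address, False otherwise."""
    try:
        resolver(name)
        return True
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False
```

Calling `is_resolvable(socket.gethostname())` from inside the container tests the container's own hostname; passing the value returned by SELECT @@SERVERNAME tests the name SQL Server reports.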

Thanks.

mrvladis avatar Nov 17 '19 21:11 mrvladis

The root of the original problem described in this issue is that, once you start the pod in Kubernetes, @@SERVERNAME returns the first 15 characters of the Kubernetes deployment name. For example, my Deployment was named "sales-master-database", which was truncated to "sales-master-da". This name is not resolvable in Kubernetes. In the /var/opt/mssql/log/sqlagent.out file we were seeing...

SQLServer Error: 11001, TCP Provider: No such host is known. [SQLSTATE 08001]
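
The 15-character cut-off described above is easy to reproduce. The sketch below assumes plain truncation with no other transformation, which matches the names seen in the logs in this thread; the helper names are hypothetical.

```python
NAME_LIMIT = 15  # the 15-character limit reported in this thread


def predicted_server_name(hostname: str) -> str:
    """Return the first 15 characters of the hostname."""
    return hostname[:NAME_LIMIT]


def is_truncated(hostname: str) -> bool:
    """True when the hostname is longer than the limit and would be cut."""
    return len(hostname) > NAME_LIMIT


for name in ("sales-master-database", "sales-db"):
    suffix = " (truncated)" if is_truncated(name) else ""
    print(f"{name} -> {predicted_server_name(name)}{suffix}")
```

A name that comes back unchanged should still match the real hostname, so it remains resolvable from inside the pod.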

I was able to make some progress by creating a Service of type ClusterIP with this truncated name:

apiVersion: v1
kind: Service
metadata:
  name: sales-master-da
  labels:
    app: sales-master-database
spec:
  type: ClusterIP
  sessionAffinity: None
  ports:
  - name: tcp
    port: 1433
  selector:
    app: sales-master-database

While this resolved the No such host is known error message, SQL Agent still didn't start up because of a different error. Now I am receiving a login failed error...

mssql@sales-master-database-deployment-7485bb479c-gt45v:/$ cat /var/opt/mssql/log/sqlagent.out 
2019-11-21 15:31:58 - ? [508] Logging SQL Server Agent messages in file '/var/opt/mssql/log/sqlagent.out'.
2019-11-21 15:31:58 - ? [101] SQLServerAgent service successfully started
2019-11-21 15:31:58 - ? [350] Waiting for SQL Server to start...
2019-11-21 15:32:01 - ? [000] Event Global\sqlserverRecComplete opened
2019-11-21 15:32:01 - ? [500] Waiting for SQL Server to recover all databases...
2019-11-21 15:32:05 - ? [100] Microsoft SQLServerAgent version 15.0.2000.5 (X64 unicode retail build) : Process ID 28
2019-11-21 15:32:05 - ? [495] The SQL Server Agent startup service account is \sales-master-da$.
2019-11-21 15:32:05 - ? [151] Running SQL Server Agent cross-platform
2019-11-21 15:32:10 - ! [150] SQL Server does not accept the connection (error: 10061). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:40 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:41 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:42 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:43 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:44 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:45 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:46 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:47 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:48 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:49 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:50 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:51 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:52 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:53 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:55 - ! [150] SQL Server does not accept the connection (error: 18452). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2019-11-21 15:32:55 - ! [000] Unable to connect to server '(local),1433'; SQLServerAgent cannot start
2019-11-21 15:32:55 - ! [103] SQLServerAgent could not be started (reason: Unable to connect to server '(local),1433'; SQLServerAgent cannot start)
2019-11-21 15:32:55 - ! [298] SQLServer Error: 18452, Login failed. The login is from an untrusted domain and cannot be used with Integrated authentication. [SQLSTATE 28000] 
2019-11-21 15:32:55 - ! [382] Logon to server '(local),1433' failed (DisableAgentXPs)
2019-11-21 15:32:55 - ? [102] SQLServerAgent service successfully stopped
2019-11-21 15:32:55 - ? [098] SQLServerAgent terminated (normally)

Then I thought of trying to run this workload as a Pod resource rather than a Deployment. Since the NetBIOS name seems to be causing problems inside the container, deploying the workload to Kubernetes as a Pod lets me control the name of the pod and set it to something shorter than 15 characters (the name of a pod in a Deployment is auto-generated by Kubernetes and is much longer than 15 characters). When I did this, the deployment worked and SQL Agent was started and running. Here's my Pod spec:

apiVersion: v1
kind: Pod
metadata:
  name: sales-db
  labels:
    app: sales-master-database
spec:
  terminationGracePeriodSeconds: 10
  containers:
  - name: sales-master-database
    image: mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
    imagePullPolicy: IfNotPresent
    env:
    - name: MSSQL_PID
      value: "Developer"
    - name: ACCEPT_EULA
      value: "Y"
    - name: SA_PASSWORD
      value: "StrongP@ssword!"
    - name: MSSQL_AGENT_ENABLED
      value: "true"
    ports:
    - containerPort: 1433
      protocol: TCP
    resources:
      limits:
        memory: 8Gi
    volumeMounts:
    - name: mssql-volume
      mountPath: /var/opt/mssql
  volumes:
  - name: mssql-volume
    persistentVolumeClaim:
      claimName: sales-master-database-pvc

This will get me by for now, but I think this needs to be fixed in SQL Server somehow.

kellymenzel avatar Nov 21 '19 19:11 kellymenzel

Hi Everyone,

We are working on a solution for this at the moment. Will update this thread when we have something we can share.

vin-yu avatar Nov 22 '19 00:11 vin-yu

As a workaround, you could set the hostname field on the Pod Spec for your Deployment. This will set the hostname in the pod and will be what’s set for @@ServerName. Keep in mind this is at the pod level, rather than the container.

kubectl explain deployment.spec.template.spec.hostname
KIND:     Deployment
VERSION:  extensions/v1beta1

FIELD:    hostname <string>

DESCRIPTION:
     Specifies the hostname of the Pod If not specified, the pod's hostname will
     be set to a system-defined value.
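
As a rough illustration of this workaround, the sketch below sets `spec.template.spec.hostname` on a Deployment manifest held as a plain Python dict. The helper name, the example hostname "mssql", and the 15-character guard (taken from the limit reported elsewhere in this thread) are all assumptions made for the example.

```python
import copy
import json


def with_pod_hostname(deployment: dict, hostname: str) -> dict:
    """Return a copy of a Deployment manifest with the pod hostname set."""
    if len(hostname) > 15:
        # Guard based on the truncation behaviour reported in this thread.
        raise ValueError("hostname longer than 15 characters would be truncated")
    patched = copy.deepcopy(deployment)
    patched["spec"]["template"]["spec"]["hostname"] = hostname
    return patched


# Trimmed-down manifest; a real one also carries env, ports, volumes, etc.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "mssql-deployment"},
    "spec": {"template": {"spec": {"containers": [{"name": "mssql"}]}}},
}

patched = with_pod_hostname(deployment, "mssql")
print(json.dumps(patched["spec"]["template"]["spec"], indent=2))
```

The original dict is left untouched, so the patched manifest can be compared against it before applying.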

nocentino avatar Nov 22 '19 16:11 nocentino

I was able to work around this on premises; the solution should also work for AKS (not tested).

First, SQL agent does not look at @@SERVERNAME. This field is stored in the database.

Run this query and you may see that @@SERVERNAME and SERVERNAME are not the same.

SELECT SERVERPROPERTY(N'servername') 
SELECT @@SERVERNAME

SQL agent looks at SERVERNAME, which is the hostname of the pod on k8s. First it will try to connect to this address on port 1433. If that fails, it tries localhost:1433. If that fails, agent gives up and won't start.
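
The connection order described in this comment can be mimicked with a small probe. This is a sketch of the described behaviour only, not the agent's actual code; the function name and the injectable `connect` parameter are illustrative.

```python
import socket


def first_reachable(server_name, port=1433, connect=socket.create_connection, timeout=1.0):
    """Try server_name first, then localhost; return the host that accepted, or None."""
    for host in (server_name, "localhost"):
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Covers resolution failures and refused or timed-out connections.
            continue
        conn.close()
        return host
    return None
```

Run inside the pod with the value of SERVERPROPERTY(N'servername'), a result of `None` would correspond to the case where the agent gives up.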

This podSpec should fix it by creating a static hostname and hostAlias for the pod, allowing SQL agent to find SQL Server on the first attempt.

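spec:
  hostname: sqlserver
  hostAliases:
  - hostnames:
    - sqlserver
    # As posted; note that hostAliases[].ip is normally a literal IP string,
    # so the API server may reject valueFrom here.
    ip:
      valueFrom:
        fieldRef:
          fieldPath: status.podIP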

This is more of a hack than a solution. Hopefully Microsoft will provide a better solution soon

csb1582 avatar Feb 28 '20 18:02 csb1582

It seems that mssql-agent looks at SERVERNAME, checking the hostname through the @@SERVERNAME value. I guess the problem is that @@SERVERNAME (the hostname) is too long: if the hostname is over 15 characters, everything after the 15th character is cut off. I tested with both long and short (under 15 characters) hostnames, and it works well when the hostname is under 15 characters.

west0706 avatar May 13 '20 02:05 west0706

The issue is not the hostname. I mean, it is, but not directly; since k8s uses a networking container, this might not be obvious. Workaround: set the hostname explicitly within the YAML. https://stackoverflow.com/questions/34609572/is-it-possible-to-set-a-hostname-in-a-kubernetes-replication-controller

I spent a whole day debugging this issue in our environment. I used docker run --rm -v /var/run/docker.sock:/var/run/docker.sock assaflavie/runlike to dump the exact docker run command that k8s uses to spawn the container.

The issue is the container networking. As soon as --network=container:80ce6db07a67 is specified, the agent is no longer enabled (as this results in the hostname being inherited). This can be observed using docker logs by looking for these lines:

2020-07-09 10:21:50.69 spid51      Configuration option 'show advanced options' changed from 0 to 1. Run the RECONFIGURE statement to install.
2020-07-09 10:21:50.71 spid51      Configuration option 'Agent XPs' changed from 0 to 1. Run the RECONFIGURE statement to install.
2020-07-09 10:21:50.75 spid51      Configuration option 'show advanced options' changed from 1 to 0. Run the RECONFIGURE statement to install.

With container networking these do not appear within that output.
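
Checking for that line can be automated. The sketch below scans captured log text for the 'Agent XPs' change; the function name is hypothetical, and the pattern is taken from the log lines quoted above.

```python
import re

# The line that appears in the container log when the agent gets enabled.
AGENT_XPS_ON = re.compile(r"Configuration option 'Agent XPs' changed from 0 to 1")


def agent_xps_enabled(log_text: str) -> bool:
    """Return True if the log shows 'Agent XPs' being switched on."""
    return any(AGENT_XPS_ON.search(line) for line in log_text.splitlines())


# Two lines copied from the output quoted above.
sample = """\
2020-07-09 10:21:50.69 spid51      Configuration option 'show advanced options' changed from 0 to 1. Run the RECONFIGURE statement to install.
2020-07-09 10:21:50.71 spid51      Configuration option 'Agent XPs' changed from 0 to 1. Run the RECONFIGURE statement to install.
"""

print(agent_xps_enabled(sample))
```

The input would typically be the saved output of docker logs for the SQL Server container.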

The error can be reproduced using these steps:

  1. Spawn the k8s helper container manually:
docker run --name=k8s_POD_mssql-deployment-f6fc678b7-7mbjh_default_8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5_0 --hostname=mssql-deployment-f6fc678b7-7mbjh --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --network=none --label io.kubernetes.docker.type="podsandbox" --label io.kubernetes.pod.uid="8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5" --label annotation.kubernetes.io/config.source="api" --label io.kubernetes.pod.name="mssql-deployment-f6fc678b7-7mbjh" --label annotation.kubernetes.io/config.seen="2020-07-08T01:44:21.652993302Z" --label io.kubernetes.container.name="POD" --label pod-template-hash="f6fc678b7" --label app="mssql" --label io.kubernetes.pod.namespace="default" --detach=true k8s.gcr.io/pause:3.1
  2. Set the variable $helperID to the ID of the helper (spawned in the previous command)
  3. Spawn the pod manually:
docker run --name=k8s_mssql_mssql-deployment-f6fc678b7-7mbjh_default_8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5_0 --user=mssql --env=MSSQL_PID=Developer --env=ACCEPT_EULA=Y --env=MSSQL_AGENT_ENABLED=true --env=MSSQL_ENABLE_HADR=1 --env='MSSQL_SA_PASSWORD=Eh5I6ov5pRDYqGpJZa0V!' --env=MSSQL_DEPLOYMENT_SERVICE_PORT=1433 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP=tcp://10.240.21.176:1433 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP_PORT=1433 --env=KUBERNETES_PORT=tcp://10.240.16.1:443 --env=KUBERNETES_PORT_443_TCP=tcp://10.240.16.1:443 --env=KUBERNETES_PORT_443_TCP_PROTO=tcp --env=MSSQL_DEPLOYMENT_SERVICE_HOST=10.240.21.176 --env=MSSQL_DEPLOYMENT_PORT=tcp://10.240.21.176:1433 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP_PROTO=tcp --env=KUBERNETES_SERVICE_HOST=10.240.16.1 --env=KUBERNETES_SERVICE_PORT=443 --env=KUBERNETES_SERVICE_PORT_HTTPS=443 --env=KUBERNETES_PORT_443_TCP_PORT=443 --env=KUBERNETES_PORT_443_TCP_ADDR=10.240.16.1 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP_ADDR=10.240.21.176 --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --restart=no --label io.kubernetes.docker.type="container" --label annotation.io.kubernetes.pod.terminationGracePeriod="30" --label annotation.io.kubernetes.container.terminationMessagePolicy="File" --label annotation.io.kubernetes.container.hash="bdf86433" --label com.microsoft.product="Microsoft SQL Server" --label io.kubernetes.container.logpath="/var/log/pods/default_mssql-deployment-f6fc678b7-7mbjh_8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5/mssql/0.log" --label annotation.io.kubernetes.container.restartCount="0" --label annotation.io.kubernetes.container.ports="[{"containerPort":1433,"protocol":"TCP"}]" --label com.microsoft.version="15.0.4043.16" --label io.kubernetes.container.name="mssql" --label io.kubernetes.pod.uid="8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5" --label io.kubernetes.pod.name="mssql-deployment-f6fc678b7-7mbjh" --label io.kubernetes.pod.namespace="default" --label 
io.kubernetes.sandbox.id="dd8eef7d109f4dd7ceb79dddef134965eeb815dac0f90a47d06cffb1684b4e8c" --label annotation.io.kubernetes.container.terminationMessagePath="/dev/termination-log" --label vendor="Microsoft" --detach=true --network=container:$helperID mcr.microsoft.com/mssql/server /opt/mssql/bin/sqlservr

This will result in an mssql container without the agent, even though MSSQL_AGENT_ENABLED=true.

If the same command is used without the network parameter, it works:

docker run --name=k8s_mssql_mssql-deployment-f6fc678b7-7mbjh_default_8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5_0 --user=mssql --env=MSSQL_PID=Developer --env=ACCEPT_EULA=Y --env=MSSQL_AGENT_ENABLED=true --env=MSSQL_ENABLE_HADR=1 --env='MSSQL_SA_PASSWORD=Eh5I6ov5pRDYqGpJZa0V!' --env=MSSQL_DEPLOYMENT_SERVICE_PORT=1433 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP=tcp://10.240.21.176:1433 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP_PORT=1433 --env=KUBERNETES_PORT=tcp://10.240.16.1:443 --env=KUBERNETES_PORT_443_TCP=tcp://10.240.16.1:443 --env=KUBERNETES_PORT_443_TCP_PROTO=tcp --env=MSSQL_DEPLOYMENT_SERVICE_HOST=10.240.21.176 --env=MSSQL_DEPLOYMENT_PORT=tcp://10.240.21.176:1433 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP_PROTO=tcp --env=KUBERNETES_SERVICE_HOST=10.240.16.1 --env=KUBERNETES_SERVICE_PORT=443 --env=KUBERNETES_SERVICE_PORT_HTTPS=443 --env=KUBERNETES_PORT_443_TCP_PORT=443 --env=KUBERNETES_PORT_443_TCP_ADDR=10.240.16.1 --env=MSSQL_DEPLOYMENT_PORT_1433_TCP_ADDR=10.240.21.176 --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --restart=no --label io.kubernetes.docker.type="container" --label annotation.io.kubernetes.pod.terminationGracePeriod="30" --label annotation.io.kubernetes.container.terminationMessagePolicy="File" --label annotation.io.kubernetes.container.hash="bdf86433" --label com.microsoft.product="Microsoft SQL Server" --label io.kubernetes.container.logpath="/var/log/pods/default_mssql-deployment-f6fc678b7-7mbjh_8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5/mssql/0.log" --label annotation.io.kubernetes.container.restartCount="0" --label annotation.io.kubernetes.container.ports="[{"containerPort":1433,"protocol":"TCP"}]" --label com.microsoft.version="15.0.4043.16" --label io.kubernetes.container.name="mssql" --label io.kubernetes.pod.uid="8a21d3f0-7b9f-4bc4-900a-903a0cbc88d5" --label io.kubernetes.pod.name="mssql-deployment-f6fc678b7-7mbjh" --label io.kubernetes.pod.namespace="default" --label 
io.kubernetes.sandbox.id="dd8eef7d109f4dd7ceb79dddef134965eeb815dac0f90a47d06cffb1684b4e8c" --label annotation.io.kubernetes.container.terminationMessagePath="/dev/termination-log" --label vendor="Microsoft" --detach=true mcr.microsoft.com/mssql/server /opt/mssql/bin/sqlservr

I do not know why this causes the error, but it is consistent and reproducible. (A final note on the above commands: I removed the --volume parameters, as those were unrelated to the issue and would have only complicated reproducing the error.)

agowa avatar Jul 09 '20 10:07 agowa

Just like @agowa338 said, setting the hostname explicitly works.

galtonova avatar Mar 17 '21 13:03 galtonova

@west0706 wrote: It seems that mssql-agent looks at SERVERNAME, checking the hostname through the @@SERVERNAME value. I guess the problem is that @@SERVERNAME (the hostname) is too long: if the hostname is over 15 characters, everything after the 15th character is cut off. I tested with both long and short (under 15 characters) hostnames, and it works well when the hostname is under 15 characters.

This fixed the problem for me. I had deployed SQL Server with a Helm chart and the name of the service was longer than 15 characters. Make sure the name of the service for SQL Server is not longer than 15 characters and there should be no problem enabling the MSSQL Agent.

vickssi avatar May 19 '22 14:05 vickssi

@west0706 wrote: It seems that mssql-agent looks at SERVERNAME, checking the hostname through the @@SERVERNAME value. I guess the problem is that @@SERVERNAME (the hostname) is too long: if the hostname is over 15 characters, everything after the 15th character is cut off. I tested with both long and short (under 15 characters) hostnames, and it works well when the hostname is under 15 characters.

I had the same issue and this fixed it.

risb0 avatar Aug 18 '22 10:08 risb0