mssql-docker icon indicating copy to clipboard operation
mssql-docker copied to clipboard

msdtc failing with mcr.microsoft.com/mssql/server:2019-CTP3.2-ubuntu

Open hredestig opened this issue 4 years ago • 9 comments

I run docker 18.09.6 on centos 7 and followed the instructions for setting MSDTC capable SQL Server to serve an application that relies on MSDTC. That application cannot connect to my container and fails with message

System.Runtime.InteropServices.COMException (0x8004D02B): The MSDTC transaction manager was unable to pull the transaction from the source transaction manager due to communication problems. Possible causes are: a firewall is present and it doesn't have an exception for the MSDTC process, the two machines cannot find each other by their NetBIOS names, or the support for network transactions is not enabled for one of the two transaction managers. (Exception from HRESULT: 0x8004D02B)

Container started as expected with logs

	 -l /var/opt/mssql/data/mastlog.ldf
	 -e /var/opt/mssql/log/errorlog
2019-08-21 11:44:38.19 Server      SQL Server detected 12 sockets with 1 cores per socket and 1 logical processors per socket, 12 total logical processors; using 4 logical processors based on SQL Server licensing. This is an informational message; no user action is required.
2019-08-21 11:44:38.19 Server      SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.
2019-08-21 11:44:38.19 Server      Detected 40254 MB of RAM. This is an informational message; no user action is required.
2019-08-21 11:44:38.19 Server      Using conventional memory in the memory manager.
2019-08-21 11:44:38.19 Server      Page exclusion bitmap is enabled.
2019-08-21 11:44:38.24 Server      Buffer pool extension is not supported on Linux platform.
2019-08-21 11:44:38.24 Server      Buffer Pool: Allocating 8388608 bytes for 6287360 hashPages.
2019-08-21 11:44:38.34 Server      Automatic soft-NUMA was enabled because SQL Server has detected hardware NUMA nodes with greater than 8 physical cores.
2019-08-21 11:44:38.48 Server      Buffer pool extension is already disabled. No action is necessary.
2019-08-21 11:44:38.70 Server      Successfully initialized the TLS configuration. Allowed TLS protocol versions are ['1.0 1.1 1.2']. Allowed TLS ciphers are ['ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-ECDSA-AES128-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:!DHE-RSA-AES256-GCM-SHA384:!DHE-RSA-AES128-GCM-SHA256:!DHE-RSA-AES256-SHA:!DHE-RSA-AES128-SHA'].
2019-08-21 11:44:38.72 Server      Query Store settings initialized with enabled = 1, 
2019-08-21 11:44:38.73 Server      The maximum number of dedicated administrator connections for this instance is '1'
2019-08-21 11:44:38.74 Server      Node configuration: node 0: CPU mask: 0x000000000000003f:0 Active CPU mask: 0x000000000000000f:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
2019-08-21 11:44:38.74 Server      Node configuration: node 1: CPU mask: 0x0000000000000fc0:0 Active CPU mask: 0x0000000000000000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
2019-08-21 11:44:38.82 Server      Using dynamic lock allocation.  Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node.  This is an informational message only.  No user action is required.
2019-08-21 11:44:38.84 Server      In-Memory OLTP initialized on lowend machine.
2019-08-21 11:44:38.91 Server      CLR version v4.0.30319 loaded.
2019-08-21 11:44:38.91 Server      [INFO] Created Extended Events session 'hkenginexesession'
2019-08-21 11:44:38.91 Server      Database Instant File Initialization: enabled. For security and performance considerations see the topic 'Database Instant File Initialization' in SQL Server Books Online. This is an informational message only. No user action is required.
ForceFlush is enabled for this instance. 
2019-08-21 11:44:38.93 Server      Total Log Writer threads: 2. This is an informational message; no user action is required.
2019-08-21 11:44:38.94 Server      clwb is selected for pmem flush operation.
2019-08-21 11:44:38.94 Server      Software Usage Metrics is disabled.
2019-08-21 11:44:38.97 spid9s      [1]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2019-08-21 11:44:38.97 spid9s      Starting up database 'master'.
ForceFlush feature is enabled for log durability.
2019-08-21 11:44:39.37 Server      Common language runtime (CLR) functionality initialized.
2019-08-21 11:44:39.44 spid9s      Service Master Key could not be decrypted using one of its encryptions. See sys.key_encryptions for details.
2019-08-21 11:44:39.45 spid9s      An error occurred during Service Master Key initialization. SQLErrorCode=33095, State=8, LastOsError=0.
2019-08-21 11:44:39.47 spid9s      SQL Server Audit is starting the audits. This is an informational message. No user action is required.
2019-08-21 11:44:39.51 spid9s      SQL Server Audit has started the audits. This is an informational message. No user action is required.
2019-08-21 11:44:39.58 spid9s      SQL Trace ID 1 was started by login "sa".
2019-08-21 11:44:39.59 spid9s      Server name is 'balder'. This is an informational message only. No user action is required.
2019-08-21 11:44:39.63 spid29s     [4]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2019-08-21 11:44:39.63 spid27s     Always On: The availability replica manager is starting. This is an informational message only. No user action is required.
2019-08-21 11:44:39.64 spid12s     [32767]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2019-08-21 11:44:39.64 spid27s     Always On: The availability replica manager is waiting for the instance of SQL Server to allow client connections. This is an informational message only. No user action is required.
2019-08-21 11:44:39.64 spid31s     [5]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2019-08-21 11:44:39.65 spid29s     Starting up database 'msdb'.
2019-08-21 11:44:39.66 spid12s     Starting up database 'mssqlsystemresource'.
2019-08-21 11:44:39.66 spid31s     Starting up database 'DART'.
2019-08-21 11:44:39.66 spid12s     The resource database build version is 15.00.1800. This is an informational message only. No user action is required.
2019-08-21 11:44:39.71 spid12s     [3]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2019-08-21 11:44:39.71 spid12s     Starting up database 'model'.
2019-08-21 11:44:39.75 spid24s     A self-generated certificate was successfully loaded for encryption.
2019-08-21 11:44:39.75 Server      Failed to verify the Authenticode signature of 'C:\binn\secforwarder.dll'. Signature verification of SQL Server DLLs will be skipped. Genuine copies of SQL Server are signed. Failure to verify the Authenticode signature might indicate that this is not an authentic release of SQL Server. Install a genuine copy of SQL Server or contact customer support.
2019-08-21 11:44:39.76 spid24s     Server is listening on [ 'any' <ipv6> 1433].
2019-08-21 11:44:39.76 spid24s     Server is listening on [ 'any' <ipv4> 1433].
2019-08-21 11:44:39.78 Server      Server is listening on [ ::1 <ipv6> 1434].
2019-08-21 11:44:39.78 Server      Server is listening on [ 127.0.0.1 <ipv4> 1434].
2019-08-21 11:44:39.78 Server      Dedicated admin connection support was established for listening locally on port 1434.
2019-08-21 11:44:39.78 spid24s     Server is listening on [ ::1 <ipv6> 1431].
2019-08-21 11:44:39.78 spid24s     Server is listening on [ 127.0.0.1 <ipv4> 1431].
2019-08-21 11:44:39.79 spid24s     SQL Server is now ready for client connections. This is an informational message; no user action is required.
2019-08-21 11:44:39.90 spid31s     Parallel redo is started for database 'DART' with worker pool size [6].
2019-08-21 11:44:39.97 spid12s     Clearing tempdb database.
2019-08-21 11:44:40.05 spid31s     Parallel redo is shutdown for database 'DART' with worker pool size [6].
2019-08-21 11:44:40.16 Server      Failed to verify the Authenticode signature of 'C:\binn\msoledbsql.dll'. Signature verification of SQL Server DLLs will be skipped. Genuine copies of SQL Server are signed. Failure to verify the Authenticode signature might indicate that this is not an authentic release of SQL Server. Install a genuine copy of SQL Server or contact customer support.
2019-08-21 11:44:40.23 Server      Error: 37308, Severity: 16, State: 1.
2019-08-21 11:44:40.23 Server      Loaded None enclave for always encrypted.
2019-08-21 11:44:40.34 spid12s     [2]. Feature Status: PVS: 0. CTR: 0. ConcurrentPFSUpdate: 1.
2019-08-21 11:44:40.34 spid12s     Starting up database 'tempdb'.
2019-08-21 11:44:41.83 spid12s     The tempdb database has 8 data file(s).
2019-08-21 11:44:41.84 spid27s     The Service Broker endpoint is in disabled or stopped state.
2019-08-21 11:44:41.84 spid27s     The Database Mirroring endpoint is in disabled or stopped state.
2019-08-21 11:44:41.85 spid27s     Service Broker manager has started.
2019-08-21 11:44:41.85 spid9s      Recovery is complete. This is an informational message only. No user action is required.
2019-08-21 11:45:44.07 spid64      Initializing Microsoft Distributed Transaction Coordinator (MS DTC) resource manager [009a4633-001f-4b80-ba92-3e3956d757de] for server instance balder. This is an informational message only. No user action is required.
2019-08-21 11:45:45.50 spid64      Recovery of any in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC) has completed. This is an informational message only. No user action is required.
2019-08-21 11:50:08.12 spid66      Attempting to load library 'xplog70.dll' into memory. This is an informational message only. No user action is required.
2019-08-21 11:50:08.17 spid66      Using 'xplog70.dll' version '2019.150.1800' to execute extended stored procedure 'xp_msver'. This is an informational message only; no user action is required.
2019-08-21 13:23:08.07 Logon       Error: 18456, Severity: 14, State: 8.
2019-08-21 13:23:08.07 Logon       Login failed for user 'SA'. Reason: Password did not match that for the login provided. [CLIENT: 172.18.0.172]
2019-08-21 13:23:44.55 spid55      Attempting to load library 'xpsqlbot.dll' into memory. This is an informational message only. No user action is required.
2019-08-21 13:23:44.55 spid55      Using 'xpsqlbot.dll' version '2019.150.1800' to execute extended stored procedure 'xp_qv'. This is an informational message only; no user action is required.

To get a more low-level test than the application I actually target, I use dtctester on a windows client directly connected to the server, no intermediate firewall. I can ping in both directions. NetBIOS name resolves as checked by nbtstat on client to server. Output of dtctester is in line with previous error message

C:\opt\dtctest>dtctester.exe balder SA xxx
Executed: dtctester.exe
DSN:  balder
User Name: SA
Password: xxx
tablename= #dtc14928
Creating Temp Table for Testing: #dtc14928
Warning: No Columns in Result Set From Executing: 'create table #dtc14928 (ival int)'
Initializing DTC
Beginning DTC Transaction
Enlisting Connection in Transaction
Error:
SQLSTATE=25S12,Native error=-2147168220,msg='[Microsoft][ODBC SQL Server Driver]Distributed transaction error'
Error:
SQLSTATE=24000,Native error=0,msg=[Microsoft][ODBC SQL Server Driver]Invalid cursor state
Typical Errors in DTC Output When
a.  Firewall Has Ports Closed
-OR-
b.  Bad WINS/DNS entries
-OR-
c.  Misconfigured network
-OR-
d.  Misconfigured SQL Server machine that has multiple netcards.
Aborting DTC Transaction
Releasing DTC Interface Pointers
Successfully Released Transaction Pointer.
  • Should dtctester be able to connect to SQL Server container like this?
  • If not, how do I best verify that MSDTC is setup correctly?
  • And, most importantly any idea for fix?

I have enterprise Standard SQL Server license but understand that MSDTC on docker is not officially supported yet and that I should post my question here.

Some help / ideas would be most appreciated!

hredestig avatar Aug 22 '19 10:08 hredestig

I'm having the exact same issue. Have been using dtctester.exe to test MSDTC. So far no luck. I've tried all the configuration options mentioned here in the docs below. According to the docs MSDTC is supported now.

https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-msdtc?view=sql-server-ver15 https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-mssql-conf?view=sql-server-ver15 https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-msdtc-docker?view=sql-server-ver15 I cannot get it working however. I've tried both SQL 2017CU and 2019CU

These seem to be the only documented options to configure:

sudo /opt/mssql/bin/mssql-conf set distributedtransaction.turnoffrpcsecurity 1 sudo /opt/mssql/bin/mssql-conf set distributedtransaction.fallbacktounsecurerpcifnecessary 1 sudo /opt/mssql/bin/mssql-conf set distributedtransaction.allowonlysecurerpccalls 0 sudo /opt/mssql/bin/mssql-conf set network.rpcport 13500 sudo /opt/mssql/bin/mssql-conf set distributedtransaction.servertcpport 51999

dtctester gives me the exact same output as hredestig here above.

Btw, I have MSDTC working fine on a SQL Windows Container, although it took alot of time to get working. It would be great to get this working on linux as well.

eythork avatar Dec 09 '19 16:12 eythork

I'm also having the same problem someone managed to solve?

erielmiquilino avatar Mar 15 '20 05:03 erielmiquilino

I haven't found any solution yet. I'm hoping a M$ rep will see this thread and tell us how MSDTC is supposted to work on linux containers. Seems kind of weird that the official docs from Microsoft claim that MSDTC is fully supported while a basic DTC setup like this is impossible to get working.

eythork avatar Mar 17 '20 23:03 eythork

Agreed.. I gave up and went for VM based solution :/

hredestig avatar Mar 18 '20 08:03 hredestig

Hello there,

Have anyone tried to use the latest image of SQL Server 2019? mcr.microsoft.com/mssql/server:2019-CU3-ubuntu-18.04

I was checking the thread, it is from August 2019. Based on the release cycle, SQL Server 2019 was not GA at that time. I strongly recommend to use the latest image and try again, most of the known issues are fixed in the post GA release through CU's (cumulative updates).

Regards,

croblesm avatar Mar 24 '20 03:03 croblesm

Hello there,

Have anyone tried to use the latest image of SQL Server 2019? mcr.microsoft.com/mssql/server:2019-CU3-ubuntu-18.04

I was checking the thread, it is from August 2019. Based on the release cycle, SQL Server 2019 was not GA at that time. I strongly recommend to use the latest image and try again, most of the known issues are fixed in the post GA release through CU's (cumulative updates).

Regards,

No, do not work.

gsellitto avatar Apr 18 '20 15:04 gsellitto

@vin-yu , @kapilth , @rothja as FYI.

Can you please paste your container execution command line? Did you also configure the port forwarding on the host from port 135 to the "defined RPC" port that container is listening on? https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-msdtc?view=sql-server-ver15#configure-port-routing

Additionally, do you have more than one containers trying to listen on RPC port? When you are configuring port forwarding on the host IP port 135 to container's RPC port, only one container can be configured to listen to it. So if you are trying to configure multiple containers for MSDTC, then you'll want to create multiple logical IPs on the host's NIC using nmcli or network manager of your choice. Then you'll want to bind the container logical host name to those logical IPs.

Following is an example configuration for configuring multiple containers to listen to port 135 on a single host with appropriate IP bindings. In this case i have assumed a Linux VM in Azure on which i have added multiple IP addresses, one for each container using following. https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-multiple-ip-addresses-portal

In this example, I am creating two SQL container on Ubuntu host(Azure IaaS VM), which has two additional IP addresses, 172.16.0.15 and 172.16.0.19, one assigned for each SQL container. (Starting RHEL 8, you would use podman instead of docker.)

_docker run
-e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=<UseComplexPassword!1234>'
-e 'MSSQL_RPC_PORT=13500' -e 'MSSQL_DTC_TCP_PORT=51000'
-p 172.16.0.15:51433:1433 -p 172.16.0.15:13500:13500 -p 172.16.0.15:51000:51000
-h SQLContainer1 --name sqlcnt1
-d mcr.microsoft.com/mssql/server:2017-latest

docker run
-e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=<UseComplexPassword!1234>'
-e 'MSSQL_RPC_PORT=13500' -e 'MSSQL_DTC_TCP_PORT=51000'
-p 172.16.0.19:51433:1433 -p 172.16.0.19:13500:13500 -p 172.16.0.19:51000:51000
-h SQLContainer2 --name sqlcnt2
-d mcr.microsoft.com/mssql/server:2017-latest_

On the Host server for both containers, execute following to create appropriate port forwarding for appropriately assigned logical Ips to the containers. (On RHEL and starting SLES 15, port forwarding tends be easier to configure with firewall-cmd itself, where you also open the port for communication outside the VM.) _sudo iptables -t nat -A PREROUTING -d 172.16.0.15 -p tcp --dport 135 -m addrtype --dst-type LOCAL
-j DNAT --to-destination 172.16.0.15:13500 -m comment --comment RpcEndPointMapper1 sudo iptables -t nat -A OUTPUT -d 172.16.0.15 -p tcp --dport 135 -m addrtype --dst-type LOCAL
-j DNAT --to-destination 172.16.0.15:13500 -m comment --comment RpcEndPointMapper1

sudo iptables -t nat -A PREROUTING -d 172.16.0.19 -p tcp --dport 135 -m addrtype --dst-type LOCAL
-j DNAT --to-destination 172.16.0.19:13500 -m comment --comment RpcEndPointMapper2 sudo iptables -t nat -A OUTPUT -d 172.16.0.19 -p tcp --dport 135 -m addrtype --dst-type LOCAL
-j DNAT --to-destination 172.16.0.19:13500 -m comment --comment RpcEndPointMapper2_

(On RHEL, sudo firewall-cmd --permanent --add-forward-port=port=135:proto=tcp:toport=13500 sudo firewall-cmd --reload )

In /etc/hosts of each container, make entry for "HOST" level IP mapped to container name. Alternatively, you can ensure that DNS name resolution between IP and container name works appropriately which would avoid having to modify the /etc/hosts file in the container itself.

In sqlcnt1 /etc/hosts, make a host entry, 172.16.0.19 SQLContainer2 (Requires to open bash via “docker exec -it sqlcnt1 bash” in SQLcnt1 context. Then do apt update and apt install nano. And then use nano editor to modify /etc/hosts file.)

In SQLcnt2 /etc/hosts, make a host entry, 172.16.0.15 SQLContainer1 (Requires to open bash via “docker exec -it sqlcnt2 bash” in SQLcnt2 context. Then do apt update and apt install nano. And then use nano editor to modify /etc/hosts file.)

_USE [master]
GO
EXEC master.dbo.sp_addlinkedserver @server = N'172.16.0.19,51433', @srvproduct=N'SQL Server' ;
GO

EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname = N'172.16.0.19,51433', @rmtuser = 'sa', @rmtpassword = '<UseComplexPassword!1234>', @useself = N'False'; GO

set xact_abort on begin distributed transaction select * from [172.16.0.19,51433].master.dbo.sysprocesses commit Go_

Of course, you'll want to make sure on the host that firewall has appropriate port exceptions enabled. sudo ufw allow from any to any port 51433 proto tcp sudo ufw allow from any to any port 51000 proto tcp sudo ufw allow from any to any port 135 proto tcp

(on RHEL, sudo firewall-cmd --zone=public --add-port=51000/tcp --permanent sudo firewall-cmd --zone=public --add-port=51433/tcp --permanent sudo firewall-cmd --zone=public --add-port=135/tcp --permanent sudo firewall-cmd --reload )

We will try to get some documentation out for such configuration asap.

tejasaks avatar Aug 31 '20 08:08 tejasaks

This problem may be caused by HOSTNAME LOOKUP . Add host entry to /etc/hosts , and everything works

root@4f73454bf03e:/# echo "172.17.0.4	6b7401bc1c5b" >> /etc/hosts
root@6b7401bc1c5b:/# echo "172.17.0.3	4f73454bf03e" >> /etc/hosts
root@4f73454bf03e:/# cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.3	4f73454bf03e
172.17.0.4	6b7401bc1c5b
root@4f73454bf03e:/# /opt/mssql-tools/bin/sqlcmd -U sa
Password:
1> BEGIN DISTRIBUTED TRANSACTION
2> SELECT top 1 spid,lastwaittype FROM [172.17.0.4].master.dbo.sysprocesses;
3> COMMIT
4> GO
spid   lastwaittype
------ --------------------------------
     1 WAIT_XTP_HOST_WAIT

(1 rows affected)
1>
2>

z-k-q avatar Aug 02 '22 08:08 z-k-q

I'll certainly emphasize that the use of containers on VMs should ideally be limited to small dev/test environments as containers are better deployed and managed in Kubernetes environments. For example, compared to the configuration I described in an earlier post, the container with MSDTC configuration enabled can be deployed much easily in K8s environments. Following blog should give some insights into how much easier it can be to enable MSDTC for containers in K8s.

https://techcommunity.microsoft.com/t5/sql-server-blog/configure-msdtc-to-run-distributed-transactions-for-sql-server/ba-p/3259913

tejasaks avatar Aug 03 '22 17:08 tejasaks