docker-vernemq icon indicating copy to clipboard operation
docker-vernemq copied to clipboard

Can't reconfigure mqtts listener

Open maelp opened this issue 2 years ago • 57 comments

Hi, I'm trying to set up a mqtts listener, I've added the mqtts.enabled=true in the values, as well as the secretMount with the certificates (it seemed that it didn't work by adding directly the "certificates.{certs,ca,key}" as documented in the README), and when I look at logs I see this:

16:56:01.241 [error] can't reconfigure mqtts listener({127,0,0,1}, 1883) with Options [{max_connections,10000},{nr_of_acceptors,10},{mountpoint,[]},{cafile,"/etc/ssl/vernemq/ca.crt"},{depth,1},{certfile,"/etc/ssl/vernemq/tls.crt"},{eccs,[secp521r1,brainpoolP512r1,brainpoolP384r1,secp384r1,brainpoolP256r1,secp256k1,secp256r1,secp224k1,secp224r1,secp192k1,secp192r1,secp160k1,secp160r1,secp160r2]},{keyfile,"/etc/ssl/vernemq/tls.key"},{require_certificate,false},{tls_version,'tlsv1.2'},{use_identity_as_username,false},{allowed_protocol_versions,[3,4,131]},{allow_anonymous_override,false}] due to {already_started,<0.458.0>}

and trying to send data on MQTT on port 8883 fails with an error (although it works fine for 1883)

is there something I should configure?

maelp avatar Feb 23 '23 17:02 maelp

When connecting with hivemq mqtt-cli I get this log

Restriction request problem information was set but is unused in MQTT Version MQTT_3_1_1
Client '[email protected]' sending CONNECT MqttConnect{keepAlive=60, cleanSession=true, restrictions=MqttConnectRestrictions{receiveMaximum=65535, sendMaximum=65535, maximumPacketSize=268435460, sendMaximumPacketSize=268435460, topicAliasMaximum=0, sendTopicAliasMaximum=16, requestProblemInformation=true, requestResponseInformation=false}}
Client '[email protected]' DISCONNECTED null: com.hivemq.client.mqtt.exceptions.ConnectionFailedException: java.nio.channels.ClosedChannelException
Caused by: java.nio.channels.ClosedChannelException
	at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1065)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)

maelp avatar Feb 23 '23 17:02 maelp

I checked that the certificates are correct, and the 8883 port is correctly opened, so I guess the error is from the configuration?

Here is the config:

import pulumi
from pulumi_kubernetes.meta.v1 import ObjectMetaArgs
from pulumi_kubernetes.core.v1 import Namespace, Secret
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts
from .env import k8s_provider


# Install the vernemq controller on the cluster
chart_ns = Namespace("vernemq", metadata=ObjectMetaArgs(name="vernemq"))
certificates_secret = Secret(
    "vernemq-certificates-secret",
    metadata=ObjectMetaArgs(
        name="vernemq-certificates-secret", namespace=chart_ns.metadata.name
    ),
    string_data={
        "tls.crt": open(
            "secrets-data/staging/certs/gsc/gsc.crt", "r", encoding="utf-8"
        ).read(),
        "tls.key": open(
            "secrets-data/staging/certs/gsc/private.pem", "r", encoding="utf-8"
        ).read(),
        "ca.crt": open(
            "secrets-data/staging/certs/rk/root.crt", "r", encoding="utf-8"
        ).read(),
    },
)
chart = Chart(
    "vernemq",
    config=ChartOpts(
        fetch_opts={"repo": "https://vernemq.github.io/docker-vernemq"},
        chart="vernemq",
        version="1.8.1",
        namespace=chart_ns.metadata.name,
        values={
            "additionalEnv": [
                {"name": "DOCKER_VERNEMQ_ACCEPT_EULA", "value": "yes"},
                {"name": "DOCKER_VERNEMQ_ALLOW_ANONYMOUS", "value": "on"},
                {
                    "name": "DOCKER_VERNEMQ_LISTENER__SSL__CAFILE",
                    "value": "/etc/ssl/vernemq/ca.crt",
                },
                {
                    "name": "DOCKER_VERNEMQ_LISTENER__SSL__CERTFILE",
                    "value": "/etc/ssl/vernemq/tls.crt",
                },
                {
                    "name": "DOCKER_VERNEMQ_LISTENER__SSL__KEYFILE",
                    "value": "/etc/ssl/vernemq/tls.key",
                },
            ],
            "persistentVolume": {
                "enabled": "true",
                "size": "5Gi",
                # "storageClass": "..."
            },
            "service": {
                "mqtt": {"enabled": "true", "port": "1883"},
                "mqtts": {"enabled": "true", "port": "8883"},
            },
            "secretMounts": [
                {
                    "name": "vernemq-certificates",
                    "secretName": "vernemq-certificates-secret",
                    "path": "/etc/ssl/vernemq",
                }
            ],
        },
    ),
    opts=pulumi.ResourceOptions(
        provider=k8s_provider, depends_on=[chart_ns, certificates_secret]
    ),
)

maelp avatar Feb 23 '23 17:02 maelp

@maelp there's currently a minor issue coming from configuring default listeners in the release package. Will be fixed in next release. Disable as much default listeners configs as you can. Then set the MQTTS listener like the one below.

listener.ssl.default=0.0.0.0:8883
listener.ssl.default.keyfile=/etc/ssl/vernemq/tls.key
listener.ssl.default.cafile=/etc/ssl/vernemq/ca.crt
listener.ssl.default.certfile=/etc/ssl/vernemq/tls.crt

if the reconfigure error bothers you, try setting an MQTT default listener too

listener.tcp.default = 127.0.0.1:1883

To arrive at those, you have to translate to the ENV variable format here.

I'll look at that client side error later, maybe related to allowed protocol version?


:point_right: Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq :point_right: Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

ioolkos avatar Feb 23 '23 17:02 ioolkos

(the error seems to be that the server closes the socket when trying to do a SSL connection)

maelp avatar Feb 23 '23 17:02 maelp

I'm using a client which handles MQTT 3,4,5 I think (hive mqtt-cli), I'm not sure what version the VerneMQ server is handling, is that configurable?

maelp avatar Feb 23 '23 17:02 maelp

I have seen this error @ioolkos , might be fixed on my currently open PR in VerneMQ but we have to double check

codeadict avatar Feb 23 '23 17:02 codeadict

@ioolkos I tried adding the variables you sent above, now I have those logs

17:26:37.432 [error] can't reconfigure mqtt listener({127,0,0,1}, 1883) with Options [{max_connections,10000},{nr_of_acceptors,10},{mountpoint,[]},{proxy_protocol,false},{allowed_protocol_versions,[3,4,5,131]},{allow_anonymous_override,false}] due to {already_started,<0.435.0>}
17:26:37.514 [error] can't reconfigure mqtts listener({127,0,0,1}, 1883) with Options [{max_connections,10000},{nr_of_acceptors,10},{mountpoint,[]},{cafile,"/etc/ssl/vernemq/ca.crt"},{depth,1},{certfile,"/etc/ssl/vernemq/tls.crt"},{eccs,[secp521r1,brainpoolP512r1,brainpoolP384r1,secp384r1,brainpoolP256r1,secp256k1,secp256r1,secp224k1,secp224r1,secp192k1,secp192r1,secp160k1,secp160r1,secp160r2]},{keyfile,"/etc/ssl/vernemq/tls.key"},{require_certificate,false},{tls_version,'tlsv1.2'},{use_identity_as_username,false},{allowed_protocol_versions,[3,4,131]},{allow_anonymous_override,false}] due to {already_started,<0.435.0>}

maelp avatar Feb 23 '23 17:02 maelp

Here's what I added

                {
                    "name": "DOCKER_VERNEMQ_LISTENER__SSL__DEFAULT",
                    "value": "0.0.0.0:8883",
                },
                {
                    "name": "DOCKER_VERNEMQ_LISTENER__TCP__DEFAULT",
                    "value": "127.0.0.1:1883",
                },

maelp avatar Feb 23 '23 17:02 maelp

But for now it seems I can't connect to VerneMQ with SSL, but perhaps I can do it using a Nginx in front, but this defeats the purpose of using VerneMQ TLS setting which can be convenient (to auto-set the user with the CommonName)

maelp avatar Feb 23 '23 17:02 maelp

I'll do some testing on the branch I've opened with these settings. What does vmq-admin listener show is telling you @maelp ?

codeadict avatar Feb 23 '23 18:02 codeadict

@maelp MQTT protocol versions can be allowed per listener.

listener.ssl.default.allowed_protocol_versions = 3,4,5

Terminating TLS in HAProxy or similar component is often done, yes. Of course, Verne can also terminate TLS without problems; so we'll find out what's happing here.


:point_right: Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq :point_right: Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

ioolkos avatar Feb 23 '23 18:02 ioolkos

@codeadict this is the result

+--------+---------+-----------+-------+------------+-----------+
| type   | status  | ip        | port  | mountpoint | max_conns |
+--------+---------+-----------+-------+------------+-----------+
| http   | running | 127.0.0.1 | 8888  |            | 10000     |
+--------+---------+-----------+-------+------------+-----------+
| http   | running | 10.4.0.8  | 8888  |            | 10000     |
+--------+---------+-----------+-------+------------+-----------+
| vmq    | running | 10.4.0.8  | 44053 |            | 10000     |
+--------+---------+-----------+-------+------------+-----------+
| mqttws | running | 10.4.0.8  | 8080  |            | 10000     |
+--------+---------+-----------+-------+------------+-----------+
| mqtt   | running | 127.0.0.1 | 1883  |            | 10000     |
+--------+---------+-----------+-------+------------+-----------+
| mqtt   | running | 10.4.0.8  | 1883  |            | 10000     |
+--------+---------+-----------+-------+------------+-----------+

maelp avatar Feb 23 '23 22:02 maelp

(weird that there's no mqtts, is that because of the error above?)

maelp avatar Feb 23 '23 22:02 maelp

also not sure why there is mqttws although I haven't enabled it?

maelp avatar Feb 23 '23 22:02 maelp

If this helps, when I set the log level to debug, I have a lot of these:

08:47:33.345 [debug] Replica meta1: Can't initialize AE exchange due to no peer available
08:47:33.429 [debug] Replica meta9: Can't initialize AE exchange due to no peer available
08:47:33.618 [debug] Replica meta4: Can't initialize AE exchange due to no peer available
08:47:34.507 [debug] Replica meta3: Can't initialize AE exchange due to no peer available
08:47:34.689 [debug] Replica meta8: Can't initialize AE exchange due to no peer available
08:47:34.755 [debug] Replica meta5: Can't initialize AE exchange due to no peer available
08:47:35.127 [debug] Replica meta6: Can't initialize AE exchange due to no peer available
08:47:35.155 [debug] Replica meta10: Can't initialize AE exchange due to no peer available
08:47:35.267 [debug] Replica meta7: Can't initialize AE exchange due to no peer available

maelp avatar Feb 24 '23 08:02 maelp

@ioolkos @codeadict when I disable "mqtt" and only enable "mqtts" in the config, I still have the "can't reconfigure mqtts" error

08:54:06.132 [error] can't reconfigure mqtts listener({127,0,0,1}, 1883) with Options [{max_connections,10000},{nr_of_acceptors,10},{mountpoint,[]},{cafile,"/etc/ssl/vernemq/ca.crt"},{depth,1},{certfile,"/etc/ssl/vernemq/tls.crt"},{eccs,[secp521r1,brainpoolP512r1,brainpoolP384r1,secp384r1,brainpoolP256r1,secp256k1,secp256r1,secp224k1,secp224r1,secp192k1,secp192r1,secp160k1,secp160r1,secp160r2]},{keyfile,"/etc/ssl/vernemq/tls.key"},{require_certificate,false},{tls_version,'tlsv1.2'},{use_identity_as_username,false},{allowed_protocol_versions,[3,4,131]},{allow_anonymous_override,false}] due to {already_started,<0.458.0>}

but then each time I try to connect with SSL I now have this (before I didn't even have a log) and the connection still closes:

09:04:43.371 [warning] session stopped abnormally due to '{cant_parse_connect_fixed_header,<<22,3,3,0,240,1,0,0,236,3,3,114,105,135,159,176,108,221,9,230,83,91,123,59,137,214,40,86,81,117,114,74,34,165,44,40,75,196,182,67,195,180,244,0,0,18,192,44,192,43,192,47,192,48,192,19,192,20,0,156,0,47,0,53,1,0,0,177,0,0,0,25,0,23,0,0,20,119,101,98,45,97,112,112,45,49,46,103,111,117,97,99,104,46,99,111,109,0,5,0,5,1,0,0,0,0,0,10,0,12,0,10,0,29,0,23,0,24,0,25,0,30,0,11,0,2,1,0,0,13,0,40,0,38,4,3,5,3,6,3,8,4,8,5,8,6,8,9,8,10,8,11,4,1,5,1,6,1,4,2,3,3,3,1,3,2,2,3,2,1,2,2,0,50,0,40,0,38,4,3,5,3,6,3,8,4,8,5,8,6,8,9,8,10,8,11,4,1,5,1,6,1,4,2,3,3,3,1,3,2,2,3,2,1,2,2,0,17,0,9,0,7,2,0,4,0,0,0,0,0,23,0,0,0,43,0,3,2,3,3,255,1,0,1,0>>}'

maelp avatar Feb 24 '23 09:02 maelp

@maelp the AE exchange logs above are unrelated and harmless (coming from the sync protocol).

There are other users having troubles with MQTTS listeners in the Docker image, apparently: https://erlangforums.com/t/latest-vernemq-not-allowing-tls-traffic-with-self-signed-ca-certificate/2394 I'll need to check that.

Have you enabled listener.ssl.default.allowed_protocol_versions = 3,4,5?

ioolkos avatar Feb 24 '23 09:02 ioolkos

@ioolkos I added this, but same error

                {
                    "name": "DOCKER_VERNEMQ_LISTENER__SSL__ALLOWED_PROTOCOL_VERSIONS",
                    "value": "3,4,5",
                },

maelp avatar Feb 24 '23 09:02 maelp

Btw, the WS listener comes from the fact that we seem to start one via start script, in case the DOCKER_VERNEMQ variable is not set. https://github.com/vernemq/docker-vernemq/blob/148c27a245d5cdbe7a62c54db9b9e8c247a873f8/bin/vernemq.sh#L214

ioolkos avatar Feb 24 '23 11:02 ioolkos

Can you check on the Pod, what the exact config injected into etc/vernemq/vernemq.conf is and post here? All values injected are at the end of the conf file between ###START### and ###END###.


:point_right: Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq :point_right: Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

ioolkos avatar Feb 24 '23 11:02 ioolkos

@ioolkos this is what I get

########## Start ##########
listener.ssl.use_identity_as_username=on
listener.ssl.allowed_protocol_versions=3,4,5
accept_eula=yes
listener.tcp.default=127.0.0.1:1883
listener.ssl.default=0.0.0.0:8883
listener.ssl.require_certificate=on
listener.ssl.keyfile=/etc/ssl/vernemq/tls.key
listener.ssl.cafile=/etc/ssl/vernemq/ca.crt
log.console=console
listener.tcp.localhost=127.0.0.1:1883
allow_anonymous=on
listener.ssl.certfile=/etc/ssl/vernemq/tls.crt
erlang.distribution.port_range.minimum = 9100
erlang.distribution.port_range.maximum = 9109
listener.ws.default = 10.4.0.13:8080
listener.vmq.clustering = 10.4.0.13:44053
listener.http.metrics = 10.4.0.13:8888
########## End ##########

maelp avatar Feb 24 '23 11:02 maelp

Can you also add: listener.ssl.default.depth = 10 to ensure we're not limited by chain length.

You could also attach to VerneMQ in the Pod by using vernemq attach. And then get info with something similar to:

rp(ranch:info({{127,0,0,1},8883})). (hit Enter)

Detach from Verne with ctrl+g, then q.

I'm currently unclear whether you manage to setup an SSL listener or not. (check with vmq-admin listener show)


:point_right: Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq :point_right: Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

ioolkos avatar Feb 24 '23 14:02 ioolkos

Adding the default depth of 10 did not change anything

for the command: if I include the final . (not sure if that was intended), I get the following error:

if I don't include it, I get an empty response

([email protected])1> rp(ranch:info({{127,0,0,1},8883})).
** exception error: bad argument
     in function  ets:lookup_element/3
        called as ets:lookup_element(ranch_server,{listener_sup,{{127,0,0,1},8883}},2)
        *** argument 2: not a key that exists in the table
     in call from ranch_server:get_listener_sup/1 (/opt/vernemq/_build/default/lib/ranch/src/ranch_server.erl, line 122)
     in call from ranch:info/1 (/opt/vernemq/_build/default/lib/ranch/src/ranch.erl, line 434)
([email protected])2> rp(ranch:info({{127,0,0,1},8883}))
([email protected])2>

maelp avatar Feb 24 '23 14:02 maelp

If I do it with "1883" instead of "8883" I get this

([email protected])2> rp(ranch:info({{127,0,0,1},1883})).
#{active_connections => 0,all_connections => 0,
  ip => {127,0,0,1},
  max_connections => 10000,
  metrics =>
      #{{conns_sup,1,accept} => 0,
        {conns_sup,1,terminate} => 0,
        {conns_sup,2,accept} => 0,
        {conns_sup,2,terminate} => 0,
        {conns_sup,3,accept} => 0,
        {conns_sup,3,terminate} => 0,
        {conns_sup,4,accept} => 0,
        {conns_sup,4,terminate} => 0,
        {conns_sup,5,accept} => 0,
        {conns_sup,5,terminate} => 0,
        {conns_sup,6,accept} => 0,
        {conns_sup,6,terminate} => 0,
        {conns_sup,7,accept} => 0,
        {conns_sup,7,terminate} => 0,
        {conns_sup,8,accept} => 0,
        {conns_sup,8,terminate} => 0,
        {conns_sup,9,accept} => 0,
        {conns_sup,9,terminate} => 0,
        {conns_sup,10,accept} => 0,
        {conns_sup,10,terminate} => 0},
  pid => <0.458.0>,port => 1883,protocol => vmq_ranch,
  protocol_options =>
      [{mountpoint,[]},
       {allowed_protocol_versions,[3,4,5,131]},
       {allow_anonymous_override,false},
       {buffer_sizes,undefined}],
  status => running,transport => ranch_tcp,
  transport_options =>
      #{max_connections => 10000,num_acceptors => 10,
        socket_opts =>
            [{ip,{127,0,0,1}},
             {port,1883},
             {nodelay,true},
             {linger,{true,0}},
             {send_timeout,30000},
             {send_timeout_close,true}]}}
ok

so I guess for some reason the mqtts handler is not launched, although I think I configured everything? is that related to the error above where it complains that it cannot reconfigure the listener (not sure why it tries to reconfigure the mqtts listener on 1883 instead of 8883?)

maelp avatar Feb 24 '23 14:02 maelp

It seems though that the 8883 port is correctly opened (and correctly proxied-through by the ingress-nginx TCP proxy in front), because when I try to make a MQTTS request on 8883 I get this log to appear on the service logs

14:29:10.285 [warning] session stopped abnormally due to '{cant_parse_connect_fixed_header,<<22,3,3,0,240,1,0,0,236,3,3,95,237,223,125,180,210,123,10,168,11,41,16,254,209,7,51,234,66,84,195,188,130,8,2,82,22,116,40,207,75,203,201,0,0,18,192,44,192,43,192,47,192,48,192,19,192,20,0,156,0,47,0,53,1,0,0,177,0,0,0,25,0,23,0,0,20,119,101,98,45,97,112,112,45,49,46,103,111,117,97,99,104,46,99,111,109,0,5,0,5,1,0,0,0,0,0,10,0,12,0,10,0,29,0,23,0,24,0,25,0,30,0,11,0,2,1,0,0,13,0,40,0,38,4,3,5,3,6,3,8,4,8,5,8,6,8,9,8,10,8,11,4,1,5,1,6,1,4,2,3,3,3,1,3,2,2,3,2,1,2,2,0,50,0,40,0,38,4,3,5,3,6,3,8,4,8,5,8,6,8,9,8,10,8,11,4,1,5,1,6,1,4,2,3,3,3,1,3,2,2,3,2,1,2,2,0,17,0,9,0,7,2,0,4,0,0,0,0,0,23,0,0,0,43,0,3,2,3,3,255,1,0,1,0>>}'

maelp avatar Feb 24 '23 14:02 maelp

and if I try to send a non-SSL MQTT message on port 8883 (to try), I don't see a log on the server, and the socket is closed immediately

maelp avatar Feb 24 '23 14:02 maelp

Yeah, it's weird. Can you add the TLS listener not with 0.0.0.0 but that actual internal IP like the other listeners? You can add as many TLS listeners on different ports as you want... that 1883 reconfigure thing should not matter here.


:point_right: Thank you for supporting VerneMQ: https://github.com/sponsors/vernemq :point_right: Using the binary VerneMQ packages commercially (.deb/.rpm/Docker) requires a paid subscription.

ioolkos avatar Feb 24 '23 14:02 ioolkos

I'm not sure how to do this, since I guess with Kubernetes each time I deploy I might be on a different node IP no?

maelp avatar Feb 24 '23 14:02 maelp

for the "1883 reconfigure", the thing which surprises me is that it says both "mqtts" and "1883". Shouldn't it be "8883" that it tries to reconfigure for mqtts?

maelp avatar Feb 24 '23 14:02 maelp

The bug with the default listener should be fixed in my branch https://github.com/vernemq/vernemq/pull/2078, if you can build Docker pointing to that it would be great. Although I believe there is still some bug that might have been caused by our latest ranch upgrade, I'm still researching the changes in ranch. But this is what I see:

^C%
codeadict@andromeda vernemq % _build/default/rel/vernemq/bin/vmq-admin listener stop address="127.0.0.1" port=8883
Done
codeadict@andromeda vernemq % _build/default/rel/vernemq/bin/vmq-admin listener show
+-------+---------+-----------+-------+------------+-----------+
| type  | status  | address   | port  | mountpoint | max_conns |
+-------+---------+-----------+-------+------------+-----------+
| http  | running | 127.0.0.1 | 8888  |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
| vmq   | running | 0.0.0.0   | 44053 |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
| mqtts | stopped | 127.0.0.1 | 8883  |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
| mqtt  | running | 127.0.0.1 | 1883  |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
codeadict@andromeda vernemq % _build/default/rel/vernemq/bin/vmq-admin listener start address="127.0.0.1" port=8883
can't start listener due to '{already_started,<0.20463.0>}'
codeadict@andromeda vernemq % _build/default/rel/vernemq/bin/vmq-admin listener show
+-------+---------+-----------+-------+------------+-----------+
| type  | status  | address   | port  | mountpoint | max_conns |
+-------+---------+-----------+-------+------------+-----------+
| http  | running | 127.0.0.1 | 8888  |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
| vmq   | running | 0.0.0.0   | 44053 |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
| mqtts | stopped | 127.0.0.1 | 8883  |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+
| mqtt  | running | 127.0.0.1 | 1883  |            | 10000     |
+-------+---------+-----------+-------+------------+-----------+

Notice when I try to start the listener again, it believes it's already started and remains stopped.

codeadict avatar Feb 24 '23 15:02 codeadict