chirpstack-gateway-bridge
chirpstack-gateway-bridge copied to clipboard
No reconnect if the initial connection fails because certificate has expired (v3.13.3)
- [x] The issue is present in the latest release. (in 3.13.3 but there are no related changes in changelog in 3.14)
- [x] I have searched the issues of this repository and believe that this is not a duplicate.
Steps to reproduce this issue
Steps:
- Set the time of the gateway to the past (e.g. 2012)
- Confgure the CSGB >=v3.13.3 to connect a MQTT broker with TLS enabled
- Start the CSGB, the connection will fail reporting the certificate is not yet valid
- Wait for some more time
What do you expect?
I expect the CSGB continuing to reconnect to the server indefinitely.
What happens instead?
The CSGB does not try to reconnect anymore. At any uplink or stat report it will say that it is "not connected" but does not try to reconnect either.
Could you share your log output?
time="2012-01-01T00:11:51.633064Z" level=error msg="[client] x509: certificate has expired or is not yet valid: current time 2012-01-01T00:11:50Z is before 2022-06-24T09:21:28Z" module=mqtt
time="2012-01-01T00:11:51.65220232Z" level=warning msg="[client] failed to connect to broker, trying next" module=mqtt
time="2012-01-01T00:11:51.6676576Z" level=error msg="[client] Failed to connect to a broker" module=mqtt
time="2012-01-01T00:12:25.58281776Z" level=error msg="publish event error" error="not Connected" event_type=up gateway_id=fcc23dfffe0b6866 uplink_id=7847c4fb-32fb-4645-9f11-231817cce43f
time="2012-01-01T00:12:28.03803664Z" level=error msg="publish event error" error="not Connected" event_type=stats gateway_id=fcc23dfffe0b6866 stats_id=634a6f44-270b-4a64-8075-3a750f05b7c3
time="2012-01-01T00:12:58.00843728Z" level=error msg="publish event error" error="not Connected" event_type=stats gateway_id=fcc23dfffe0b6866 stats_id=949b1970-0eae-4435-9f5c-40fa9929dc72
time="2012-01-01T00:13:28.00825152Z" level=error msg="publish event error" error="not Connected" event_type=stats gateway_id=fcc23dfffe0b6866 stats_id=7d867425-98df-4ab1-870f-7df7c8b25a7e
time="2012-01-01T00:13:58.01437832Z" level=error msg="publish event error" error="not Connected" event_type=stats gateway_id=fcc23dfffe0b6866 stats_id=851c4b30-1eaf-426f-adff-344a127cfa55
As a workaround, I've tried to set terminate_on_connect_error=true to make the process manager restart the CSGB, but the CSBG does never end, it doen't seem aware it's not connected.
Thanks for reporting. In the latest release I have updated the Paho MQTT client to the latest version as there have been a few fixes. I will test if I can reproduce this issue against the latest version.
Seeing this same issue with CSGB connecting to Azure. Would expect a reconnect or a failure which we could terminate the process with and let monit restart it.
I'm now looking into this.
I have tried the client-certificate validation part (validated by the MQTT broker), which does attempt to re-connect.
ERRO[0052] [client] Failed to connect to a broker module=mqtt
ERRO[0052] integration/mqtt: connection error error="network Error : remote error: tls: bad certificate"
ERRO[0054] [net] connect got error remote error: tls: bad certificate module=mqtt
ERRO[0054] [client] Connecting to ssl://localhost:8883 CONNACK was not CONN_ACCEPTED, but rather Connection Error module=mqtt
ERRO[0054] [client] Failed to connect to a broker module=mqtt
ERRO[0054] integration/mqtt: connection error error="network Error : remote error: tls: bad certificate"
ERRO[0056] [net] connect got error remote error: tls: bad certificate module=mqtt
ERRO[0056] [client] Connecting to ssl://localhost:8883 CONNACK was not CONN_ACCEPTED, but rather Connection Error module=mqtt
ERRO[0056] [client] Failed to connect to a broker module=mqtt
ERRO[0056] integration/mqtt: connection error error="network Error : remote error: tls: bad certificate"
ERRO[0059] [net] connect got error remote error: tls: bad certificate module=mqtt
ERRO[0059] [client] Connecting to ssl://localhost:8883 CONNACK was not CONN_ACCEPTED, but rather Connection Error module=mqtt
ERRO[0059] [client] Failed to connect to a broker module=mqtt
ERRO[0059] integration/mqtt: connection error error="network Error : remote error: tls: bad certificate"
I will now try with the server-certificate validation.
I created a MQTT server-certificate with valid-from in the future and am now seeing the same errors:
ERRO[0099] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:08+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0099] [client] failed to connect to broker, trying next module=mqtt
ERRO[0099] [client] Failed to connect to a broker module=mqtt
ERRO[0099] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:08+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0101] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:10+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0101] [client] failed to connect to broker, trying next module=mqtt
ERRO[0101] [client] Failed to connect to a broker module=mqtt
ERRO[0101] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:10+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0103] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:12+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0103] [client] failed to connect to broker, trying next module=mqtt
ERRO[0103] [client] Failed to connect to a broker module=mqtt
ERRO[0103] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:12+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0105] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:14+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0105] [client] failed to connect to broker, trying next module=mqtt
ERRO[0105] [client] Failed to connect to a broker module=mqtt
ERRO[0105] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:14+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0107] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:16+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0107] [client] failed to connect to broker, trying next module=mqtt
ERRO[0107] [client] Failed to connect to a broker module=mqtt
ERRO[0107] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:16+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0109] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:18+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0109] [client] failed to connect to broker, trying next module=mqtt
ERRO[0109] [client] Failed to connect to a broker module=mqtt
ERRO[0109] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:18+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0111] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:20+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0111] [client] failed to connect to broker, trying next module=mqtt
ERRO[0111] [client] Failed to connect to a broker module=mqtt
ERRO[0111] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:20+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0113] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:22+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0113] [client] failed to connect to broker, trying next module=mqtt
ERRO[0113] [client] Failed to connect to a broker module=mqtt
ERRO[0113] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:22+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0115] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:24+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0115] [client] failed to connect to broker, trying next module=mqtt
ERRO[0115] [client] Failed to connect to a broker module=mqtt
ERRO[0115] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:24+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0117] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:26+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0117] [client] failed to connect to broker, trying next module=mqtt
ERRO[0117] [client] Failed to connect to a broker module=mqtt
ERRO[0117] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:26+01:00 is before 2022-08-16T23:55:00Z"
ERRO[0119] [client] x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:28+01:00 is before 2022-08-16T23:55:00Z module=mqtt
WARN[0119] [client] failed to connect to broker, trying next module=mqtt
ERRO[0119] [client] Failed to connect to a broker module=mqtt
ERRO[0119] integration/mqtt: connection error error="network Error : x509: certificate has expired or is not yet valid: current time 2022-08-16T12:06:28+01:00 is before 2022-08-16T23:55:00Z"
However, it keeps looping / re-connecting. I've waited for 2 minutes. Tested against the v3.13.3 version. Could you post your full configuration file?
I have found a way to reproduce this issue. Now looking into it...
I have been able to reproduce this using the ChirpStack Gateway OS and by setting its time a few years back history (and disabling NTP). The weird thing is that I can only reproduce this on reboot. I changed the connect
method to:
func (b *Backend) connect() error {
log.Warn("connect begin")
b.connMux.Lock()
defer b.connMux.Unlock()
log.Warn("connect got mutex")
if err := b.auth.Update(b.clientOpts); err != nil {
return errors.Wrap(err, "integration/mqtt: update authentication error")
}
b.conn = paho.NewClient(b.clientOpts)
if token := b.conn.Connect(); token.WaitTimeout(b.maxTokenWait) && token.Error() != nil {
return token.Error()
}
log.Warn("connect end")
return nil
}
And connectLoop
to:
// connectLoop blocks until the client is connected
func (b *Backend) connectLoop() {
for {
log.Warn("Start connect")
if err := b.connect(); err != nil {
if b.terminateOnConnectError {
log.Fatal(err)
}
log.WithError(err).Error("integration/mqtt: connection error")
time.Sleep(time.Second * 2)
} else {
log.Warn("Connect ok")
break
}
}
}
note the log.Warn(...)
messages.
Reboot
On reboot, I see:
Dec 25 13:08:05 raspberrypi4 user.warn chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:05.819251274Z" level=warning msg="Start connect"
Dec 25 13:08:05 ras pberrypi4 user.warn chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:05.819473163Z" level=warning msg="connect begin"
Dec 25 13:08:05 raspberrypi4 user.warn chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:05.819617052Z" level=warning msg="connect got mutex"
Dec 25 13:08:05 raspberrypi4 user.debug chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:05.819794866Z" level=debug msg="[client] Connect()" module=mqtt
Dec 25 13:08:05 raspberrypi4 user.debug chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:05.819954329Z" level=debug msg="[store] memorystore initialized" module=mqtt
Dec 25 13:08:05 raspberrypi4 user.debug chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:05.82066007Z" level=debug msg="[client] about to write new connect msg" module=mqtt
Dec 25 13:08:06 raspberrypi4 user.warn chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:06.820373199Z" level=warning msg="connect end"
Dec 25 13:08:06 raspberrypi4 user.warn chirpstack-gateway-bridge[430]: time="2014-12-25T13:08:06.820726625Z" level=warning msg="Connect ok"
Restart
On restart (of the chirpstack-gateway-bridge
process):
Dec 25 13:10:27 raspberrypi4 user.warn chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.471678338Z" level=warning msg="Start connect"
Dec 25 13:10:27 raspberrypi4 user.warn chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.472124189Z" level=warning msg="connect begin"
Dec 25 13:10:27 raspberrypi4 user.warn chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.472261838Z" level=warning msg="connect got mutex"
Dec 25 13:10:27 raspberrypi4 user.debug chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.472429967Z" level=debug msg="[client] Connect()" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.debug chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.472610208Z" level=debug msg="[store] memorystore initialized" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.debug chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.472982189Z" level=debug msg="[client] about to write new connect msg" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.err chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.586977337Z" level=error msg="[client] x509: certificate has expired or is not yet valid: current time 2014-12-25T13:10:27Z is before 2021-10-05T13:52:00Z" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.warn chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.587226134Z" level=warning msg="[client] failed to connect to broker, trying next" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.err chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.587405671Z" level=error msg="[client] Failed to connect to a broker" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.debug chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.58756293Z" level=debug msg="[store] memorystore closed" module=mqtt
Dec 25 13:10:27 raspberrypi4 user.err chirpstack-gateway-bridge[610]: time="2014-12-25T13:10:27.587799245Z" level=error msg="integration/mqtt: connection error" error="network Error : x509: certificate has expired or is not yet valid: current time 2014-12-25T13:10:27Z is b
In summary, on reboot the connect
returns no error and the client thinks it is connected (and the connectLoop
terminates), but reports the error afterwards. Then after a restart of the chirpstack-gateway-bridge
service, the connect
does return an error and the connectLoop
tries to re-connect.
Edit: I did disable the chirpstack-concentratord
to rule out the possibility that this issue has to do with a possible boot order (e.g. concentratord not yet started when the GW bridge is started on boot).
:facepalm: token.WaitTimeout
returns false
in case of timeout, I think I made the assumption that after the timeout token.Error()
would contain a timeout error, which is not the case.