[Bug]: Loss of network connection stops the service and never restarts it (Windows)

Open si458 opened this issue 2 years ago • 5 comments

Contact Details

No response

What happened?

For some strange reason, whenever my Server 2012 R2 machine loses its connection to the MQTT broker for even a few seconds, the software issues new certificates and tries to restart the Windows service, but never actually restarts it.

Also, the Windows service has no recovery actions configured: they are all set to 'Take No Action' instead of 'Restart the Service'.
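
As a possible workaround for the missing recovery actions, they can be set from an elevated prompt with `sc.exe failure`. The sketch below only builds that command and prints it; it runs it when called with `--apply` on Windows. The service name `netclient`, the 60-second restart delay, and the one-day reset period are assumptions, so check the actual service name first. Recovery actions generally apply when the service process ends unexpectedly, so they may not cover the clean stop shown in the log.

```python
import subprocess
import sys


def build_recovery_command(service_name, restart_delay_ms=60_000, reset_period_s=86_400):
    """Return the sc.exe argument list that sets all three failure actions to 'restart'."""
    # Three actions (first, second, subsequent failures), each "restart/<delay in ms>".
    actions = "/".join(f"restart/{restart_delay_ms}" for _ in range(3))
    # sc.exe expects "reset=" and "actions=" as their own tokens, followed by the value.
    return ["sc.exe", "failure", service_name,
            "reset=", str(reset_period_s),
            "actions=", actions]


if __name__ == "__main__":
    cmd = build_recovery_command("netclient")  # assumed service name
    print(" ".join(cmd))
    # Only touch the service configuration when explicitly asked to, and only on Windows.
    if sys.platform == "win32" and "--apply" in sys.argv:
        subprocess.run(cmd, check=True)
```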

Version

v0.14.1

What OS are you using?

Windows

Relevant log output

[netclient.exe] 2022-05-30 16:50:17 received peer update for node Win2012R2 myNetwork 
[netclient.exe] 2022-05-30 16:51:00 received peer update for node Win2012R2 myNetwork 
[netclient.exe] 2022-05-30 16:51:01 error updating /etc/hosts open c:\windows\system32\drivers\etc\hosts: The process cannot access the file because it is being used by another process. 
[netclient.exe] 2022-05-30 16:51:07 checkin for myNetwork complete 
[netclient.exe] 2022-05-30 16:51:33 received peer update for node Win2012R2 myNetwork 
[netclient.exe] 2022-05-30 16:52:30 checkin for myNetwork complete 
[netclient.exe] 2022-05-30 16:53:30 checkin for myNetwork complete 
[netclient.exe] 2022-05-30 16:54:30 checkin for myNetwork complete 
[netclient.exe] 2022-05-30 16:55:18 received peer update for node Win2012R2 myNetwork 
[netclient.exe] 2022-05-30 16:56:00 unable to connect to broker, retrying ... 
Ping tcp://broker.nm.mydomain.com:8883(1.2.3.4:8883) - Connected - time=10.9991ms
Ping tcp://broker.nm.mydomain.com:8883(1.2.3.4:8883) - Connected - time=9.9961ms
Ping tcp://broker.nm.mydomain.com:8883(1.2.3.4:8883) - Connected - time=11.0499ms
[netclient.exe] 2022-05-30 16:56:04 could not connect to broker broker.nm.mydomain.com connect timeout 
[netclient.exe] 2022-05-30 16:56:04 connection issue detected.. attempt connection with new certs 
[netclient.exe] 2022-05-30 16:56:04 register at https://api.nm.mydomain.com:443/api/server/register 
[netclient.exe] 2022-05-30 16:56:05 certificates/key saved  
[netclient.exe] 2022-05-30 16:56:06 running stop of Windows Netclient daemon 
[netclient.exe] 2022-05-30 16:56:07 shutting down netclient daemon 
[netclient.exe] 2022-05-30 16:56:07 shutting down daemon for server  broker.nm.mefoo.com 
[netclient.exe] 2022-05-30 16:56:07 successfully ran stop of Windows Netclient daemon 
[netclient.exe] 2022-05-30 16:56:07 running start of Windows Netclient daemon 
[netclient.exe] 2022-05-30 16:56:08 successfully ran start of Windows Netclient daemon 
[netclient.exe] 2022-05-30 16:56:08 checkin for myNetwork complete 
[netclient.exe] 2022-05-30 16:56:08 checkin routine closed 
[netclient.exe] 2022-05-30 16:56:08 shutdown complete
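
To put a number on "a few seconds", the timestamps in the log above can be compared directly. The sketch below is a small helper written for this issue (not part of netclient) that parses the `[netclient.exe] <date> <time> <message>` lines and measures the gap between two messages; the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches lines such as "[netclient.exe] 2022-05-30 16:56:00 some message".
LINE_RE = re.compile(r"^\[netclient\.exe\] (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def parse(lines):
    """Return (timestamp, message) pairs, skipping lines without a timestamp (e.g. Ping output)."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, match.group(2).strip()))
    return events


def seconds_between(events, first_marker, second_marker):
    """Seconds from the first message containing first_marker to the next one containing second_marker."""
    start = next(ts for ts, msg in events if first_marker in msg)
    end = next(ts for ts, msg in events if ts >= start and second_marker in msg)
    return (end - start).total_seconds()


SAMPLE = [
    "[netclient.exe] 2022-05-30 16:56:00 unable to connect to broker, retrying ... ",
    "Ping tcp://broker.nm.mydomain.com:8883(1.2.3.4:8883) - Connected - time=10.9991ms",
    "[netclient.exe] 2022-05-30 16:56:04 could not connect to broker broker.nm.mydomain.com connect timeout ",
    "[netclient.exe] 2022-05-30 16:56:04 connection issue detected.. attempt connection with new certs ",
    "[netclient.exe] 2022-05-30 16:56:08 shutdown complete",
]

events = parse(SAMPLE)
gap = seconds_between(events, "unable to connect to broker", "attempt connection with new certs")
print(f"retry to new-cert attempt: {gap:.0f}s")
```

For this log the gap is 4 seconds, and the final "shutdown complete" follows 8 seconds after the first failed connection.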

Contributing guidelines

  • [X] Yes, I did.

si458 • Jun 01 '22 11:06

Is this perhaps an issue with 2012 compatibility? That's a rather old system. We only test on Windows 10/11.

Also, new certificates are only issued if both the connection and the retry fail, which should normally take about a minute at least. If it is happening after a few seconds of missed connections, that's a bigger issue.
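
For illustration, the behaviour described here (only fall back to new certificates once failures have persisted for about a minute) could be sketched as below. This is a hypothetical model, not netclient's actual code: the `BrokerHealth` class, its method names, and the 60-second grace period are assumptions made for the example.

```python
import time


class BrokerHealth:
    """Tracks how long broker connection failures have persisted."""

    def __init__(self, grace_seconds=60.0, clock=time.monotonic):
        self.grace_seconds = grace_seconds  # assumed "about a minute"
        self.clock = clock                  # injectable so the logic can be tested
        self.first_failure = None

    def record_success(self):
        # Any successful connection resets the failure window.
        self.first_failure = None

    def record_failure(self):
        """Return True only once failures have lasted for the whole grace period."""
        now = self.clock()
        if self.first_failure is None:
            self.first_failure = now
        return now - self.first_failure >= self.grace_seconds


if __name__ == "__main__":
    fake_now = [0.0]
    health = BrokerHealth(clock=lambda: fake_now[0])
    for second in (0.0, 4.0, 61.0):
        fake_now[0] = second
        print(second, "-> request new certs:", health.record_failure())
```

Under this model, a failure 4 seconds after the first one would not yet trigger new certificates, whereas the log in this issue shows them being requested at that point.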

afeiszli • Jun 01 '22 15:06

> Is this perhaps an issue with 2012 compatibility? That's a rather old system. We only test on Windows 10/11.
>
> Also, new certificates are only issued if both the connection and the retry fail, which should normally take about a minute at least. If it is happening after a few seconds of missed connections, that's a bigger issue.

It might very well be a bigger issue, because I have this exact issue on two systems, both running Server 2012 R2.

Also, yes, Server 2012 R2 is old, but it's still supported by Microsoft until the end of next year: https://docs.microsoft.com/en-us/lifecycle/products/windows-server-2012-r2

si458 • Jun 01 '22 16:06

I don't think we're going to be spending much time on 2012 compatibility; it's a bit out of scope for us. Have you tested on a newer system and seen similar issues?

afeiszli • Jun 01 '22 17:06

> I don't think we're going to be spending much time on 2012 compatibility; it's a bit out of scope for us. Have you tested on a newer system and seen similar issues?

I'm running the latest version on Server 2016, Ubuntu, and Windows 10 machines, and none of them have any issues. It's just the two Server 2012 R2 machines that are having this weird issue.

si458 • Jun 01 '22 17:06

I'm not sure whether this is because of the update to 0.14.2 or because I changed my network name, but so far, 24 hours later, the servers haven't disconnected once!

si458 • Jun 05 '22 07:06

Closing, since the issue was resolved.

afeiszli • Sep 22 '22 11:09