
nsqd: issues running as a NSSM service or scheduled task on windows

Open elvarb opened this issue 7 years ago • 12 comments

Running NSQd 0.3.8 in a user session on Windows works perfectly but running it any other way does not work.

Tested for Scheduled Tasks

  • Pointing directly to nsqd.exe
  • Pointing directly to a bat file that starts nsqd.exe
  • Made my own golang program that does os.exec start on nsqd.exe
  • Made my own golang program that starts a bat file that starts nsqd.exe
  • Made a powershell script that uses Start-Process for nsqd.exe, and for the bat file
  • Have a C# program start nsqd.exe

I can see in Task Manager that the process starts for a second and then goes away. The result in the event log for the task is always "with return code 2147942401.". Google says this could be a permission problem, but that does not seem to be the case: I tried running it as SYSTEM, NETWORK SERVICE, LOCAL SERVICE and a specific user. All of this works if the scheduled task is set to "Run only when user is logged on"; a console window then pops up with the stdout information. Nothing works when it's set to "Run whether user is logged on or not". In my golang app, bat file and powershell script I tried to capture the stdout from NSQd, but nothing was ever captured.
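For readers trying to interpret that return code: 2147942401 is 0x80070001 in hexadecimal, which has the shape of a Windows HRESULT. The sketch below is not from the thread; it only illustrates splitting such a value into its failure bit, facility and code fields. The type and function names are made up for the example.

```go
package main

import "fmt"

// HRESULT holds the fields packed into a 32-bit Windows HRESULT value.
// The type name is illustrative, not part of any library.
type HRESULT struct {
	Failure  bool   // bit 31: set when the value reports a failure
	Facility uint32 // bits 16-26: the subsystem that produced the code
	Code     uint32 // bits 0-15: the subsystem-specific error code
}

// decodeHRESULT splits a raw 32-bit value into its fields.
func decodeHRESULT(raw uint32) HRESULT {
	return HRESULT{
		Failure:  raw&0x80000000 != 0,
		Facility: (raw >> 16) & 0x7FF,
		Code:     raw & 0xFFFF,
	}
}

func main() {
	const taskResult = 2147942401 // value reported in the task's event log entry
	hr := decodeHRESULT(taskResult)
	fmt.Printf("0x%08X failure=%t facility=%d code=%d\n",
		uint32(taskResult), hr.Failure, hr.Facility, hr.Code)
}
```

Facility 7 is FACILITY_WIN32 and Win32 error code 1 is ERROR_INVALID_FUNCTION ("Incorrect function"). That is a generic failure, so the code on its own does not point to a permission problem.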

Tested for Service

  • Have NSSM configured to point to nsqd.exe
  • Have NSSM configured to point to my golang program
  • Have NSSM configured to point to a bat file

In the NSSM log file it logs: "2016/11/28 11:10:30 The service process could not connect to the service controller."

elvarb avatar Nov 28 '16 11:11 elvarb

I'm having a similar issue with the latest version of Telegraf: it does not work with NSSM. But Telegraf is using a native golang service handler that works.

In the Telegraf github issues

  • https://github.com/influxdata/telegraf/issues/860
  • https://github.com/influxdata/telegraf/issues/1760

There is a reference to the Kardianos issue

  • https://github.com/kardianos/service/issues/72

They both talk about sc.exe working and I just tested it and it does indeed work!

From the NSQ release history it seems that service handling was added in version 0.3.7, and Telegraf got its own service manager handling added as well. So once an application can do its own service handling, it stops working in all the cases described above.

elvarb avatar Nov 28 '16 12:11 elvarb

cc @judwhite

ploxiln avatar Nov 28 '16 18:11 ploxiln

@elvarb interesting, I'll try to repro your results wrt a scheduled task and other ways of launching the binary.

judwhite avatar Nov 30 '16 05:11 judwhite

I am seeing this with NSQ 0.3.8-go1.6.2.

Tried running as a service with the NSSM and WinSW service wrappers. No output is captured from the process, which exits immediately after launch.

The only error seen is

"2016/12/02 14:11:27 The service process could not connect to the service controller."

with both nssm and winsw.

I tried

  • running nsqd.exe directly with nssm and winsw, with and without args
  • running a batch script that starts nsqd.exe with nssm and winsw

esiqveland avatar Dec 02 '16 13:12 esiqveland

@esiqveland Can you try using sc.exe?

sc create nsqlookupd binpath= "c:\nsq\nsqlookupd.exe" start= auto DisplayName= "nsqlookupd"
sc description nsqlookupd "nsqlookupd"
sc start nsqlookupd

sc create nsqd binpath= "c:\nsq\nsqd.exe -mem-queue-size=0 -lookupd-tcp-address=127.0.0.1:4160 -data-path=c:\nsq\data" start= auto DisplayName= "nsqd"
sc description nsqd "nsqd"
sc start nsqd

judwhite avatar Dec 05 '16 07:12 judwhite

Thank you @judwhite!

It seems to be working with sc.exe, but I couldn't find a way to access any logs of the output from the process.

esiqveland avatar Dec 05 '16 08:12 esiqveland

@esiqveland I'll work on that later this week

judwhite avatar Dec 05 '16 10:12 judwhite

Interestingly, nsq_to_file.exe works with NSSM and WinSW, but not with sc.exe.

esiqveland avatar Dec 07 '16 07:12 esiqveland

@esiqveland nsq_to_file doesn't have native Windows Service support. Only nsqd and nsqlookupd do at this point.

judwhite avatar Dec 07 '16 08:12 judwhite

In https://github.com/nsqio/nsq/issues/853#issuecomment-280900898 judwhite says "If people prefer to use NSSM there are ways to support that without removing native support" (for nsqd/nsqlookupd running directly as windows services).

I think that would be nice, particularly since other nsq binaries including nsq_to_nsq and nsq_to_file can only run under NSSM / WinSW (so currently, different nsq binaries - that could be considered services - can't all be run in any single way on windows).

ploxiln avatar Feb 22 '17 21:02 ploxiln

The native service support is actually quite nice, but it should be mentioned somewhere that it won't work with other service wrappers because of this. The logs are an issue though, as you don't get any std* output from these native services.

esiqveland avatar Feb 23 '17 21:02 esiqveland

@esiqveland We'll tackle the logs for nsqd and nsqlookupd when running as a Windows Service as part of #853. My experience has been people would prefer to use native support and use NSSM only when native support isn't available. Both scenarios can be supported, and that might be a better short term solution since logging may be a longer effort.

@ploxiln You're right about nsq_to_nsq and nsq_to_file. I'll roll support for running as a Windows Service into those, and make sure they still work with NSSM, along with nsqd and nsqlookupd. The only change to NSQ (other than updating nsq_to_nsq and nsq_to_file) will be to update go-svc once https://github.com/judwhite/go-svc/issues/6 lands.

judwhite avatar Feb 25 '17 22:02 judwhite