gitea icon indicating copy to clipboard operation
gitea copied to clipboard

The status of gitea 1.17 service is always "starting" in windows

Open LJFan opened this issue 2 years ago • 4 comments

Description

I create a service name gitea following the instruction in Windows Service. However, when I start the gitea service, the service is always in the 'starting' status The status is 'running' actually because gitea behaves normally.

Gitea Version

1.17.0

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

git version 2.36.1.windows.1

Operating System

Windows Server 2022 Standard

How are you running Gitea?

I use the newest (1.17.0) prebuild version of gitea gitea-1.17.0-gogit-windows-4.0-amd64.exe.

Database

MSSQL

LJFan avatar Aug 02 '22 01:08 LJFan

Logs if you think there is a bug. https://docs.gitea.io/en-us/seek-help/

Or, do you mean that you think that the status must be "running" instead of "starting"?

wxiaoguang avatar Aug 02 '22 03:08 wxiaoguang

I have just upgraded to 1.17 windows go-git as well (windows 10), and have the same issue. The windows service is stuck on starting and never progresses to running, leaving it where we can neither stop, start or restart it. However the app is running and responsive.

log snippet from 1.16.9 prior to upgrade

2022/08/02 15:17:47 ...les/queue/manager.go:145:Remove() [T] Queue Manager removed: QID: 15
2022/08/02 15:17:47 cmd/web.go:208:listen() [I] Listen: http://0.0.0.0:3000
2022/08/02 15:17:47 cmd/web.go:212:listen() [I] AppURL(ROOT_URL): http://localhost:3000/
2022/08/02 15:17:47 cmd/web.go:215:listen() [I] LFS server enabled
2022/08/02 15:17:47 ...s/graceful/server.go:61:NewServer() [I] Starting new Web server: tcp:0.0.0.0:3000 on PID: 4284
2022/08/02 15:17:47 ...s/graceful/server.go:74:func1() [D] Starting server on tcp:0.0.0.0:3000 (PID: 4284)
2022/08/02 15:17:47 ...l/manager_windows.go:119:Execute() [T] Sending Running state to SVC
2022/08/02 15:17:47 ...l/manager_windows.go:127:Execute() [T] Started
2022/08/02 15:17:57 ...l/manager_windows.go:146:Execute() [T] SVC sent interrogate
2022/08/02 15:17:57 ...l/manager_windows.go:146:Execute() [T] SVC sent interrogate
2022/08/02 15:17:57 ...ource/manager_run.go:40:Run() [T] Event source has no listeners
2022/08/02 15:17:57 ...ource/manager_run.go:50:Run() [T] Pausing the eventsource

and log snippet after upgrade to 1.17

2022/08/02 15:39:03 ...les/queue/manager.go:159:Remove() [T] [62e89c57-6] Queue Manager removed: QID: 15
2022/08/02 15:39:03 cmd/web.go:217:listen() [I] [62e89c57-29] Listen: http://0.0.0.0:3000
2022/08/02 15:39:03 cmd/web.go:221:listen() [I] [62e89c57-29] AppURL(ROOT_URL): http://localhost:3000/
2022/08/02 15:39:03 cmd/web.go:224:listen() [I] [62e89c57-29] LFS server enabled
2022/08/02 15:39:03 ...s/graceful/server.go:61:NewServer() [I] [62e89c57-29] Starting new Web server: tcp:0.0.0.0:3000 on PID: 3460
2022/08/02 15:39:03 ...s/graceful/server.go:74:func1() [D] [62e89c57-29] Starting server on tcp:0.0.0.0:3000 (PID: 3460)
2022/08/02 15:39:13 ...ource/manager_run.go:47:Run() [T] [62e89c57-21] Event source has no listeners
2022/08/02 15:39:13 ...ource/manager_run.go:57:Run() [T] [62e89c57-21] Pausing the eventsource

snoopydo avatar Aug 02 '22 03:08 snoopydo

~~Not sure whehter gitea.exe is using Windows Service API.~~

~~If not, then the problem is that gitea.exe is not a real Windows service program. So the service created by sc.exe can just run, but can not report its status or be controlled. To be compatible with Windows Service system, a service program must use Windows Service API to handle status switch and status report.~~


After reading the code, I think that the gitea.exe is using Windows Service API, but it might be broken recently 😂

wxiaoguang avatar Aug 02 '22 03:08 wxiaoguang

Me too, I just upgraded from 1.16 to 1.17 windows go-git, and I have the same issue. Gitea service is stuck on starting and never progresses to running, although it seems does not matter, but I can not close or shutdown app, except I kill the progress manully in windows task manager.

MartinLutherSu avatar Aug 02 '22 09:08 MartinLutherSu

None of this code has changed on behalf of Gitea and it is doing precisely the same thing that it has always done - something must have changed in our dependent upstream code.

zeripath avatar Aug 17 '22 21:08 zeripath

:( so probably not resolved for 1.17.1

eeyrjmr avatar Aug 18 '22 18:08 eeyrjmr

IsAnInteractiveSession() is depreciated, unsure of this is related

eeyrjmr avatar Aug 18 '22 19:08 eeyrjmr

yes but the problem is that IsWindowsService has a different permissions profile and won't work for most users. see #15454

Now it's possible that something has changed.

Fundamentally though I don't run Windows so I can't debug this. You will need to compile and test versions to try to figure this out.

zeripath avatar Aug 18 '22 19:08 zeripath

OK, so looking at your log fragment I am not seeing:

[T] Not running a service ... using the debug SVC manager

So we should just examine this code a bit more clearly:

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L89-L131

We know we get to:

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L100

because we don't see:

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L92

So this leads to Question 1

1. Is IsInteractiveSession blocking or not?

Let's assume for the moment that it doesn't, we don't see

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L106

So if we're not blocked at 100 svc.Run(...) should call the our Execute function. Leading to Question 2

2. Is Execute ever run

Assuming Execute is run then we should see:

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L122

so that would mean we were blocked at:

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L116-L120


Now we're lacking in Trace statements here but we could easily check this with pprof.

So I think it would be very helpful to give us the cpuprofile from pprof. So please enable pprof and give us the output of cpuprofile.

In app.ini add:

[server]
ENABLE_PPROF=true

Start Gitea and then run:

curl -o cpuprofile.out http://localhost:6060/debug/pprof/profile?seconds=15

Paste the outputed cpuprofile.out up here. (You may need to zip it.)

If you want to interrogate this yourself, this command will open a webbrowser to examine it:

go tool pprof --http=:6061 http://localhost:6060/debug/pprof/profile?seconds=15

zeripath avatar Aug 18 '22 19:08 zeripath

https://github.com/go-gitea/gitea/blob/3d12018a74856313430dd8635d94c50d891f4528/modules/graceful/manager_windows.go#L116-L120

I am probably wrong... But that logic looks backwards

The "else" clause is executed if setting.StartupTimeout is less than or equal to zero, yet the waitHint argument uses this value.

The svc.status calls look the wrong way round. Probably not related to this, unless a subtle change in golang-1.18 (used for gitea-1.17?) has a change in behaviour

eeyrjmr avatar Aug 18 '22 20:08 eeyrjmr

yeah that does look like a mistake but ... it's been like that for a long time - so I don't think it can be the cause necessarily.

What do you mean about the status calls being the wrong way round?


The PPROF profile I asked for would be the best thing to give us to work out where and if things are blocked.

zeripath avatar Aug 18 '22 20:08 zeripath

Just that the if/else calls look the wrong way round.

I'll get pprof data tomorrow evening. I am predominantly a Linux user but do baby sit a windows machine that flagged this

eeyrjmr avatar Aug 18 '22 20:08 eeyrjmr

Just that the if/else calls look the wrong way round.

Yeah, it's really weird that has sneaked through and that I've made such a weird mistake. I just don't understand how it could have been there for so long.

Feel free to send a PR to switch it around but it would probably be better to do it at the same time as we figure out what is causing the problem.

zeripath avatar Aug 18 '22 21:08 zeripath

I've attached my output from curl -o cpuprofile.out http://localhost:6060/debug/pprof/profile?seconds=15 along with the gitea.log

gitea 1.17 on windows 10 as a service

log.zip

snoopydo avatar Aug 18 '22 21:08 snoopydo

I have attached my logs. I also upgraded to gitea-1.17.1

eeyrjmr_logs.zip

eeyrjmr avatar Aug 19 '22 22:08 eeyrjmr

Wait, the logs provided have so far been set to INFO but to support the postulation made in https://github.com/go-gitea/gitea/issues/20609#issuecomment-1219886613 they need to be set to TRACE.

eeyrjmr avatar Aug 20 '22 10:08 eeyrjmr

I have attached my logs. I also upgraded to gitea-1.17.1

eeyrjmr_logs.zip

ah sorry it appears I need the goroutine profile...

curl -o goroutine.out http://localhost:6060/debug/pprof/goroutine?debug=2

That should be a text readable file.

If you are running 1.18/dev you could simply go to http(s)://GITEA/admin/monitor/stacktrace and then look for goroutine stacktrace that has windows/svc/service.go:Run() its stacktrace.

zeripath avatar Aug 20 '22 10:08 zeripath

Yup I need trace logs for this. You should be able to set the EXPRESSION=modules/graceful if you don't want to be overwhelmed.

zeripath avatar Aug 20 '22 11:08 zeripath

new logs

  1. loglevel TRACE
  2. goroutine eeyrjmr_v2.zip

With logs set to TRACE we are not seeing: log.Trace("Awaiting server start-up") so the original assumption is correct.

So this is causing a problem on a WinServer-2019 but is working fine with Windows11. This is an old block of code but gitea-1.17 is released using golang-1.18 while gitea-1.16 used golang-1.17 ... could there be something in that if/else swap that previously wasn't an issue but now is (some change in golang ... nothing in hte changelogs)

I am going to try to build on windows... I have the gentoo-linux ebuild to reference and I have all the build deps. As a sanity check I will swap those if/else clauses ...

eeyrjmr avatar Aug 20 '22 12:08 eeyrjmr

The trace logs are missing the important things I need.

I don't see Awaiting server start-up but yet the goroutine stack definitely implies that should have happened...


Now looking at the goroutine stack we see:

goroutine 16 [select]:
code.gitea.io/gitea/modules/graceful.(*Manager).awaitServer(0xc000137b00, 0x0)
	/source/modules/graceful/manager_windows.go:244 +0x1d8
code.gitea.io/gitea/modules/graceful.(*Manager).Execute(0xc000137b00, {0x4227fc0?, 0xae6285?, 0xc0012a1fd0?}, 0xc001354240, 0xc0013b22a0?)
	/source/modules/graceful/manager_windows.go:124 +0xe7
golang.org/x/sys/windows/svc.serviceMain.func2()
	/go/pkg/mod/golang.org/x/[email protected]/windows/svc/service.go:232 +0x45
created by golang.org/x/sys/windows/svc.serviceMain

This is the goroutine is created by svc.Run as a result of:

https://github.com/go-gitea/gitea/blob/8cceee4084c116dedc731da050d49549a8f71a1d/modules/graceful/manager_windows.go#L109-L111

It's Goroutine is number 18: /source/modules/graceful/manager_windows.go:110
goroutine 18 [syscall, locked to thread]:
syscall.SyscallN(0x7ffabf0666d0?, {0xc0014e3ef8?, 0x3?, 0x0?})
	/usr/local/go/src/runtime/syscall_windows.go:556 +0x109
syscall.Syscall(0xc00013ee10?, 0x0?, 0x0?, 0xc0014e3f58?, 0xcd069e?)
	/usr/local/go/src/runtime/syscall_windows.go:494 +0x3b
golang.org/x/sys/windows.StartServiceCtrlDispatcher(0x346af75?)
	/go/pkg/mod/golang.org/x/[email protected]/windows/zsyscall_windows.go:1304 +0x55
golang.org/x/sys/windows/svc.Run({0x346af75?, 0x5?}, {0x4851b00?, 0xc000137b00?})
	/go/pkg/mod/golang.org/x/[email protected]/windows/svc/service.go:292 +0x125
code.gitea.io/gitea/modules/graceful.(*Manager).start.func1()
	/source/modules/graceful/manager_windows.go:110 +0x36
created by code.gitea.io/gitea/modules/graceful.(*Manager).start
	/source/modules/graceful/manager_windows.go:109 +0xabb

Now looking at Goroutine 16 we see that the head of the stack is:

code.gitea.io/gitea/modules/graceful.(*Manager).awaitServer(0xc000137b00, 0x0)
	/source/modules/graceful/manager_windows.go:244 +0x1d8`

https://github.com/go-gitea/gitea/blob/8cceee4084c116dedc731da050d49549a8f71a1d/modules/graceful/manager_windows.go#L228-L251

This function will return once all of the server listeners are created so unfortunately this profile was acquired too early. But it does indicate that we're not blocked at 120 and in fact get to L124

https://github.com/go-gitea/gitea/blob/8cceee4084c116dedc731da050d49549a8f71a1d/modules/graceful/manager_windows.go#L124

zeripath avatar Aug 20 '22 13:08 zeripath

Would you say it’s „safe“ to run the service like that until a fix is available? Everything seems to work properly after all. Or would it be better go back to 1.16.9 for the time being?

holyhoehle avatar Aug 20 '22 16:08 holyhoehle

goroutine.zip

new goroutine log. I watched the gitea.log until the service was created. hopefully this was long enough

eeyrjmr avatar Aug 20 '22 16:08 eeyrjmr

Would you say it’s „safe“ to run the service like that until a fix is available? Everything seems to work properly after all. Or would it be better go back to 1.16.9 for the time being?

The problem is we can't downgrade the db so we are where we are. it works just any backup scripts won't work (disabled mine and weekly manually do the steps). The only risk with hard terminating is if someone is writing. This is why I am also doing such things late at night and also I flush before killing.

eeyrjmr avatar Aug 20 '22 16:08 eeyrjmr

goroutine 15 [select]:
code.gitea.io/gitea/modules/graceful.(*Manager).awaitServer(0xc0013df560, 0x0)
	/source/modules/graceful/manager_windows.go:244 +0x1d8
code.gitea.io/gitea/modules/graceful.(*Manager).Execute(0xc0013df560, {0x4227fc0?, 0xae6285?, 0xc0012adfd0?}, 0xc001304240, 0xc0012b6780?)
	/source/modules/graceful/manager_windows.go:124 +0xe7
golang.org/x/sys/windows/svc.serviceMain.func2()
	/go/pkg/mod/golang.org/x/[email protected]/windows/svc/service.go:232 +0x45
created by golang.org/x/sys/windows/svc.serviceMain
	/go/pkg/mod/golang.org/x/[email protected]/windows/svc/service.go:231 +0x2b4

Still seeing awaitServer...

So may be it's deadlocked somehow - but I don't see how.

I think we need the full associated logs - there's something going on here.

zeripath avatar Aug 20 '22 16:08 zeripath

Is it possible that you don't have INSTALL_LOCK set?

If not, well there's your problem. The server won't be ready until that lock is set.

zeripath avatar Aug 20 '22 16:08 zeripath

app.ini pasted below with ### where info would be. This is run on a windows-2019 server using a service created as per: https://docs.gitea.io/en-us/windows-service/ (with the path changed to d:\gitea)

APP_NAME = Gitea: ### Development instance
RUN_USER = COMPUTERNAME$
RUN_MODE = prod

[security]
INTERNAL_TOKEN     = #
INSTALL_LOCK       = true
SECRET_KEY         = #
PASSWORD_HASH_ALGO = pbkdf2

[database]
DB_TYPE  = mysql
HOST     = 127.0.0.1:3306
NAME     = #
USER     = #
PASSWD   = #
SCHEMA   = 
SSL_MODE = disable
CHARSET  = utf8
PATH     = D:\gitea\data\gitea.db
LOG_SQL  = false

[repository]
ROOT = D:/gitea/data/gitea-repositories

[server]
SSH_DOMAIN       = ###.com
DOMAIN           = ###.com
HTTP_PORT        = 3000
ROOT_URL         = http://###.com:3000/
DISABLE_SSH      = true
SSH_PORT         = 22
LFS_START_SERVER = true
LFS_JWT_SECRET   = #
OFFLINE_MODE     = true
#STARTUP_TIMEOUT = 10000
#ENABLE_PPROF=true

[lfs]
; Where your lfs files reside, default is data/lfs.
PATH = D:/gitea/data/lfs

[mailer]
ENABLED = false

[service]
DISABLE_REGISTRATION 		  = true
REGISTER_EMAIL_CONFIRM            = false
ENABLE_NOTIFY_MAIL                = false
ALLOW_ONLY_EXTERNAL_REGISTRATION  = false
ENABLE_CAPTCHA                    = false
REQUIRE_SIGNIN_VIEW               = true
DEFAULT_KEEP_EMAIL_PRIVATE        = false
DEFAULT_ALLOW_CREATE_ORGANIZATION = false
DEFAULT_ENABLE_TIMETRACKING       = true
NO_REPLY_ADDRESS                  = noreply.localhost

[picture]
DISABLE_GRAVATAR        = true
ENABLE_FEDERATED_AVATAR = false

[openid]
ENABLE_OPENID_SIGNIN = false
ENABLE_OPENID_SIGNUP = false

[session]
PROVIDER = file

[log]
MODE      = file
LEVEL     = info
ROOT_PATH = D:/gitea/log
ROUTER    = console

[ui]
THEMES = auto,gitea,arc-green,gitea-modern 

[migrations]
ALLOW_LOCALNETWORKS = true

As per logs. I think I have provided everything, with the level set to TRACE as well. if there is anything else please advise

eeyrjmr avatar Aug 20 '22 16:08 eeyrjmr

Thanks. The whole SVC thing is a red herring.

The problem is:

https://github.com/go-gitea/gitea/blob/3d52edc7a43ca429b7921b96c3663c32179881ab/modules/ssh/init.go#L19-L22

This block needs to call builtinUnused()

zeripath avatar Aug 20 '22 16:08 zeripath

Would you say it’s „safe“ to run the service like that until a fix is available? Everything seems to work properly after all. Or would it be better go back to 1.16.9 for the time being?

The problem is we can't downgrade the db so we are where we are. it works just any backup scripts won't work (disabled mine and weekly manually do the steps). The only risk with hard terminating is if someone is writing. This is why I am also doing such things late at night and also I flush before killing.

Thanks for the explanation. I’m going to handle it like that as well then for now.

holyhoehle avatar Aug 20 '22 17:08 holyhoehle

Workaround for the moment is to enable ssh - you don't need to enable to internal ssh server and you can even set the SSH_DOMAIN to something like NOT AVAILABLE DO NOT USE temporarily.

zeripath avatar Aug 20 '22 17:08 zeripath

I just made an update from gitea v1.16.8 to gitea v1.17.1. Im runnig the gitea-1.17.1-windows-4.0-amd64.exe and have same issue as described by LJFan.

The Operating System is Windows Server 2016. The Windows Event Viewer is not showing any Errors and the GiTea Logfile isn`t showing any issues either.

Here the first few Lines of Gitea Log

2022/08/26 12:44:56 cmd/web.go:217:listen() [I] [6308a428] Listen: https://0.0.0.0:3000
2022/08/26 12:44:56 cmd/web.go:221:listen() [I] [6308a428] AppURL(ROOT_URL): https://servername:3000/
2022/08/26 12:44:56 cmd/web.go:224:listen() [I] [6308a428] LFS server enabled
2022/08/26 12:44:56 models/repo.go:583:repoStatsCorrectNumClosedPulls() [I] [6308a427-37] [SQL] UPDATE `repository` SET num_closed_pulls=(SELECT COUNT(*) FROM `issue` WHERE repo_id=? AND is_closed=? AND is_pull=?) WHERE id=? [674 true true 674]
2022/08/26 12:44:56 ...s/graceful/server.go:61:NewServer() [I] [6308a428] Starting new Web server: tcp:0.0.0.0:3000 on PID: 5988
2022/08/26 12:44:56 models/repo.go:583:repoStatsCorrectNumClosedPulls() [I] [6308a427-37] [SQL] UPDATE `repository` SET num_closed_pulls=(SELECT COUNT(*) FROM `issue` WHERE repo_id=? AND is_closed=? AND is_pull=?) WHERE id=? [675 true true 675] - 999.6µs
2022/08/26 12:44:56 models/repo.go:658:CheckRepoStats() [I] [6308a427-37] [SQL] SELECT label.id FROM `label` WHERE label.num_issues!=(SELECT COUNT(*) FROM `issue_label` WHERE label_id=label.id) []
2022/08/26 12:44:56 ...s/graceful/server.go:74:func1() [D] [6308a428] Starting server on tcp:0.0.0.0:3000 (PID: 5988)

Scharunge avatar Aug 26 '22 11:08 Scharunge