[QA] [Flaky] Fails to reach service debug address
Build: https://drone.owncloud.com/owncloud/ocis/44210/17/5 https://drone.owncloud.com/owncloud/ocis/44197/21/5 https://drone.owncloud.com/owncloud/ocis/44195/21/5
When a user requests these URLs with "GET" and no authentication # AuthContext::aUserRequestsTheseUrlsWithAndNoAuthentication()
| endpoint | service |
| http://%base_url_hostname%:9229/healthz | audit |
| http://%base_url_hostname%:9229/readyz | audit |
cURL error 7: Failed to connect to ocis-server port 9229: Connection refused (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://ocis-server:9229/healthz (GuzzleHttp\Exception\ConnectException)
==> REQUEST
GET /healthz
X-Request-ID: apiServiceAvailability/serviceAvailabilityCheck.feature:142-150
Scenarios:
/drone/src/tests/acceptance/features/apiServiceAvailability/serviceAvailabilityCheck.feature:111
/drone/src/tests/acceptance/features/apiServiceAvailability/serviceAvailabilityCheck.feature:120
/drone/src/tests/acceptance/features/apiServiceAvailability/serviceAvailabilityCheck.feature:131
/drone/src/tests/acceptance/features/apiServiceAvailability/serviceAvailabilityCheck.feature:142
ociswrapper log
2025/03/14 01:05:31 [ociswrapper] ocis service port 9250 is no longer reachable
2025/03/14 01:05:31 [ociswrapper] Restarting oCIS server...
2025/03/14 01:05:31 [ociswrapper] Starting oCIS service...
2025/03/14 01:05:34 [ociswrapper] oCIS server is ready to accept requests
2025/03/14 01:05:34 Starting audit service...
{"level":"info","service":"audit","service":"audit","endpoint":"/healthz","time":"2025-03-14T01:05:34Z","line":"/drone/src/ocis-pkg/service/debug/service.go:27","message":"no probe provided, reverting to default (OK)"}
2025/03/14 01:05:36 [ociswrapper] audit service is ready to listen on port 9229
2025/03/14 01:05:38 [ociswrapper] audit service is ready to listen on port 9229
{"level":"info","service":"audit","transport":"stream","server":"audit","time":"2025-03-14T01:05:38Z","line":"/drone/src/services/audit/pkg/command/server.go:56","message":"Shutting down server"}
2025/03/14 01:05:40 [ociswrapper] audit service port 9229 is no longer reachable
2025/03/14 01:05:40 audit service stopped successfully
2025/03/14 01:05:40 [ociswrapper] Stopping oCIS server...
2025/03/14 01:05:42 [ociswrapper] ocis service port 9250 is no longer reachable
2025/03/14 01:05:42 [ociswrapper] Restarting oCIS server...
2025/03/14 01:05:42 [ociswrapper] Starting oCIS service...
2025/03/14 01:05:44 [ociswrapper] oCIS server is ready to accept requests
Even though the audit service port starts listening, it seems the http://ocis-server:9229/healthz endpoint is not available.
Log
2025/03/18 05:47:33 Starting audit service...
{"level":"info","service":"audit","service":"audit","endpoint":"/healthz","time":"2025-03-18T05:47:33Z","line":"/drone/src/ocis-pkg/service/debug/service.go:27","message":"no probe provided, reverting to default (OK)"}
2025/03/18 05:47:35 audit service is ready to listen on port 9229
{"level":"info","service":"audit","transport":"stream","server":"audit","time":"2025-03-18T05:47:37Z","line":"/drone/src/services/audit/pkg/command/server.go:56","message":"Shutting down server"}
2025/03/18 05:47:39 audit service stopped successfully
2025/03/18 05:47:39 [ociswrapper] Stopping oCIS server...
Every time the test fails, we get this log:
{"level":"info","service":"audit","service":"audit","endpoint":"/healthz","time":"2025-03-18T05:47:33Z","line":"/drone/src/ocis-pkg/service/debug/service.go:27","message":"no probe provided, reverting to default (OK)"}
cc @2403905
~The audit service doesn't have the healthz check, only the readyz.~
~The auth-bearer doesn't have any probes.~
~There is a server probes table: https://github.com/owncloud/ocis/issues/10281~
Each service has the default /healthz and /readyz endpoints. If the service doesn't define any additional checks, it falls back to the default, and we see "no probe provided, reverting to default (OK)" in the logs and an HTTP 200 response.
https://github.com/owncloud/ocis/blob/cdb179f656235a0f99f67f9b41c5104440354fed/ocis-pkg/service/debug/service.go#L27
https://github.com/owncloud/ocis/blob/cdb179f656235a0f99f67f9b41c5104440354fed/ocis-pkg/service/debug/service.go#L48
In a failed test, curl returns error 7 (Failed to connect() to host or proxy).
https://drone.owncloud.com/owncloud/ocis/44197/21/5
It looks like the audit service is down at this moment.
Can the ociswrapper stop a service before the tests are done?
Recent failures:
- https://drone.owncloud.com/owncloud/ocis/44542/17/5
- https://drone.owncloud.com/owncloud/ocis/44558/17/5
- https://drone.owncloud.com/owncloud/ocis/44664/17/5
- https://drone.owncloud.com/owncloud/ocis/44674/17/5
- https://drone.owncloud.com/owncloud/ocis/44695/17/5
Possible fix: #11198. Let's see the builds.
unassigning myself
@nirajacharya2 @saw-jan @anon-pradip
CI => https://drone.owncloud.com/owncloud/ocis/45074/21/5
cURL error 7: Failed to connect to ocis-server port 9134: Connection refused (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://ocis-server:9134/healthz (GuzzleHttp\Exception\ConnectException)
The main issue mentioned in this ticket has been fixed by https://github.com/owncloud/ocis/pull/11198, and we no longer see the failures mentioned in the ticket description, so closing here.
> @nirajacharya2 @saw-jan @anon-pradip
> CI => https://drone.owncloud.com/owncloud/ocis/45074/21/5
> cURL error 7: Failed to connect to ocis-server port 9134: Connection refused (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://ocis-server:9134/healthz (GuzzleHttp\Exception\ConnectException)
This is a similar but different failure. Please open a separate ticket for it.