
6.3.0 is active for a long time after a k6 run

Open butonic opened this issue 1 year ago • 12 comments

I am running 6min load tests and have a 6GB log file here with additional logs about service registration and deregistration to verify the robustness of the nats-js-kv based registry. AFAICT the registry behaves correctly, but there are some findings that we should tackle:

  • [ ] make readyz reflect reality https://github.com/owncloud/ocis/issues/10036#issuecomment-2348442095
  • [ ] userlog service is active for a long time https://github.com/owncloud/ocis/issues/10036#issuecomment-2345943669
  • [ ] frontend is trying to autoaccept shares too long https://github.com/owncloud/ocis/issues/10036#issuecomment-2346023045
  • [ ] I see "Token used before issued" errors https://github.com/owncloud/ocis/issues/10036#issuecomment-2348491468
  • [ ] a panic in the gateway: https://github.com/owncloud/ocis/issues/10036#issuecomment-2348570438

butonic avatar Sep 12 '24 10:09 butonic

AFAICT the userlog service has to chew through a long list of events. IMO we can parallelize adding events to user stores in

	// III) store the eventID for each user
	for _, id := range users {
		if err := ul.addEventToUser(id, event); err != nil {
			ul.log.Error().Err(err).Str("userID", id).Str("eventid", event.ID).Msg("failed to store event for user")
			return
		}
	}

or maybe even spawn a goroutine for every event:

diff --git a/services/userlog/pkg/service/service.go b/services/userlog/pkg/service/service.go
index f18e4c18c6..0bc48cff43 100644
--- a/services/userlog/pkg/service/service.go
+++ b/services/userlog/pkg/service/service.go
@@ -97,7 +97,7 @@ func NewUserlogService(opts ...Option) (*UserlogService, error) {
 // MemorizeEvents stores eventIDs a user wants to receive
 func (ul *UserlogService) MemorizeEvents(ch <-chan events.Event) {
        for event := range ch {
-               ul.processEvent(event)
+               go ul.processEvent(event)
        }
 }
 

but we should use a worker pool instead of this simplistic solution.

butonic avatar Sep 12 '24 10:09 butonic

hm and autoaccepting shares in the frontend is also too slow to consume the events:

diff --git a/services/frontend/pkg/command/events.go b/services/frontend/pkg/command/events.go
index 2487165533..954fab7af8 100644
--- a/services/frontend/pkg/command/events.go
+++ b/services/frontend/pkg/command/events.go
@@ -118,24 +118,26 @@ func AutoAcceptShares(ev events.ShareCreated, autoAcceptDefault bool, l log.Logg
        }
 
        for _, uid := range userIDs {
-               if !autoAcceptShares(ctx, uid, autoAcceptDefault, vs) {
-                       continue
-               }
+               go func(uid *user.UserId) {
+                       if !autoAcceptShares(ctx, uid, autoAcceptDefault, vs) {
+                               return
+                       }
 
-               gwc, err := gatewaySelector.Next()
-               if err != nil {
-                       l.Error().Err(err).Msg("cannot get gateway client")
-                       continue
-               }
-               resp, err := gwc.UpdateReceivedShare(ctx, updateShareRequest(ev.ShareID, uid))
-               if err != nil {
-                       l.Error().Err(err).Msg("error sending grpc request")
-                       continue
-               }
+                       gwc, err := gatewaySelector.Next()
+                       if err != nil {
+                               l.Error().Err(err).Msg("cannot get gateway client")
+                               return
+                       }
+                       resp, err := gwc.UpdateReceivedShare(ctx, updateShareRequest(ev.ShareID, uid))
+                       if err != nil {
+                               l.Error().Err(err).Msg("error sending grpc request")
+                               return
+                       }
 
-               if resp.GetStatus().GetCode() != rpc.Code_CODE_OK {
-                       l.Error().Interface("status", resp.GetStatus()).Str("userid", uid.GetOpaqueId()).Msg("unexpected status code while accepting share")
-               }
+                       if resp.GetStatus().GetCode() != rpc.Code_CODE_OK {
+                               l.Error().Interface("status", resp.GetStatus()).Str("userid", uid.GetOpaqueId()).Msg("unexpected status code while accepting share")
+                       }
+               }(uid)
        }
 
 }

the frontend will log tons of errors like this:

┌─────────────────────────────────────────────────────────────────── Logs(school-91001/frontend:frontend)[tail] ────────────────────────────────────────────────────────────────────┐
│                                                         Autoscroll:Off     FullScreen:Off     Timestamps:Off     Wrap:On                                                          │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"584ec5e4efae632ea344ca9075710d66"},"userid": │
│ "411b9ad0-95da-4b59-9848-671d36bf9e7f","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"e4bda889637ffb871773bdc30bc660ee"},"userid": │
│ "f6e1e884-a4ee-4b7d-a68a-5cffcfb0c614","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f464ebb5144bc1d251decd1fa928b2ce"},"userid": │
│ "fcf77ab0-1140-46ad-82a1-23bd86659955","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"7b3a501b2fb5f3ae15225f35b47b06d3"},"userid": │
│ "ad217814-0571-4ad2-ae7a-00d332552ea2","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"e14d1a00e9d49aa269581f44ec719996"},"userid": │
│ "9ca0356d-c2cd-4c0e-a758-00b68b315e11","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck nats: slow consumer, messages dropped on connection [41261574] for subscription on "main-queue"                                                         │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f734c7466f4da655a817f34256101f2d"},"userid": │
│ "db96dfd9-b4db-4279-8e70-791ea5490495","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"7ac371ad25d1bc0b2f49570e797c5755"},"userid": │
│ "790c4571-bd01-4d83-b4d9-04330e29e541","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f3163cbf764e7d41a847cb9d7eedb32f"},"userid": │
│ "c8eb2f77-b90e-4ccd-a827-66d618da2250","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"bcb0b5c8aa758427410df8af690a7dc6"},"userid": │
│ "5aa43c8d-502c-421e-9191-355b8bd632a4","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │

and so does the sharing service:

┌──────────────────────────────────────────────────────────── Logs(school-91001/sharing-84944f47c6-hk65q:sharing)[tail] ────────────────────────────────────────────────────────────┐
│                                                         Autoscroll:Off     FullScreen:Off     Timestamps:Off     Wrap:On                                                          │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"8305329fbea49382070acc1a411301d3","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"30e1cf60fd15db313a62bf47a40ea4e3","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"fc7a5171c3038b65d8698d5a7154f347","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"ad3623166893f7c5b208972d42776b05","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"4b0fbbfd0295fd8ff5cf16b3fc744f7f","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"89216a5ff7316070f292a0572228f792","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"30dd83935491d3b4a11f6e5dfbb06228","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"88e88942394de3fbd72809acd11528cc","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"20befebb192e0e71c557b8e1ccb89a98","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│                                                                                                                                                                                   │

the search service also cannot keep up indexing files, but it does not have to multiply events for all users that have access in the shared space, so it settles down far sooner.

butonic avatar Sep 12 '24 11:09 butonic

hm restarting the userlog service quiets it ... restarting the frontend seems to pick up events again

butonic avatar Sep 12 '24 11:09 butonic

testing 6.3.0

# k6 run ~/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js

          /\      |‾‾| /‾‾/   /‾‾/   
     /\  /  \     |  |/  /   /  /    
    /  \/    \    |     (   /   ‾‾\  
   /          \   |  |\  \ |  (‾)  | 
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: /root/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js
     output: -

  scenarios: (100.00%) 8 scenarios, 75 max VUs, 6m30s max duration (incl. graceful stop):
           * add_remove_tag_100: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: add_remove_tag_100, gracefulStop: 30s)
           * create_remove_group_share_090: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_remove_group_share_090, gracefulStop: 30s)
           * create_space_080: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_space_080, gracefulStop: 30s)
           * create_upload_rename_delete_folder_and_file_040: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_upload_rename_delete_folder_and_file_040, gracefulStop: 30s)
           * download_050: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: download_050, gracefulStop: 30s)
           * navigate_file_tree_020: Up to 10 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: navigate_file_tree_020, gracefulStop: 30s)
           * sync_client_110: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: sync_client_110, gracefulStop: 30s)
           * user_group_search_070: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: user_group_search_070, gracefulStop: 30s)


     ✓ authn -> loginPageResponse - status
     ✓ authn -> authorizationResponse - status
     ✓ authn -> accessTokenResponse - status
     ✗ client -> user.findUser - status
      ↳  99% — ✓ 402 / ✗ 2
     ✓ client -> role.getMyDrives - status
     ✗ client -> resource.getResourceProperties - status
      ↳  99% — ✓ 2868 / ✗ 19
     ✓ client -> application.createDrive - status
     ✗ client -> resource.createResource - status
      ↳  98% — ✓ 81 / ✗ 1
     ✗ client -> drive.deactivateDrive - status
      ↳  98% — ✓ 82 / ✗ 1
     ✗ client -> tag.addTagToResource - status
      ↳  0% — ✓ 0 / ✗ 47
     ✗ client -> drive.deleteDrive - status
      ↳  98% — ✓ 82 / ✗ 1
     ✓ client -> group.findGroup - status
     ✓ client -> tag.getTagForResource - status
     ✗ test -> resource.getTags - name - match
      ↳  0% — ✓ 0 / ✗ 47
     ✗ client -> tag.removeTagToResource - status
      ↳  0% — ✓ 0 / ✗ 47
     ✗ client -> share.createShareInvitation - status
      ↳  92% — ✓ 36 / ✗ 3
     ✗ client -> resource.uploadResource - status
      ↳  97% — ✓ 42 / ✗ 1
     ✓ client -> resource.moveResource - status
     ✗ client -> share.deleteShareInvitation - status
      ↳  82% — ✓ 32 / ✗ 7
     ✗ client -> resource.downloadResource - status
      ↳  97% — ✓ 95 / ✗ 2
     ✗ client -> resource.deleteResource - status
      ↳  97% — ✓ 80 / ✗ 2

     checks.........................: 96.01% ✓ 4338      ✗ 180 
     data_received..................: 1.6 GB 4.2 MB/s
     data_sent......................: 812 MB 2.1 MB/s
     http_req_blocked...............: avg=927.66µs min=200ns   med=270ns    max=70.05ms p(90)=581ns    p(95)=1.17µs  
     http_req_connecting............: avg=325.64µs min=0s      med=0s       max=21.57ms p(90)=0s       p(95)=0s      
     http_req_duration..............: avg=361.1ms  min=2.81ms  med=272.55ms max=11.35s  p(90)=596.24ms p(95)=826.18ms
       { expected_response:true }...: avg=366.98ms min=2.81ms  med=275.31ms max=11.35s  p(90)=602.71ms p(95)=829.36ms
     http_req_failed................: 2.83%  ✓ 133       ✗ 4563
     http_req_receiving.............: avg=9.68ms   min=23.2µs  med=96.54µs  max=6.42s   p(90)=221.77µs p(95)=66.85ms 
     http_req_sending...............: avg=2.93ms   min=22.65µs med=92.51µs  max=3.58s   p(90)=158µs    p(95)=191.1µs 
     http_req_tls_handshaking.......: avg=581.51µs min=0s      med=0s       max=33.76ms p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=348.48ms min=2.71ms  med=267.95ms max=9.75s   p(90)=572.57ms p(95)=796.82ms
     http_reqs......................: 4696   12.142205/s
     iteration_duration.............: avg=6.8s     min=1.04s   med=1.32s    max=46.51s  p(90)=15.67s   p(95)=18.39s  
     iterations.....................: 3393   8.773105/s
     vus............................: 1      min=0       max=75
     vus_max........................: 75     min=75      max=75


running (6m26.8s), 00/75 VUs, 3393 complete and 1 interrupted iterations
add_remove_tag_100             ✓ [======================================] 0/5 VUs    6m0s
create_remove_group_share_090  ✓ [======================================] 0/5 VUs    6m0s
create_space_080               ✓ [======================================] 0/5 VUs    6m0s
create_upload_rename_delete... ✓ [======================================] 0/5 VUs    6m0s
download_050                   ✓ [======================================] 0/5 VUs    6m0s
navigate_file_tree_020         ✓ [======================================] 00/10 VUs  6m0s
sync_client_110                ✓ [======================================] 00/20 VUs  6m0s
user_group_search_070          ✓ [======================================] 00/20 VUs  6m0s

the userlog service has a lot to do ....

┌────────────────────────────────────────────────────────────────── Pods(school-91001/userlog)[10] </app=userlog> ──────────────────────────────────────────────────────────────────┐
│ NAME↑                      PF  READY  STATUS     RESTARTS  CPU  MEM  %CPU/R  %CPU/L  %MEM/R  %MEM/L IP             NODE                                                    AGE    │
│ userlog-7b5758d8cb-6gt5g   ●   1/1    Running           0   79   97     526     n/a     194      97 10.96.6.61     shoot--420505--de-lasttest-worker-main-z1-5d8cf-lcmtl   13m    │
│ userlog-7b5758d8cb-8n46r   ●   1/1    Running           0  103   84     686     n/a     168      84 10.96.14.238   shoot--420505--de-lasttest-worker-main-z1-5d8cf-xmntq   14m    │
│ userlog-7b5758d8cb-9x5jx   ●   1/1    Running           0  107   75     713     n/a     150      75 10.96.2.134    shoot--420505--de-lasttest-worker-main-z1-5d8cf-klghn   12m    │
│ userlog-7b5758d8cb-9zbzx   ●   1/1    Running           0   75   73     500     n/a     146      73 10.96.8.117    shoot--420505--de-lasttest-worker-main-z1-5d8cf-t7565   13m    │
│ userlog-7b5758d8cb-85dpp   ●   1/1    Running           0   87   82     580     n/a     165      82 10.96.1.87     shoot--420505--de-lasttest-worker-main-z1-5d8cf-fwgmh   12m    │
│ userlog-7b5758d8cb-f24v6   ●   1/1    Running           0  130   86     866     n/a     173      86 10.96.17.153   shoot--420505--de-lasttest-worker-main-z1-5d8cf-cm6gp   14m    │
│ userlog-7b5758d8cb-fqkhp   ●   1/1    Running           0   70   98     466     n/a     197      98 10.96.9.236    shoot--420505--de-lasttest-worker-main-z1-5d8cf-2xfdv   13m    │
│ userlog-7b5758d8cb-g5j82   ●   1/1    Running           0   73   96     486     n/a     192      96 10.96.16.66    shoot--420505--de-lasttest-worker-main-z1-5d8cf-d5brg   27m    │
│ userlog-7b5758d8cb-gjbv2   ●   1/1    Running           0  104   88     693     n/a     176      88 10.96.0.23     shoot--420505--de-lasttest-worker-main-z1-5d8cf-qh9vb   27m    │
│ userlog-7b5758d8cb-jts9s   ●   1/1    Running           0   93   96     620     n/a     193      96 10.96.13.224   shoot--420505--de-lasttest-worker-main-z1-5d8cf-q4mzt   15m    │
┌────────────────────────────────────────────────────────────────── Pods(school-91001/userlog)[10] </app=userlog> ──────────────────────────────────────────────────────────────────┐
│ NAME↑                      PF  READY  STATUS     RESTARTS  CPU  MEM  %CPU/R  %CPU/L  %MEM/R  %MEM/L IP             NODE                                                    AGE    │
│ userlog-7b5758d8cb-6gt5g   ●   1/1    Running           0  114   90     760     n/a     181      90 10.96.6.61     shoot--420505--de-lasttest-worker-main-z1-5d8cf-lcmtl   17m    │
│ userlog-7b5758d8cb-8n46r   ●   1/1    Running           0   79   89     526     n/a     178      89 10.96.14.238   shoot--420505--de-lasttest-worker-main-z1-5d8cf-xmntq   18m    │
│ userlog-7b5758d8cb-9x5jx   ●   1/1    Running           0   99   77     660     n/a     154      77 10.96.2.134    shoot--420505--de-lasttest-worker-main-z1-5d8cf-klghn   16m    │
│ userlog-7b5758d8cb-9zbzx   ●   1/1    Running           0  101   83     673     n/a     166      83 10.96.8.117    shoot--420505--de-lasttest-worker-main-z1-5d8cf-t7565   17m    │
│ userlog-7b5758d8cb-85dpp   ●   1/1    Running           0   88   84     586     n/a     169      84 10.96.1.87     shoot--420505--de-lasttest-worker-main-z1-5d8cf-fwgmh   16m    │
│ userlog-7b5758d8cb-f24v6   ●   1/1    Running           0   95   88     633     n/a     177      88 10.96.17.153   shoot--420505--de-lasttest-worker-main-z1-5d8cf-cm6gp   18m    │
│ userlog-7b5758d8cb-fqkhp   ●   1/1    Running           0   94   92     626     n/a     184      92 10.96.9.236    shoot--420505--de-lasttest-worker-main-z1-5d8cf-2xfdv   17m    │
│ userlog-7b5758d8cb-g5j82   ●   1/1    Running           0   74   98     493     n/a     196      98 10.96.16.66    shoot--420505--de-lasttest-worker-main-z1-5d8cf-d5brg   30m    │
│ userlog-7b5758d8cb-gjbv2   ●   1/1    Running           0   88   87     586     n/a     175      87 10.96.0.23     shoot--420505--de-lasttest-worker-main-z1-5d8cf-qh9vb   30m    │
│ userlog-7b5758d8cb-jts9s   ●   1/1    Running           0   74   98     493     n/a     197      98 10.96.13.224   shoot--420505--de-lasttest-worker-main-z1-5d8cf-q4mzt   19m    │
┌────────────────────────────────────────────────────────────────── Pods(school-91001/userlog)[10] </app=userlog> ──────────────────────────────────────────────────────────────────┐
│ NAME↑                      PF  READY  STATUS     RESTARTS  CPU  MEM  %CPU/R  %CPU/L  %MEM/R  %MEM/L IP             NODE                                                    AGE    │
│ userlog-7b5758d8cb-6gt5g   ●   1/1    Running           0   82   92     546     n/a     184      92 10.96.6.61     shoot--420505--de-lasttest-worker-main-z1-5d8cf-lcmtl   52m    │
│ userlog-7b5758d8cb-8n46r   ●   1/1    Running           0  118   91     786     n/a     182      91 10.96.14.238   shoot--420505--de-lasttest-worker-main-z1-5d8cf-xmntq   53m    │
│ userlog-7b5758d8cb-9x5jx   ●   1/1    Running           0   83   82     553     n/a     164      82 10.96.2.134    shoot--420505--de-lasttest-worker-main-z1-5d8cf-klghn   51m    │
│ userlog-7b5758d8cb-9zbzx   ●   1/1    Running           0  114   81     760     n/a     162      81 10.96.8.117    shoot--420505--de-lasttest-worker-main-z1-5d8cf-t7565   52m    │
│ userlog-7b5758d8cb-85dpp   ●   1/1    Running           0   98   88     653     n/a     177      88 10.96.1.87     shoot--420505--de-lasttest-worker-main-z1-5d8cf-fwgmh   51m    │
│ userlog-7b5758d8cb-f24v6   ●   1/1    Running           0   96   95     640     n/a     190      95 10.96.17.153   shoot--420505--de-lasttest-worker-main-z1-5d8cf-cm6gp   53m    │
│ userlog-7b5758d8cb-fqkhp   ●   1/1    Running           0   71   95     473     n/a     190      95 10.96.9.236    shoot--420505--de-lasttest-worker-main-z1-5d8cf-2xfdv   52m    │
│ userlog-7b5758d8cb-g5j82   ●   1/1    Running           0   73   94     486     n/a     189      94 10.96.16.66    shoot--420505--de-lasttest-worker-main-z1-5d8cf-d5brg   66m    │
│ userlog-7b5758d8cb-gjbv2   ●   1/1    Running           0   77   95     513     n/a     191      95 10.96.0.23     shoot--420505--de-lasttest-worker-main-z1-5d8cf-qh9vb   66m    │
│ userlog-7b5758d8cb-jts9s   ●   1/1    Running           1    3   56      20     n/a     112      56 10.96.13.224   shoot--420505--de-lasttest-worker-main-z1-5d8cf-q4mzt   54m    │

restarting the deployment ... kills the load and all other services can scale down as well

┌────────────────────────────────────────────────────────────────────────── Deployments(school-91001)[34] ──────────────────────────────────────────────────────────────────────────┐
│ NAME↑                                                                   READY                         UP-TO-DATE                         AVAILABLE AGE                            │
│ antivirus                                                                 2/2                                  2                                 2 211d                           │
│ appprovider-bycs-office                                                   1/1                                  1                                 1 211d                           │
│ appregistry                                                               1/1                                  1                                 1 211d                           │
│ audit                                                                     2/2                                  2                                 2 211d                           │
│ authmachine                                                               2/2                                  2                                 2 211d                           │
│ authservice                                                               2/2                                  2                                 2 211d                           │
│ clientlog                                                                 2/2                                  2                                 2 211d                           │
│ eventhistory                                                              2/2                                  2                                 2 211d                           │
│ frontend                                                                  2/2                                  2                                 2 211d                           │
│ gateway                                                                   4/4                                  4                                 4 211d                           │
│ graph                                                                     2/2                                  2                                 2 211d                           │
│ groups                                                                    6/6                                  6                                 6 211d                           │
│ nats-box                                                                  1/1                                  1                                 1 211d                           │
│ notifications                                                             1/1                                  1                                 1 211d                           │
│ ocdav                                                                     2/2                                  2                                 2 211d                           │
│ ocs                                                                       2/2                                  2                                 2 211d                           │
│ policies                                                                  2/2                                  2                                 2 211d                           │
│ postprocessing                                                            2/2                                  2                                 2 211d                           │
│ proxy                                                                     2/2                                  2                                 2 211d                           │
│ search                                                                    1/1                                  1                                 1 211d                           │
│ settings                                                                  8/8                                  8                                 8 211d                           │
│ sharing                                                                   2/2                                  2                                 2 211d                           │
│ sse                                                                       2/2                                  2                                 2 211d                           │
│ storagepubliclink                                                         2/2                                  2                                 2 211d                           │
│ storageshares                                                             2/2                                  2                                 2 211d                           │
│ storagesystem                                                             5/5                                  5                                 5 211d                           │
│ storageusers                                                              6/6                                  6                                 6 211d                           │
│ thumbnails                                                                1/1                                  1                                 1 211d                           │
│ userlog                                                                 10/10                                 10                                10 211d                           │
│ users                                                                     8/8                                  8                                 8 211d                           │
│ web                                                                       2/2                                  2                                 2 211d                           │
│ webdav                                                                    2/2                                  2                                 2 211d                           │
│ webfinger                                                                 2/2                                  2                                 2 211d                           │
│ wopi-wopiserver                                                           2/2                                  2                                 2 211d                           │
┌────────────────────────────────────────────────────────────────────────── Deployments(school-91001)[34] ──────────────────────────────────────────────────────────────────────────┐
│ NAME↑                                                                   READY                         UP-TO-DATE                         AVAILABLE AGE                            │
│ antivirus                                                                 2/2                                  2                                 2 211d                           │
│ appprovider-bycs-office                                                   1/1                                  1                                 1 211d                           │
│ appregistry                                                               1/1                                  1                                 1 211d                           │
│ audit                                                                     2/2                                  2                                 2 211d                           │
│ authmachine                                                               2/2                                  2                                 2 211d                           │
│ authservice                                                               2/2                                  2                                 2 211d                           │
│ clientlog                                                                 2/2                                  2                                 2 211d                           │
│ eventhistory                                                              2/2                                  2                                 2 211d                           │
│ frontend                                                                  2/2                                  2                                 2 211d                           │
│ gateway                                                                   2/2                                  2                                 2 211d                           │
│ graph                                                                     2/2                                  2                                 2 211d                           │
│ groups                                                                    2/2                                  2                                 2 211d                           │
│ nats-box                                                                  1/1                                  1                                 1 211d                           │
│ notifications                                                             1/1                                  1                                 1 211d                           │
│ ocdav                                                                     2/2                                  2                                 2 211d                           │
│ ocs                                                                       2/2                                  2                                 2 211d                           │
│ policies                                                                  2/2                                  2                                 2 211d                           │
│ postprocessing                                                            2/2                                  2                                 2 211d                           │
│ proxy                                                                     2/2                                  2                                 2 211d                           │
│ search                                                                    1/1                                  1                                 1 211d                           │
│ settings                                                                  2/2                                  2                                 2 211d                           │
│ sharing                                                                   2/2                                  2                                 2 211d                           │
│ sse                                                                       2/2                                  2                                 2 211d                           │
│ storagepubliclink                                                         2/2                                  2                                 2 211d                           │
│ storageshares                                                             2/2                                  2                                 2 211d                           │
│ storagesystem                                                             5/5                                  5                                 5 211d                           │
│ storageusers                                                              2/2                                  2                                 2 211d                           │
│ thumbnails                                                                1/1                                  1                                 1 211d                           │
│ userlog                                                                   2/2                                  2                                 2 211d                           │
│ users                                                                     2/2                                  2                                 2 211d                           │
│ web                                                                       2/2                                  2                                 2 211d                           │
│ webdav                                                                    2/2                                  2                                 2 211d                           │
│ webfinger                                                                 2/2                                  2                                 2 211d                           │
│ wopi-wopiserver                                                           2/2                                  2                                 2 211d                           │

Hm, storagesystem also seems busy:

┌──────────────────────────────────────────────────────────── Pods(school-91001/storagesystem)[5] </app=storagesystem> ─────────────────────────────────────────────────────────────┐
│ NAME↑                          PF READY STATUS   RESTARTS  CPU  MEM  %CPU/R  %CPU/L  %MEM/R  %MEM/L IP             NODE                                                    AGE    │
│ storagesystem-659f98589-2pc6x  ●  1/1   Running         0   39   53      78     n/a      70      17 10.96.1.243    shoot--420505--de-lasttest-worker-main-z1-5d8cf-fwgmh   14m    │
│ storagesystem-659f98589-7dcxq  ●  1/1   Running         0   36   63      72     n/a      85      21 10.96.14.15    shoot--420505--de-lasttest-worker-main-z1-5d8cf-xmntq   69m    │
│ storagesystem-659f98589-phzwl  ●  1/1   Running         0  185   84     370     n/a     112      28 10.96.16.205   shoot--420505--de-lasttest-worker-main-z1-5d8cf-d5brg   81m    │
│ storagesystem-659f98589-pkrl7  ●  1/1   Running         0  101   76     202     n/a     101      25 10.96.13.225   shoot--420505--de-lasttest-worker-main-z1-5d8cf-q4mzt   69m    │
│ storagesystem-659f98589-r2tr7  ●  1/1   Running         0  136   66     272     n/a      88      22 10.96.8.225    shoot--420505--de-lasttest-worker-main-z1-5d8cf-t7565   69m    │

Hm, the sharing service keeps failing to look up received shares:

┌──────────────────────────────────────────────────────────── Logs(school-91001/sharing-84944f47c6-hk65q:sharing)[tail] ────────────────────────────────────────────────────────────┐
│                                                         Autoscroll:Off     FullScreen:Off     Timestamps:Off     Wrap:On                                                          │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"8305329fbea49382070acc1a411301d3","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"30e1cf60fd15db313a62bf47a40ea4e3","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"fc7a5171c3038b65d8698d5a7154f347","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"ad3623166893f7c5b208972d42776b05","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"4b0fbbfd0295fd8ff5cf16b3fc744f7f","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"89216a5ff7316070f292a0572228f792","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"30dd83935491d3b4a11f6e5dfbb06228","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"88e88942394de3fbd72809acd11528cc","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"20befebb192e0e71c557b8e1ccb89a98","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│                                                                                                                                                                                   │

The frontend shows matching errors while auto-accepting shares:

┌─────────────────────────────────────────────────────────────────── Logs(school-91001/frontend:frontend)[tail] ────────────────────────────────────────────────────────────────────┐
│                                                         Autoscroll:Off     FullScreen:Off     Timestamps:Off     Wrap:On                                                          │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"584ec5e4efae632ea344ca9075710d66"},"userid": │
│ "411b9ad0-95da-4b59-9848-671d36bf9e7f","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"e4bda889637ffb871773bdc30bc660ee"},"userid": │
│ "f6e1e884-a4ee-4b7d-a68a-5cffcfb0c614","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f464ebb5144bc1d251decd1fa928b2ce"},"userid": │
│ "fcf77ab0-1140-46ad-82a1-23bd86659955","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"7b3a501b2fb5f3ae15225f35b47b06d3"},"userid": │
│ "ad217814-0571-4ad2-ae7a-00d332552ea2","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"e14d1a00e9d49aa269581f44ec719996"},"userid": │
│ "9ca0356d-c2cd-4c0e-a758-00b68b315e11","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck nats: slow consumer, messages dropped on connection [41261574] for subscription on "main-queue"                                                         │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f734c7466f4da655a817f34256101f2d"},"userid": │
│ "db96dfd9-b4db-4279-8e70-791ea5490495","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"7ac371ad25d1bc0b2f49570e797c5755"},"userid": │
│ "790c4571-bd01-4d83-b4d9-04330e29e541","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f3163cbf764e7d41a847cb9d7eedb32f"},"userid": │
│ "c8eb2f77-b90e-4ccd-a827-66d618da2250","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"bcb0b5c8aa758427410df8af690a7dc6"},"userid": │
│ "5aa43c8d-502c-421e-9191-355b8bd632a4","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │

ooohhhhhh, panics in the gateway

gateway-7f5bf548b5-6ngtm {"level":"error","service":"gateway","pkg":"rgrpc","traceid":"aa17d3149cdf7a0b3af481b7779c78e8","user-agent":"grpc-go/1.65.0","from":"tcp://10.96.14.238 │
│ :59376","uri":"/cs3.gateway.v1beta1.GatewayAPI/ListStorageSpaces","start":"12/Sep/2024:10:51:38 +0000","end":"12/Sep/2024:10:51:38 +0000","time_ns":526572,"code":"Internal","tim │
│ e":"2024-09-12T10:51:38Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/log/log.go:69","message":"rpc error: code = Internal desc = user not found in cont │
│ ext"}                                                                                                                                                                             │
│ gateway-7f5bf548b5-6ngtm goroutine 631110 [running]:                                                                                                                              │
│ gateway-7f5bf548b5-6ngtm runtime/debug.Stack()                                                                                                                                    │
│ gateway-7f5bf548b5-6ngtm     runtime/debug/stack.go:26 +0x5e                                                                                                                      │
│ gateway-7f5bf548b5-6ngtm runtime/debug.PrintStack()                                                                                                                               │
│ gateway-7f5bf548b5-6ngtm     runtime/debug/stack.go:18 +0x13                                                                                                                      │
│ gateway-7f5bf548b5-6ngtm github.com/cs3org/reva/v2/internal/grpc/interceptors/recovery.recoveryFunc({0x4d70bb8, 0xc002ac27e0}, {0x45ed900, 0x4d3ead0})                            │
│ gateway-7f5bf548b5-6ngtm     github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/recovery/recovery.go:48 +0x34                                                           │
│ gateway-7f5bf548b5-6ngtm github.com/grpc-ecosystem/go-grpc-middleware/recovery.recoverFrom({0x4d70bb8?, 0xc002ac27e0?}, {0x45ed900?, 0x4d3ead0?}, 0xc0028a8ad8?)                  │
│ gateway-7f5bf548b5-6ngtm     github.com/grpc-ecosystem/[email protected]/recovery/interceptors.go:61 +0x30                                                                │
│ gateway-7f5bf548b5-6ngtm github.com/grpc-ecosystem/go-grpc-middleware/recovery.UnaryServerInterceptor.func1.1()                                                                   │
│ gateway-7f5bf548b5-6ngtm     github.com/grpc-ecosystem/[email protected]/recovery/interceptors.go:29 +0x75                                                                │
│ gateway-7f5bf548b5-6ngtm panic({0x45ed900?, 0x4d3ead0?})                                                                                                                          │
│ gateway-7f5bf548b5-6ngtm     runtime/panic.go:785 +0x132                                                                                                                          │
│ gateway-7f5bf548b5-6ngtm github.com/cs3org/reva/v2/pkg/ctx.ContextMustGetUser({0x4d70bb8?, 0xc002ac27e0?})                                                                        │
│ gateway-7f5bf548b5-6ngtm     github.com/cs3org/reva/[email protected]/pkg/ctx/userctx.go:48 +0x58                                                                                        │
│ gateway-7f5bf548b5-6ngtm github.com/cs3org/reva/v2/internal/grpc/services/gateway.(*cachedRegistryClient).ListStorageProviders(0xc002ed1300, {0x4d70bb8, 0xc002ac27e0}, 0xc002ac2 │
│ 990, {0x0, 0x0, 0x0})                                                                                                                                                             │
│ gateway-7f5bf548b5-6ngtm     github.com/cs3org/reva/[email protected]/internal/grpc/services/gateway/storageprovidercache.go:47 +0x94                                                    │
│ gateway-7f5bf548b5-6ngtm github.com/cs3org/reva/v2/internal/grpc/services/gateway.(*svc).ListStorageSpaces(0xc000624d00, {0x4d70bb8, 0xc002ac27e0}, 0xc001074bd0)                 │
│ gateway-7f5bf548b5-6ngtm     github.com/cs3org/reva/[email protected]/internal/grpc/services/gateway/storageprovider.go:287 +0xb16                                                       │
│ gateway-7f5bf548b5-6ngtm github.com/cs3org/go-cs3apis/cs3/gateway/v1beta1._GatewayAPI_ListStorageSpaces_Handler.func1({0x4d70bb8?, 0xc002ac27e0?}, {0x4af65a0?, 0xc001074bd0?})   │
│ gateway-7f5bf548b5-6ngtm     github.com/cs3org/[email protected]/cs3/gateway/v1beta1/gateway_api_grpc.pb.go:2597 +0xce

butonic avatar Sep 12 '24 13:09 butonic

After making userlog and frontend more concurrent it is still broken -> http_req_failed................: 35.86% ✓ 1662 ✗ 2972
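
The userlog change referred to here parallelizes the per-user event storage loop quoted in the issue description. A minimal sketch with bounded concurrency, assuming a stand-in `addEventToUser` (names and signature are illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// addEventToUser stands in for the real per-user store call.
func addEventToUser(userID, eventID string, stored *atomic.Int64) {
	stored.Add(1)
}

// storeEventForUsers fans the store calls out over a bounded number of
// goroutines so one slow store does not serialize the whole batch.
func storeEventForUsers(users []string, eventID string, workers int) int64 {
	var stored atomic.Int64
	sem := make(chan struct{}, workers) // buffered channel as a semaphore
	var wg sync.WaitGroup
	for _, id := range users {
		wg.Add(1)
		sem <- struct{}{} // blocks once `workers` goroutines are in flight
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }()
			addEventToUser(id, eventID, &stored)
		}(id)
	}
	wg.Wait()
	return stored.Load()
}

func main() {
	users := []string{"u1", "u2", "u3", "u4", "u5"}
	fmt.Println("stored:", storeEventForUsers(users, "ev-1", 3))
}
```

The buffered channel keeps the fan-out bounded, so a long user list does not spawn an unbounded number of goroutines against the store.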

# k6 run ~/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js

          /\      |‾‾| /‾‾/   /‾‾/   
     /\  /  \     |  |/  /   /  /    
    /  \/    \    |     (   /   ‾‾\  
   /          \   |  |\  \ |  (‾)  | 
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: /root/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js
     output: -

  scenarios: (100.00%) 8 scenarios, 75 max VUs, 6m30s max duration (incl. graceful stop):
           * add_remove_tag_100: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: add_remove_tag_100, gracefulStop: 30s)
           * create_remove_group_share_090: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_remove_group_share_090, gracefulStop: 30s)
           * create_space_080: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_space_080, gracefulStop: 30s)
           * create_upload_rename_delete_folder_and_file_040: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_upload_rename_delete_folder_and_file_040, gracefulStop: 30s)
           * download_050: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: download_050, gracefulStop: 30s)
           * navigate_file_tree_020: Up to 10 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: navigate_file_tree_020, gracefulStop: 30s)
           * sync_client_110: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: sync_client_110, gracefulStop: 30s)
           * user_group_search_070: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: user_group_search_070, gracefulStop: 30s)

WARN[0145] Request Failed                                error="Post \"https://91001.drive.cloudspeicher-bayern.de/graph/v1beta1/drives/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27/items/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27!f15571bf-de16-424e-90d9-559bfa923826/invite\": unexpected EOF"
WARN[0225] Request Failed                                error="Post \"https://91001.drive.cloudspeicher-bayern.de/graph/v1beta1/drives/55b65915-d282-4be3-88f7-a07a3d092c01$390c628e-d9b5-451f-9f80-53d7979cdfd4/items/55b65915-d282-4be3-88f7-a07a3d092c01$390c628e-d9b5-451f-9f80-53d7979cdfd4!1dc53212-b332-40be-a5ab-9ca09f5d4931/invite\": unexpected EOF"
WARN[0234] Request Failed                                error="Delete \"https://91001.drive.cloudspeicher-bayern.de/graph/v1beta1/drives/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27/items/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27!ec7914d2-4c07-4ac6-b515-016c30573635/permissions/55b65915-d282-4be3-88f7-a07a3d092c01:107d95ef-65f2-44a4-b124-5fd477153d27:f520493c-0e4f-441f-b034-7604fbe82d23\": unexpected EOF"
WARN[0253] Request Failed                                error="stream error: stream ID 27; INTERNAL_ERROR; received from peer"
WARN[0313] Request Failed                                error="Post \"https://91001.drive.cloudspeicher-bayern.de/graph/v1beta1/drives/55b65915-d282-4be3-88f7-a07a3d092c01$390c628e-d9b5-451f-9f80-53d7979cdfd4/items/55b65915-d282-4be3-88f7-a07a3d092c01$390c628e-d9b5-451f-9f80-53d7979cdfd4!9c1d7aa0-8d00-4531-a6ba-111c753b0f07/invite\": unexpected EOF"
WARN[0350] Request Failed                                error="Delete \"https://91001.drive.cloudspeicher-bayern.de/graph/v1beta1/drives/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27/items/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27!bc3e79b6-ccb8-4e89-b5cb-a7252317bb95/permissions/55b65915-d282-4be3-88f7-a07a3d092c01:107d95ef-65f2-44a4-b124-5fd477153d27:5735f288-3c1e-4d5c-bdbd-939ef812cded\": unexpected EOF"

     ✓ authn -> loginPageResponse - status
     ✓ authn -> authorizationResponse - status
     ✓ authn -> accessTokenResponse - status
     ✗ client -> user.findUser - status
      ↳  90% — ✓ 363 / ✗ 40
     ✗ client -> role.getMyDrives - status
      ↳  68% — ✓ 34 / ✗ 16
     ✗ client -> resource.getResourceProperties - status
      ↳  56% — ✓ 1630 / ✗ 1273
     ✗ client -> application.createDrive - status
      ↳  93% — ✓ 71 / ✗ 5
     ✗ client -> resource.createResource - status
      ↳  58% — ✓ 47 / ✗ 33
     ✗ client -> drive.deactivateDrive - status
      ↳  82% — ✓ 63 / ✗ 13
     ✗ client -> tag.addTagToResource - status
      ↳  0% — ✓ 0 / ✗ 33
     ✗ client -> drive.deleteDrive - status
      ↳  80% — ✓ 61 / ✗ 15
     ✗ client -> group.findGroup - status
      ↳  91% — ✓ 33 / ✗ 3
     ✗ client -> tag.getTagForResource - status
      ↳  18% — ✓ 6 / ✗ 26
     ✗ test -> resource.getTags - name - match
      ↳  0% — ✓ 0 / ✗ 32
     ✗ client -> tag.removeTagToResource - status
      ↳  0% — ✓ 0 / ✗ 32
     ✗ client -> resource.downloadResource - status
      ↳  51% — ✓ 50 / ✗ 48
     ✗ client -> share.createShareInvitation - status
      ↳  22% — ✓ 8 / ✗ 27
     ✗ client -> resource.uploadResource - status
      ↳  59% — ✓ 26 / ✗ 18
     ✗ client -> share.deleteShareInvitation - status
      ↳  11% — ✓ 4 / ✗ 31
     ✗ client -> resource.moveResource - status
      ↳  70% — ✓ 31 / ✗ 13
     ✗ client -> resource.deleteResource - status
      ↳  54% — ✓ 43 / ✗ 36

     checks.........................: 61.85% ✓ 2747      ✗ 1694
     data_received..................: 752 MB 1.9 MB/s
     data_sent......................: 829 MB 2.1 MB/s
     http_req_blocked...............: avg=970.81µs min=200ns   med=270ns    max=59.07ms  p(90)=591ns    p(95)=1.23µs  
     http_req_connecting............: avg=348.55µs min=0s      med=0s       max=21.11ms  p(90)=0s       p(95)=0s      
     http_req_duration..............: avg=531.07ms min=2.64ms  med=211ms    max=50.01s   p(90)=636.98ms p(95)=990.44ms
       { expected_response:true }...: avg=476.14ms min=2.64ms  med=273.89ms max=29.33s   p(90)=795.11ms p(95)=1.2s    
     http_req_failed................: 35.86% ✓ 1662      ✗ 2972
     http_req_receiving.............: avg=6.5ms    min=0s      med=92.81µs  max=5.99s    p(90)=189.06µs p(95)=9.27ms  
     http_req_sending...............: avg=1.75ms   min=27.74µs med=95.9µs   max=837.85ms p(90)=162.01µs p(95)=196.22µs
     http_req_tls_handshaking.......: avg=605.01µs min=0s      med=0s       max=35.69ms  p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=522.81ms min=2.55ms  med=201.95ms max=50.01s   p(90)=624.68ms p(95)=973.07ms
     http_reqs......................: 4634   11.982131/s
     iteration_duration.............: avg=6.79s    min=1.02s   med=1.29s    max=1m54s    p(90)=15.61s   p(95)=18.32s  
     iterations.....................: 3400   8.791378/s
     vus............................: 1      min=0       max=75
     vus_max........................: 75     min=75      max=75


running (6m26.7s), 00/75 VUs, 3400 complete and 4 interrupted iterations
add_remove_tag_100             ✓ [======================================] 0/5 VUs    6m0s
create_remove_group_share_090  ✓ [======================================] 0/5 VUs    6m0s
create_space_080               ✓ [======================================] 0/5 VUs    6m0s
create_upload_rename_delete... ✓ [======================================] 0/5 VUs    6m0s
download_050                   ✓ [======================================] 0/5 VUs    6m0s
navigate_file_tree_020         ✓ [======================================] 00/10 VUs  6m0s
sync_client_110                ✓ [======================================] 00/20 VUs  6m0s
user_group_search_070          ✓ [======================================] 00/20 VUs  6m0s

butonic avatar Sep 12 '24 13:09 butonic

rerunning after things settled down -> getting worse: http_req_failed................: 52.12% ✓ 2110 ✗ 1938

# k6 run ~/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js

          /\      |‾‾| /‾‾/   /‾‾/   
     /\  /  \     |  |/  /   /  /    
    /  \/    \    |     (   /   ‾‾\  
   /          \   |  |\  \ |  (‾)  | 
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: /root/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js
     output: -

  scenarios: (100.00%) 8 scenarios, 75 max VUs, 6m30s max duration (incl. graceful stop):
           * add_remove_tag_100: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: add_remove_tag_100, gracefulStop: 30s)
           * create_remove_group_share_090: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_remove_group_share_090, gracefulStop: 30s)
           * create_space_080: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_space_080, gracefulStop: 30s)
           * create_upload_rename_delete_folder_and_file_040: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_upload_rename_delete_folder_and_file_040, gracefulStop: 30s)
           * download_050: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: download_050, gracefulStop: 30s)
           * navigate_file_tree_020: Up to 10 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: navigate_file_tree_020, gracefulStop: 30s)
           * sync_client_110: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: sync_client_110, gracefulStop: 30s)
           * user_group_search_070: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: user_group_search_070, gracefulStop: 30s)


     ✓ authn -> loginPageResponse - status
     ✓ authn -> authorizationResponse - status
     ✓ authn -> accessTokenResponse - status
     ✗ client -> user.findUser - status
      ↳  80% — ✓ 267 / ✗ 65
     ✗ client -> role.getMyDrives - status
      ↳  56% — ✓ 28 / ✗ 22
     ✗ client -> resource.getResourceProperties - status
      ↳  36% — ✓ 898 / ✗ 1532
     ✗ client -> resource.downloadResource - status
      ↳  38% — ✓ 35 / ✗ 55
     ✗ client -> application.createDrive - status
      ↳  44% — ✓ 31 / ✗ 39
     ✗ client -> resource.createResource - status
      ↳  38% — ✓ 30 / ✗ 48
     ✗ client -> drive.deactivateDrive - status
      ↳  38% — ✓ 27 / ✗ 43
     ✗ client -> resource.uploadResource - status
      ↳  33% — ✓ 14 / ✗ 28
     ✗ client -> tag.addTagToResource - status
      ↳  0% — ✓ 0 / ✗ 30
     ✗ client -> drive.deleteDrive - status
      ↳  32% — ✓ 23 / ✗ 47
     ✗ client -> resource.moveResource - status
      ↳  38% — ✓ 16 / ✗ 26
     ✗ client -> tag.getTagForResource - status
      ↳  36% — ✓ 11 / ✗ 19
     ✗ test -> resource.getTags - name - match
      ↳  0% — ✓ 0 / ✗ 30
     ✗ client -> tag.removeTagToResource - status
      ↳  0% — ✓ 0 / ✗ 30
     ✗ client -> resource.deleteResource - status
      ↳  30% — ✓ 24 / ✗ 54
     ✗ client -> group.findGroup - status
      ↳  77% — ✓ 28 / ✗ 8
     ✗ client -> share.createShareInvitation - status
      ↳  16% — ✓ 6 / ✗ 30
     ✗ client -> share.deleteShareInvitation - status
      ↳  5% — ✓ 2 / ✗ 34

     checks.........................: 44.45% ✓ 1713      ✗ 2140
     data_received..................: 600 MB 1.5 MB/s
     data_sent......................: 789 MB 2.0 MB/s
     http_req_blocked...............: avg=1.05ms   min=210ns   med=270ns    max=81.99ms p(90)=581ns    p(95)=3.22µs  
     http_req_connecting............: avg=371.44µs min=0s      med=0s       max=24.81ms p(90)=0s       p(95)=0s      
     http_req_duration..............: avg=1.09s    min=2.87ms  med=137.23ms max=30.54s  p(90)=739.97ms p(95)=3.21s   
       { expected_response:true }...: avg=849.72ms min=2.87ms  med=233.8ms  max=30.54s  p(90)=514.01ms p(95)=1.01s   
     http_req_failed................: 52.12% ✓ 2110      ✗ 1938
     http_req_receiving.............: avg=4.86ms   min=25.03µs med=85.37µs  max=2.76s   p(90)=166.93µs p(95)=270.54µs
     http_req_sending...............: avg=2.27ms   min=27.2µs  med=90.26µs  max=2.37s   p(90)=159.05µs p(95)=194.1µs 
     http_req_tls_handshaking.......: avg=665.6µs  min=0s      med=0s       max=64.35ms p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=1.09s    min=2.75ms  med=135.99ms max=30.54s  p(90)=641.06ms p(95)=3.21s   
     http_reqs......................: 4048   10.410742/s
     iteration_duration.............: avg=8.09s    min=1.03s   med=1.23s    max=1m25s   p(90)=17.35s   p(95)=34.62s  
     iterations.....................: 2849   7.327125/s
     vus............................: 1      min=0       max=75
     vus_max........................: 75     min=75      max=75


running (6m28.8s), 00/75 VUs, 2849 complete and 1 interrupted iterations
add_remove_tag_100             ✓ [======================================] 0/5 VUs    6m0s
create_remove_group_share_090  ✓ [======================================] 0/5 VUs    6m0s
create_space_080               ✓ [======================================] 0/5 VUs    6m0s
create_upload_rename_delete... ✓ [======================================] 0/5 VUs    6m0s
download_050                   ✓ [======================================] 0/5 VUs    6m0s
navigate_file_tree_020         ✓ [======================================] 00/10 VUs  6m0s
sync_client_110                ✓ [======================================] 00/20 VUs  6m0s
user_group_search_070          ✓ [======================================] 00/20 VUs  6m0s

butonic avatar Sep 12 '24 13:09 butonic

and another run after things settled down ... getting worse http_req_failed................: 80.37% ✓ 2936 ✗ 717

k6 run ~/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js

          /\      |‾‾| /‾‾/   /‾‾/   
     /\  /  \     |  |/  /   /  /    
    /  \/    \    |     (   /   ‾‾\  
   /          \   |  |\  \ |  (‾)  | 
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: /root/cdperf/packages/k6-tests/artifacts/koko-platform-000-mixed-ramping-k6.js
     output: -

  scenarios: (100.00%) 8 scenarios, 75 max VUs, 6m30s max duration (incl. graceful stop):
           * add_remove_tag_100: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: add_remove_tag_100, gracefulStop: 30s)
           * create_remove_group_share_090: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_remove_group_share_090, gracefulStop: 30s)
           * create_space_080: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_space_080, gracefulStop: 30s)
           * create_upload_rename_delete_folder_and_file_040: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: create_upload_rename_delete_folder_and_file_040, gracefulStop: 30s)
           * download_050: Up to 5 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: download_050, gracefulStop: 30s)
           * navigate_file_tree_020: Up to 10 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: navigate_file_tree_020, gracefulStop: 30s)
           * sync_client_110: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: sync_client_110, gracefulStop: 30s)
           * user_group_search_070: Up to 20 looping VUs for 6m0s over 3 stages (gracefulRampDown: 30s, exec: user_group_search_070, gracefulStop: 30s)


     ✓ authn -> loginPageResponse - status
     ✓ authn -> authorizationResponse - status
     ✓ authn -> accessTokenResponse - status
     ✗ client -> user.findUser - status
      ↳  19% — ✓ 66 / ✗ 265
     ✗ client -> role.getMyDrives - status
      ↳  42% — ✓ 21 / ✗ 29
     ✗ client -> resource.getResourceProperties - status
      ↳  5% — ✓ 116 / ✗ 1882
     ✗ client -> resource.createResource - status
      ↳  2% — ✓ 2 / ✗ 75
     ✗ client -> application.createDrive - status
      ↳  2% — ✓ 2 / ✗ 68
     ✗ client -> drive.deactivateDrive - status
      ↳  0% — ✓ 0 / ✗ 70
     ✗ client -> drive.deleteDrive - status
      ↳  0% — ✓ 0 / ✗ 70
     ✗ client -> resource.uploadResource - status
      ↳  0% — ✓ 0 / ✗ 39
     ✗ client -> tag.addTagToResource - status
      ↳  0% — ✓ 0 / ✗ 44
     ✗ client -> group.findGroup - status
      ↳  7% — ✓ 3 / ✗ 35
     ✗ client -> resource.moveResource - status
      ↳  2% — ✓ 1 / ✗ 38
     ✗ client -> share.createShareInvitation - status
      ↳  0% — ✓ 0 / ✗ 38
     ✗ client -> tag.getTagForResource - status
      ↳  18% — ✓ 8 / ✗ 36
     ✗ test -> resource.getTags - name - match
      ↳  0% — ✓ 0 / ✗ 44
     ✗ client -> share.deleteShareInvitation - status
      ↳  0% — ✓ 0 / ✗ 38
     ✗ client -> resource.deleteResource - status
      ↳  0% — ✓ 0 / ✗ 77
     ✗ client -> tag.removeTagToResource - status
      ↳  0% — ✓ 0 / ✗ 44
     ✗ client -> resource.downloadResource - status
      ↳  1% — ✓ 1 / ✗ 88

     checks.........................: 14.17% ✓ 492      ✗ 2980
     data_received..................: 11 MB  30 kB/s
     data_sent......................: 666 MB 1.8 MB/s
     http_req_blocked...............: avg=1.15ms   min=210ns   med=280ns   max=53.84ms  p(90)=642ns    p(95)=6.84ms  
     http_req_connecting............: avg=415.35µs min=0s      med=0s      max=22.29ms  p(90)=0s       p(95)=853.74µs
     http_req_duration..............: avg=1.32s    min=2.17ms  med=57.25ms max=30.29s   p(90)=1.11s    p(95)=15.33s  
       { expected_response:true }...: avg=1.14s    min=2.17ms  med=72.43ms max=20.54s   p(90)=443.68ms p(95)=7.24s   
     http_req_failed................: 80.37% ✓ 2936     ✗ 717 
     http_req_receiving.............: avg=694.06µs min=18.89µs med=65.75µs max=298.75ms p(90)=134.5µs  p(95)=167.38µs
     http_req_sending...............: avg=2.07ms   min=28.74µs med=91.18µs max=1.38s    p(90)=156.18µs p(95)=190.88µs
     http_req_tls_handshaking.......: avg=713.85µs min=0s      med=0s      max=30.44ms  p(90)=0s       p(95)=5.49ms  
     http_req_waiting...............: avg=1.32s    min=2.05ms  med=56.68ms max=30.29s   p(90)=1.1s     p(95)=15.33s  
     http_reqs......................: 3653   9.824014/s
     iteration_duration.............: avg=9.56s    min=1.02s   med=1.14s   max=1m16s    p(90)=22.24s   p(95)=34.38s  
     iterations.....................: 2417   6.500039/s
     vus............................: 1      min=0      max=75
     vus_max........................: 75     min=75     max=75


running (6m11.8s), 00/75 VUs, 2417 complete and 2 interrupted iterations
add_remove_tag_100             ✓ [======================================] 0/5 VUs    6m0s
create_remove_group_share_090  ✓ [======================================] 0/5 VUs    6m0s
create_space_080               ✓ [======================================] 0/5 VUs    6m0s
create_upload_rename_delete... ✓ [======================================] 0/5 VUs    6m0s
download_050                   ✓ [======================================] 0/5 VUs    6m0s
navigate_file_tree_020         ✓ [======================================] 00/10 VUs  6m0s
sync_client_110                ✓ [======================================] 00/20 VUs  6m0s
user_group_search_070          ✓ [======================================] 00/20 VUs  6m0s

butonic avatar Sep 12 '24 13:09 butonic

hm ... registry is still not forgetting nodes:

proxy-7cf6ffcfc9-2jgq7 {"level":"error","service":"proxy","request-id":"5daa8746-6bf7-4e93-b7c0-88ee2696cb0c","userid":"afabfc40-6cdc-4444-8c15-da201b23d46e","error":"{\"id\":\"go.micro.client\",\"code\":503,\"detail\":\"connection error: desc = \\\"transport: Error while dialing: dial tcp 10.96.8.35:9191: i/o timeout\\\"\",\"status\":\"Service Unavailable\"}","time":"2024-09-12T13:13:01Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/userroles/oidcroles.go:114","message":"Could not load roles"}
proxy-7cf6ffcfc9-2jgq7 {"level":"error","service":"proxy","error":"{\"id\":\"go.micro.client\",\"code\":503,\"detail\":\"connection error: desc = \\\"transport: Error while dialing: dial tcp 10.96.8.35:9191: i/o timeout\\\"\",\"status\":\"Service Unavailable\"}","time":"2024-09-12T13:13:01Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/account_resolver.go:170","message":"Could not get user roles"}
proxy-7cf6ffcfc9-2jgq7 {"level":"info","service":"proxy","proto":"HTTP/1.1","request-id":"5daa8746-6bf7-4e93-b7c0-88ee2696cb0c","traceid":"8e9a4486e588e2f371f904db0f838957","remote-addr":"77.21.27.45","method":"GET","status":500,"path":"/ocs/v2.php/apps/notifications/api/v1/notifications/sse","duration":20106.608758,"bytes":0,"time":"2024-09-12T13:13:01Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/accesslog.go:34","message":"access-log"}
proxy-7cf6ffcfc9-2jgq7 {"level":"info","service":"proxy","proto":"HTTP/1.1","request-id":"6aeec16abe8539f74f7e7645b485cbbd","traceid":"9ae67f328cad2873f3509381a4ec1e3e","remote-addr":"173.44.117.68","method":"GET","status":401,"path":"/graph/v1beta1/drives/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27/items/55b65915-d282-4be3-88f7-a07a3d092c01$107d95ef-65f2-44a4-b124-5fd477153d27!f15571bf-de16-424e-90d9-559bfa923826/invite\\","duration":0.104635,"bytes":0,"time":"2024-09-12T13:15:16Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/accesslog.go:34","message":"access-log"}
proxy-7cf6ffcfc9-2jgq7 {"level":"error","service":"proxy","error":"{\"id\":\"go.micro.client\",\"code\":503,\"detail\":\"connection error: desc = \\\"transport: Error while dialing: dial tcp 10.96.8.35:9191: i/o timeout\\\"\",\"status\":\"Service Unavailable\"}","time":"2024-09-12T13:15:33Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/userroles/oidcroles.go:211","message":"Failed to list all roles"}
proxy-7cf6ffcfc9-2jgq7 {"level":"error","service":"proxy","request-id":"5daa8746-6bf7-4e93-b7c0-88ee2696cb0c","userid":"afabfc40-6cdc-4444-8c15-da201b23d46e","error":"{\"id\":\"go.micro.client\",\"code\":503,\"detail\":\"connection error: desc = \\\"transport: Error while dialing: dial tcp 10.96.8.35:9191: i/o timeout\\\"\",\"status\":\"Service Unavailable\"}","time":"2024-09-12T13:15:33Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/userroles/oidcroles.go:78","message":"Error mapping role names to role ids"}
proxy-7cf6ffcfc9-2jgq7 {"level":"error","service":"proxy","error":"{\"id\":\"go.micro.client\",\"code\":503,\"detail\":\"connection error: desc = \\\"transport: Error while dialing: dial tcp 10.96.8.35:9191: i/o timeout\\\"\",\"status\":\"Service Unavailable\"}","time":"2024-09-12T13:15:33Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/account_resolver.go:170","message":"Could not get user roles"}
proxy-7cf6ffcfc9-2jgq7 {"level":"info","service":"proxy","proto":"HTTP/1.1","request-id":"5daa8746-6bf7-4e93-b7c0-88ee2696cb0c","traceid":"40ac88d22ac9e0f21d48e5ea54823645","remote-addr":"77.21.27.45","method":"GET","status":500,"path":"/ocs/v2.php/apps/notifications/api/v1/notifications/sse","duration":20108.687826,"bytes":0,"time":"2024-09-12T13:15:33Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/accesslog.go:34","message":"access-log"}

butonic avatar Sep 12 '24 13:09 butonic

hm so this is a case where we try to make a connection too early.

the gateway sometimes cannot connect to the storage-users service:

"[pod/gateway-754d8b486c-x7p6z/gateway] {\"level\":\"error\",\"service\":\"gateway\",\"pkg\":\"rgrpc\",\"traceid\":\"8d57bf5baf9b67274185cb25ee09a891\",\"user-agent\":\"k6/0.45.0 (https://k6.io/)\",\"from\":\"tcp://10.96.17.28:59456\",\"uri\":\"/cs3.gateway.v1beta1.GatewayAPI/ListContainer\",\"start\":\"12/Sep/2024:16:34:39 +0000\",\"end\":\"12/Sep/2024:16:34:39 +0000\",\"time_ns\":2044278,\"code\":\"Unavailable\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/log/log.go:69\",\"message\":\"rpc error: code = Unavailable desc = last connection error: connection error: desc = \\\"transport: Error while dialing: dial tcp 10.96.18.127:9157: connect: connection refused\\\"\"}"

ah that correlates with the startup time of the pod

"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"debug\",\"time\":\"2024-09-12T16:34:39Z\",\"message\":\"scanning for stale .processing dirs\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rhttp\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rhttp/rhttp.go:220\",\"message\":\"http service enabled: dataprovider@/data\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rhttp\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rhttp/rhttp.go:193\",\"message\":\"http middleware enabled: requestid\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rhttp\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rhttp/rhttp.go:273\",\"message\":\"chaining http middleware requestid with priority  100\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rhttp\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rhttp/rhttp.go:278\",\"message\":\"unprotected URL: /data/tus\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rhttp\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rhttp/rhttp.go:124\",\"message\":\"http server listening at http://0.0.0.0:9158 '' ''\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rgrpc\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rgrpc/rgrpc.go:228\",\"message\":\"rgrpc: grpc service enabled: storageprovider\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rgrpc\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rgrpc/rgrpc.go:343\",\"message\":\"rgrpc: chaining grpc unary interceptor prometheus with priority 100\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rgrpc\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rgrpc/rgrpc.go:343\",\"message\":\"rgrpc: chaining grpc unary interceptor eventsmiddleware with priority 200\"}"
"[pod/storageusers-584cc8758d-qwrx9/storageusers] {\"level\":\"info\",\"service\":\"storage-users\",\"pkg\":\"rgrpc\",\"time\":\"2024-09-12T16:34:39Z\",\"line\":\"github.com/cs3org/reva/[email protected]/pkg/rgrpc/rgrpc.go:192\",\"message\":\"grpc server listening at tcp:0.0.0.0:9157\"}"

we can implement the /readyz endpoint following https://blog.gopheracademy.com/advent-2017/kubernetes-ready-service/#step-9-health-checks

func readyz(isReady *atomic.Value) http.HandlerFunc {
    return func(w http.ResponseWriter, _ *http.Request) {
        if isReady == nil || !isReady.Load().(bool) {
            http.Error(w, http.StatusText(http.StatusServiceUnavailable), http.StatusServiceUnavailable)
            return
        }
        w.WriteHeader(http.StatusOK)
    }
}

the atomic.Value needs to be set when the reva runtime has initialized all services. the decomposedfs uses the Init() call to set up the filesystem and check migrations. that might take a moment because it does some file locking.

we need to add an option to the reva runtime that we can use to change the atomic.Value. An option with a channel? Or a callback? Then we can send a signal / make a callback when reva has finished initializing the services for grpc in pkg/rgrpc/rgrpc.go:

// Start starts the server.
func (s *Server) Start(ln net.Listener) error {
	if err := s.registerServices(); err != nil {
		err = errors.Wrap(err, "unable to register services")
		return err
	}

	s.listener = ln
	s.log.Info().Msgf("grpc server listening at %s:%s", s.Network(), s.Address())
        // TODO signal or callback that the service is ready
	err := s.s.Serve(s.listener)
	if err != nil {
		err = errors.Wrap(err, "serve failed")
		return err
	}
	return nil
}

The rhttp package also needs to signal when it has initialized all handlers. Then the reva runtime start() function needs to aggregate the http and grpc service initialization to finally signal that everything is running.

Then the service registration also needs to be deferred until the services are ready in the server.go code

			grpcSvc := registry.BuildGRPCService(cfg.GRPC.Namespace+"."+cfg.Service.Name, cfg.GRPC.Protocol, cfg.GRPC.Addr, version.GetString())
			if err := registry.RegisterService(ctx, grpcSvc, logger); err != nil {
				logger.Fatal().Err(err).Msg("failed to register the grpc service")
			}

This can be done for all services that use the reva runtime. Starting with the storage-users.

How we properly determine when the non-reva services are ready is a subsequent task, but it should work in a similar way: find a way to get notified when all handlers have been initialized, then change the atomic value.

The readyz handler could be a reusable ocis-pkg package.

So:

  • [ ] make /readyz reflect the atomic value
  • [ ] wait registering the service with go micro until the service is ready

butonic avatar Sep 13 '24 09:09 butonic

I have this 99x in the log:

"[pod/storagesystem-c9d9b459d-fvjb4/storagesystem] {\"level\":\"warn\",\"service\":\"storage-system\",\"pkg\":\"rgrpc\",\"traceid\":\"8c8e5026ceff9a43febe0825e19d5638\",\"error\":\"error parsing token: Token used before issued\",\"time\":\"2024-09-12T19:40:16Z\",\"line\":\"github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/auth/auth.go:150\",\"message\":\"access token is invalid\"}"

WTF ... clock not in sync? How do we handle this. Even retrying won't help.

butonic avatar Sep 13 '24 09:09 butonic

I have this 99x in the log:

"[pod/storagesystem-c9d9b459d-fvjb4/storagesystem] {\"level\":\"warn\",\"service\":\"storage-system\",\"pkg\":\"rgrpc\",\"traceid\":\"8c8e5026ceff9a43febe0825e19d5638\",\"error\":\"error parsing token: Token used before issued\",\"time\":\"2024-09-12T19:40:16Z\",\"line\":\"github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/auth/auth.go:150\",\"message\":\"access token is invalid\"}"

WTF ... clock not in sync? How do we handle this. Even retrying won't help.

What's the time difference though?

wkloucek avatar Sep 13 '24 09:09 wkloucek

and a panic in [pod/gateway-6fddd795b-b8njq/gateway]:

{
    "level": "error",
    "service": "gateway",
    "pkg": "rgrpc",
    "traceid": "51fc70040720c8b5e2e865285d7af23a",
    "time": "2024-09-12T21:31:10Z",
    "line": "github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/recovery/recovery.go:50",
    "message": "user not found in context; stack: goroutine 1381306 [running]:
runtime/debug.Stack()
	runtime/debug/stack.go:26 +0x5e
github.com/cs3org/reva/v2/internal/grpc/interceptors/recovery.recoveryFunc({0x4d735d8, 0xc0025acb70}, {0x45f01e0, 0x4d414b0})
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/recovery/recovery.go:50 +0x55
github.com/grpc-ecosystem/go-grpc-middleware/recovery.recoverFrom({0x4d735d8?, 0xc0025acb70?}, {0x45f01e0?, 0x4d414b0?}, 0xc00368cad8?)
	github.com/grpc-ecosystem/[email protected]/recovery/interceptors.go:61 +0x30
github.com/grpc-ecosystem/go-grpc-middleware/recovery.UnaryServerInterceptor.func1.1()
	github.com/grpc-ecosystem/[email protected]/recovery/interceptors.go:29 +0x75
panic({0x45f01e0?, 0x4d414b0?})
	runtime/panic.go:785 +0x132
github.com/cs3org/reva/v2/pkg/ctx.ContextMustGetUser({0x4d735d8?, 0xc0025acb70?})
	github.com/cs3org/reva/[email protected]/pkg/ctx/userctx.go:48 +0x58
github.com/cs3org/reva/v2/internal/grpc/services/gateway.(*cachedRegistryClient).ListStorageProviders(0xc0021a44a0, {0x4d735d8, 0xc0025acb70}, 0xc0025acd20, {0x0, 0x0, 0x0})
	github.com/cs3org/reva/[email protected]/internal/grpc/services/gateway/storageprovidercache.go:47 +0x94
github.com/cs3org/reva/v2/internal/grpc/services/gateway.(*svc).ListStorageSpaces(0xc001271110, {0x4d735d8, 0xc0025acb70}, 0xc000df7260)
	github.com/cs3org/reva/[email protected]/internal/grpc/services/gateway/storageprovider.go:287 +0xb16
github.com/cs3org/go-cs3apis/cs3/gateway/v1beta1._GatewayAPI_ListStorageSpaces_Handler.func1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?})
	github.com/cs3org/[email protected]/cs3/gateway/v1beta1/gateway_api_grpc.pb.go:2597 +0xce
github.com/cs3org/reva/v2/internal/grpc/interceptors/prometheus.interceptorFromConfig.func1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?}, 0x45f0ca0?, 0x4326140?)
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/prometheus/prometheus.go:72 +0x3d
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8.1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?})
	github.com/grpc-ecosystem/[email protected]/chain.go:48 +0x45
github.com/cs3org/reva/v2/internal/grpc/interceptors/auth.NewUnary.func2({0x4d735d8, 0xc0025acb70}, {0x4af8f80, 0xc000df7260}, 0xc0021a43a0, 0xc0031daa80)
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/auth/auth.go:137 +0x634
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8.1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?})
	github.com/grpc-ecosystem/[email protected]/chain.go:48 +0x45
github.com/grpc-ecosystem/go-grpc-middleware/recovery.UnaryServerInterceptor.func1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?}, 0xc0017e9370?, 0x4f050f?)
	github.com/grpc-ecosystem/[email protected]/recovery/interceptors.go:33 +0x9e
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8.1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?})
	github.com/grpc-ecosystem/[email protected]/chain.go:48 +0x45
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.NewUnary.func6({0x4d735d8, 0xc0025acb70}, {0x4af8f80, 0xc000df7260}, 0xc0021a43a0, 0xc0031dab00)
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/log/log.go:39 +0x97
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8.1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?})
	github.com/grpc-ecosystem/[email protected]/chain.go:48 +0x45
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.NewUnary.func5({0x4d735d8, 0xc0025acb70}, {0x4af8f80, 0xc000df7260}, 0x45f01e0?, 0xc0031dab40)
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/useragent/useragent.go:38 +0xf0
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8.1({0x4d735d8?, 0xc0025acb70?}, {0x4af8f80?, 0xc000df7260?})
	github.com/grpc-ecosystem/[email protected]/chain.go:48 +0x45
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.NewUnary.func4({0x4d735d8, 0xc0025acae0}, {0x4af8f80, 0xc000df7260}, 0x4d61a38?, 0xc0031dab80)
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/token/token.go:52 +0x256
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8.1({0x4d735d8?, 0xc0025acae0?}, {0x4af8f80?, 0xc000df7260?})
	github.com/grpc-ecosystem/[email protected]/chain.go:48 +0x45
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.NewUnary.func3({0x4d735d8, 0xc0025aca20}, {0x4af8f80, 0xc000df7260}, 0x0?, 0xc0031dabc0)
	github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/appctx/appctx.go:51 +0x50a
github.com/cs3org/reva/v2/pkg/rgrpc.(*Server).getInterceptors.ChainUnaryServer.func8({0x4d735d8, 0xc0025aca20}, {0x4af8f80, 0xc000df7260}, 0xc0021a43a0, 0x80?)
	github.com/grpc-ecosystem/[email protected]/chain.go:53 +0x123
github.com/cs3org/go-cs3apis/cs3/gateway/v1beta1._GatewayAPI_ListStorageSpaces_Handler({0x4d0cb00, 0xc001271110}, {0x4d735d8, 0xc0025aca20}, 0xc002eae400, 0xc0014c23c0)
	github.com/cs3org/[email protected]/cs3/gateway/v1beta1/gateway_api_grpc.pb.go:2599 +0x143
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0012f8e00, {0x4d735d8, 0xc0025ac930}, {0x4d89a40, 0xc00188f680}, 0xc002862000, 0xc0014c25a0, 0x6a54490, 0x0)
	google.golang.org/[email protected]/server.go:1379 +0xdf8
google.golang.org/grpc.(*Server).handleStream(0xc0012f8e00, {0x4d89a40, 0xc00188f680}, 0xc002862000)
	google.golang.org/[email protected]/server.go:1790 +0xe8b
google.golang.org/grpc.(*Server).serveStreams.func2.1()
	google.golang.org/[email protected]/server.go:1029 +0x7f
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 41483
	google.golang.org/[email protected]/server.go:1040 +0x125
"
}
{
    "level": "error",
    "service": "gateway",
    "pkg": "rgrpc",
    "traceid": "51fc70040720c8b5e2e865285d7af23a",
    "user-agent": "grpc-go/1.65.0",
    "from": "tcp://10.96.9.42:52404",
    "uri": "/cs3.gateway.v1beta1.GatewayAPI/ListStorageSpaces",
    "start": "12/Sep/2024:21:31:10 +0000",
    "end": "12/Sep/2024:21:31:10 +0000",
    "time_ns": 705217,
    "code": "Internal",
    "time": "2024-09-12T21:31:10Z",
    "line": "github.com/cs3org/reva/[email protected]/internal/grpc/interceptors/log/log.go:69",
    "message": "rpc error: code = Internal desc = user not found in context"
}

butonic avatar Sep 13 '24 10:09 butonic

cc @kobergj

butonic avatar Oct 04 '24 12:10 butonic

cc @fschade

butonic avatar Oct 09 '24 15:10 butonic

hm, auto-accepting shares in the frontend is also too slow to consume the events:

diff --git a/services/frontend/pkg/command/events.go b/services/frontend/pkg/command/events.go
index 2487165533..954fab7af8 100644
--- a/services/frontend/pkg/command/events.go
+++ b/services/frontend/pkg/command/events.go
@@ -118,24 +118,26 @@ func AutoAcceptShares(ev events.ShareCreated, autoAcceptDefault bool, l log.Logg
        }
 
        for _, uid := range userIDs {
-               if !autoAcceptShares(ctx, uid, autoAcceptDefault, vs) {
-                       continue
-               }
+               go func(uid *user.UserId) {
+                       if !autoAcceptShares(ctx, uid, autoAcceptDefault, vs) {
+                               return
+                       }
 
-               gwc, err := gatewaySelector.Next()
-               if err != nil {
-                       l.Error().Err(err).Msg("cannot get gateway client")
-                       continue
-               }
-               resp, err := gwc.UpdateReceivedShare(ctx, updateShareRequest(ev.ShareID, uid))
-               if err != nil {
-                       l.Error().Err(err).Msg("error sending grpc request")
-                       continue
-               }
+                       gwc, err := gatewaySelector.Next()
+                       if err != nil {
+                               l.Error().Err(err).Msg("cannot get gateway client")
+                               return
+                       }
+                       resp, err := gwc.UpdateReceivedShare(ctx, updateShareRequest(ev.ShareID, uid))
+                       if err != nil {
+                               l.Error().Err(err).Msg("error sending grpc request")
+                               return
+                       }
 
-               if resp.GetStatus().GetCode() != rpc.Code_CODE_OK {
-                       l.Error().Interface("status", resp.GetStatus()).Str("userid", uid.GetOpaqueId()).Msg("unexpected status code while accepting share")
-               }
+                       if resp.GetStatus().GetCode() != rpc.Code_CODE_OK {
+                               l.Error().Interface("status", resp.GetStatus()).Str("userid", uid.GetOpaqueId()).Msg("unexpected status code while accepting share")
+                       }
+               }(uid)
        }
 
 }

the frontend will log tons of errors like this:

┌─────────────────────────────────────────────────────────────────── Logs(school-91001/frontend:frontend)[tail] ────────────────────────────────────────────────────────────────────┐
│                                                         Autoscroll:Off     FullScreen:Off     Timestamps:Off     Wrap:On                                                          │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"584ec5e4efae632ea344ca9075710d66"},"userid": │
│ "411b9ad0-95da-4b59-9848-671d36bf9e7f","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"e4bda889637ffb871773bdc30bc660ee"},"userid": │
│ "f6e1e884-a4ee-4b7d-a68a-5cffcfb0c614","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f464ebb5144bc1d251decd1fa928b2ce"},"userid": │
│ "fcf77ab0-1140-46ad-82a1-23bd86659955","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"7b3a501b2fb5f3ae15225f35b47b06d3"},"userid": │
│ "ad217814-0571-4ad2-ae7a-00d332552ea2","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"e14d1a00e9d49aa269581f44ec719996"},"userid": │
│ "9ca0356d-c2cd-4c0e-a758-00b68b315e11","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck nats: slow consumer, messages dropped on connection [41261574] for subscription on "main-queue"                                                         │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f734c7466f4da655a817f34256101f2d"},"userid": │
│ "db96dfd9-b4db-4279-8e70-791ea5490495","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"7ac371ad25d1bc0b2f49570e797c5755"},"userid": │
│ "790c4571-bd01-4d83-b4d9-04330e29e541","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-xzfck {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"f3163cbf764e7d41a847cb9d7eedb32f"},"userid": │
│ "c8eb2f77-b90e-4ccd-a827-66d618da2250","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │
│ frontend-7c8969fbbb-bfkbr {"level":"error","service":"frontend","status":{"code":6,"message":"error getting received share","trace":"bcb0b5c8aa758427410df8af690a7dc6"},"userid": │
│ "5aa43c8d-502c-421e-9191-355b8bd632a4","time":"2024-09-12T11:14:45Z","line":"github.com/owncloud/ocis/v2/services/frontend/pkg/command/events.go:137","message":"unexpected statu │
│ s code while accepting share"}                                                                                                                                                    │

and the sharing service logs errors like this:

┌──────────────────────────────────────────────────────────── Logs(school-91001/sharing-84944f47c6-hk65q:sharing)[tail] ────────────────────────────────────────────────────────────┐
│                                                         Autoscroll:Off     FullScreen:Off     Timestamps:Off     Wrap:On                                                          │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"8305329fbea49382070acc1a411301d3","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"30e1cf60fd15db313a62bf47a40ea4e3","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"fc7a5171c3038b65d8698d5a7154f347","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"ad3623166893f7c5b208972d42776b05","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"4b0fbbfd0295fd8ff5cf16b3fc744f7f","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:51Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"89216a5ff7316070f292a0572228f792","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"30dd83935491d3b4a11f6e5dfbb06228","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"88e88942394de3fbd72809acd11528cc","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:f23f736d-17a0-4ab9-a78b-ccf264b498af\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│ {"level":"error","service":"sharing","pkg":"rgrpc","traceid":"20befebb192e0e71c557b8e1ccb89a98","error":"error: not found: opaque_id:\"55b65915-d282-4be3-88f7-a07a3d092c01:b208f │
│ 68f-40e3-4333-bcd3-7711302d80d8:439a51c6-e891-4a57-827a-3355e7bd865d\"","time":"2024-09-12T11:11:52Z","line":"github.com/cs3org/reva/[email protected]/internal/grpc/services/usersharep │
│ rovider/usershareprovider.go:488","message":"error getting received share"}                                                                                                       │
│                                                                                                                                                                                   │

the search service also cannot keep up with indexing files, but since it does not have to multiply events out to every user with access to the shared space, it settles down far sooner.

solved...

fschade avatar Oct 17 '24 13:10 fschade

solved...

can confirm 😍 no errors in nightly k6 test https://drone.owncloud.com/owncloud/ocis/40169/58

ScharfViktor avatar Oct 18 '24 08:10 ScharfViktor

@fschade hm, the frontend is still not fast enough, reopen?

butonic avatar Nov 05 '24 16:11 butonic

concurrent frontend autoaccept in https://github.com/owncloud/ocis/pull/10476

butonic avatar Nov 05 '24 16:11 butonic