Flaky changing quota requests - fails with 500 error
Builds:
- https://drone.owncloud.com/owncloud/ocis/41199/38/6
- https://drone.owncloud.com/owncloud/ocis/41198/22/5
- https://drone.owncloud.com/owncloud/ocis/41180/54/5
- https://drone.owncloud.com/owncloud/ocis/41185/38/6
And user "Admin" has changed the quota of the personal space of "Alice Hansen" space to "1" # SpacesContext::userHasChangedTheQuotaOfTheSpaceTo()
Expected response status code should be 200
Failed asserting that 500 matches expected 200.
==> REQUEST
PATCH /graph/v1.0/drives/28087b3e-b3a7-44a4-850f-082544e6f8b1$ca3ed4f6-c7ae-4ada-a482-95c459958ac3
X-Request-ID: coreApiShareOperationsToShares2/uploadToShare.feature:238-231
==> REQ BODY
{"quota":{"total":1}}
<== RESPONSE
500 Internal Server Error
X-Xss-Protection: 1; mode=block
<== RES BODY
{"error":{"code":"generalException","innererror":{"date":"2024-11-19T05:15:58Z","request-id":"coreApiShareOperationsToShares2/uploadToShare.feature:238-231"},"message":"grpc error"}}
Scenarios:
apiSpacesShares/shareOperations.feature:340
coreApiShareOperationsToShares2/uploadToShare.feature:218
coreApiShareOperationsToShares2/uploadToShare.feature:249
ocis log:
{"level":"error","service":"storage-users","pkg":"rgrpc","traceid":"2ce5a123ee3e21770ab2587d496aee71","error":"open /root/.ocis/storage/users/spaces/5a/3f0b56-cd9e-4575-ab9d-ec57268b6e6c/nodes/5a/3f/0b/56/-cd9e-4575-ab9d-ec57268b6e6c.mlock: no such file or directory","req":{"storage_space":{"id":{"opaque_id":"1e0ee64a-e36c-4a09-9138-0004fb359bce$5a3f0b56-cd9e-4575-ab9d-ec57268b6e6c"},"root":{"storage_id":"1e0ee64a-e36c-4a09-9138-0004fb359bce","opaque_id":"5a3f0b56-cd9e-4575-ab9d-ec57268b6e6c","space_id":"5a3f0b56-cd9e-4575-ab9d-ec57268b6e6c"},"quota":{"quota_max_bytes":1}}},"time":"2024-11-19T11:03:10Z","message":"failed to update storage space"}
{"level":"error","service":"storage-users","pkg":"rgrpc","traceid":"2ce5a123ee3e21770ab2587d496aee71","user-agent":"grpc-go/1.68.0","from":"tcp://127.0.0.1:46852","uri":"/cs3.storage.provider.v1beta1.SpacesAPI/UpdateStorageSpace","start":"19/Nov/2024:11:03:10 +0000","end":"19/Nov/2024:11:03:10 +0000","time_ns":1141739,"code":"Unknown","time":"2024-11-19T11:03:10Z","message":"open /root/.ocis/storage/users/spaces/5a/3f0b56-cd9e-4575-ab9d-ec57268b6e6c/nodes/5a/3f/0b/56/-cd9e-4575-ab9d-ec57268b6e6c.mlock: no such file or directory"}
^ CC @ScharfViktor @micbar @kobergj
This list will be updated frequently. Failed CI builds:
- https://drone.owncloud.com/owncloud/ocis/42271/62/5 (2024-12-31)
- https://drone.owncloud.com/owncloud/ocis/41220/38/6
- https://drone.owncloud.com/owncloud/ocis/41225/22/5
- https://drone.owncloud.com/owncloud/ocis/41263/39/5
- https://drone.owncloud.com/owncloud/ocis/41272/35/5
- https://drone.owncloud.com/owncloud/ocis/41288/38/5
- https://drone.owncloud.com/owncloud/ocis/41303/38/5
- https://drone.owncloud.com/owncloud/ocis/41417/53/5
- https://drone.owncloud.com/owncloud/ocis/41443/54/5
- https://drone.owncloud.com/owncloud/ocis/41445/54/5
- https://drone.owncloud.com/owncloud/ocis/41446/54/5
- https://drone.owncloud.com/owncloud/ocis/41450/54/5
- https://drone.owncloud.com/owncloud/ocis/41492/35/5
- https://drone.owncloud.com/owncloud/ocis/41494/38/6
- https://drone.owncloud.com/owncloud/ocis/41515/35/5
- https://drone.owncloud.com/owncloud/ocis/41594/35/5
- https://drone.owncloud.com/owncloud/ocis/41651/50/5
- https://drone.owncloud.com/owncloud/ocis/41651/51/5
- https://drone.owncloud.com/owncloud/ocis/41644/54/5
- https://drone.owncloud.com/owncloud/ocis/41658/53/5
- https://drone.owncloud.com/owncloud/ocis/41662/54/5
- https://drone.owncloud.com/owncloud/ocis/41839/50/5
- https://drone.owncloud.com/owncloud/ocis/41965/54/5
- https://drone.owncloud.com/owncloud/ocis/42080/54/5
- https://drone.owncloud.com/owncloud/ocis/42494/54/5 (2025-01-09)
- https://drone.owncloud.com/owncloud/ocis/42486/54/5
- https://drone.owncloud.com/owncloud/ocis/42479/55/5
- https://drone.owncloud.com/owncloud/ocis/42953/53/5
- https://drone.owncloud.com/owncloud/ocis/42933/55/5
- https://drone.owncloud.com/owncloud/ocis/44189/51/5
We should investigate why we get an 500 error in the DEV team.
This is in the logs of the latest failed run:
"level":"error","service":"storage-users","pkg":"rgrpc","traceid":"c92e9c1513f71fd3755bfe9419e7db15","error":"open /root/.ocis/storage/users/spaces/b4/68c2d3-0bc1-4b41-bfe7-c85db387057d/nodes/b4/68/c2/d3/-0bc1-4b41-bfe7-c85db387057d.mlock: no such file or
directory","req":{"storage_space":{"id":{"opaque_id":"96b0a8ee-ab8d-49e8-87dd-e1c3331c77f7$b468c2d3-0bc1-4b41-bfe7-c85db387057d"},"root":{"storage_id":"96b0a8ee-ab8d-49e8-87dd-e1c3331c77f7","opaque_id":"b468c2d3-0bc1-4b41-bfe7-c85db387057d","space_id":"b468
c2d3-0bc1-4b41-bfe7-c85db387057d"},"quota":{"quota_max_bytes":1}}},"time":"2024-11-27T09:23:36Z","message":"failed to update storage space"}
{"level":"error","service":"storage-users","pkg":"rgrpc","traceid":"c92e9c1513f71fd3755bfe9419e7db15","user-agent":"grpc-go/1.68.0","from":"tcp://127.0.0.1:34124","uri":"/cs3.storage.provider.v1beta1.SpacesAPI/UpdateStorageSpace","start":"27/Nov/2024:09:23:
36 +0000","end":"27/Nov/2024:09:23:36 +0000","time_ns":3131418,"code":"Unknown","time":"2024-11-27T09:23:36Z","message":"open /root/.ocis/storage/users/spaces/b4/68c2d3-0bc1-4b41-bfe7-c85db387057d/nodes/b4/68/c2/d3/-0bc1-4b41-bfe7-c85db387057d.mlock: no suc
h file or directory"}
The lock file disappeared?
The lock file disappeared?
More likely: the parent folder doesn't exist. AFAICS the lock file is always opened with O_CREAT
I created a stripped down ci config for reproducing this issue (https://github.com/owncloud/ocis/pull/10678). Of about 10 runs in that configuration only a single one (https://drone.owncloud.com/owncloud/ocis/41458) failed. After increasing the loglevel to debug, I could not reproduce this at all anymore :sob:
Running stuff locally I wasn't able to reproduce this either.
I wasn't able to get any useful log out of drone (see https://github.com/owncloud/ocis/pull/10678) up to now. And the issue is also not reproducible outside the CI.
I am unassigning myself for now. Since there are higher prio issues to work on :man_shrugging:
[Nightly-7-0] Nightly failure on changing quota tests
Describe
Build - https://drone.owncloud.com/owncloud/ocis/42165/59/6 https://drone.owncloud.com/owncloud/ocis/42164 Starting from 2024-12-24, getquota tests have been failing during nightly runs. The following scenarios are affected:
Scenario Outline: retrieving folder quota when no quota is set # /drone/src/tests/acceptance/features/coreApiWebdavProperties/getQuota.feature:12
Given using <dav-path-version> DAV path # FeatureContext::usingOldOrNewOrSpacesDavPath()
When user "Admin" changes the quota of the "Alice Hansen" space to "0" # SpacesContext::updateSpaceQuota()
Then the HTTP status code should be "200" # FeatureContext::thenTheHTTPStatusCodeShouldBe()
And as user "Alice" folder "/" should contain a property "d:quota-available-bytes" with value "0" # WebDavPropertiesContext::asUserFolderShouldContainAPropertyWithValueOrWithValue()
Examples:
| dav-path-version |
| old |
Failed step: Then the HTTP status code should be "200"
HTTP status code 500 is not the expected value 200
Failed asserting that 500 matches expected '200'.
| new |
Failed step: Then the HTTP status code should be "200"
HTTP status code 500 is not the expected value 200
Failed asserting that 500 matches expected '200'.
| spaces |
@smokeTest
Scenario Outline: retrieving folder quota when quota is set # /drone/src/tests/acceptance/features/coreApiWebdavProperties/getQuota.feature:24
Given using <dav-path-version> DAV path # FeatureContext::usingOldOrNewOrSpacesDavPath()
When user "Admin" changes the quota of the "Alice Hansen" space to "10000" # SpacesContext::updateSpaceQuota()
Then the HTTP status code should be "200" # FeatureContext::thenTheHTTPStatusCodeShouldBe()
And as user "Alice" folder "/" should contain a property "d:quota-available-bytes" with value "10000" # WebDavPropertiesContext::asUserFolderShouldContainAPropertyWithValueOrWithValue()
Examples:
| dav-path-version |
| old |
Failed step: Then the HTTP status code should be "200"
HTTP status code 500 is not the expected value 200
Failed asserting that 500 matches expected '200'.
| new |
| spaces |
Logs
## Scenario Outline: retrieving folder quota when no quota is set (coreApiWebdavProperties/getQuota.feature:19)
_______________________________________________________________________
==> REQUEST
PATCH /graph/v1.0/drives/d086f596-306d-4b6b-9a15-614f8da06e5b$f0e38ca9-efbd-41da-aea7-52494bbbaae7
X-Request-ID: coreApiWebdavProperties/getQuota.feature:19-14
==> REQ BODY
{"quota":{"total":0}}
<== RESPONSE
500 Internal Server Error
X-Xss-Protection: 1; mode=block
<== RES BODY
{"error":{"code":"generalException","innererror":{"date":"2024-12-24T03:48:12Z","request-id":"coreApiWebdavProperties/getQuota.feature:19-14"},"message":"grpc error"}}
### Then the HTTP status code should be "200"
oCIS log:
{"level":"error","service":"storage-users","pkg":"rgrpc","traceid":"7394e6afa4f706b829a842009cd1a2cd","error":"open /root/.ocis/storage/users/spaces/f2/a62d38-7f41-427e-8c2e-32b119fa1ae4/nodes/f2/a6/2d/38/-7f41-427e-8c2e-32b119fa1ae4.mlock: no such file or directory","req":{"storage_space":{"id":{"opaque_id":"684b7b1e-4b27-4cd0-a1da-7529d23957b2$f2a62d38-7f41-427e-8c2e-32b119fa1ae4"},"root":{"storage_id":"684b7b1e-4b27-4cd0-a1da-7529d23957b2","opaque_id":"f2a62d38-7f41-427e-8c2e-32b119fa1ae4","space_id":"f2a62d38-7f41-427e-8c2e-32b119fa1ae4"},"quota":{"quota_max_bytes":10000}}},"time":"2024-12-24T00:51:19Z","message":"failed to update storage space"}
{"level":"error","service":"storage-users","pkg":"rgrpc","traceid":"7394e6afa4f706b829a842009cd1a2cd","user-agent":"grpc-go/1.68.0","from":"tcp://127.0.0.1:36716","uri":"/cs3.storage.provider.v1beta1.SpacesAPI/UpdateStorageSpace","start":"24/Dec/2024:00:51:19 +0000","end":"24/Dec/2024:00:51:19 +0000","time_ns":1327822,"code":"Unknown","time":"2024-12-24T00:51:19Z","message":"open /root/.ocis/storage/users/spaces/f2/a62d38-7f41-427e-8c2e-32b119fa1ae4/nodes/f2/a6/2d/38/-7f41-427e-8c2e-32b119fa1ae4.mlock: no such file or directory"}
Setup
Please describe how you started the server and provide a list of relevant environment variables or configuration files.
OCIS_XXX=7.0.0
Couldn't reproduce the flakiness locally. It only occurs in CI and also doesn't generate any useful logs in CI. Currently, I'm unassigning myself form this issue.
List of recent build fail:
- https://drone.owncloud.com/owncloud/ocis/43448/51/5
- https://drone.owncloud.com/owncloud/ocis/43507/56/5
Interesting find:
The server sends 500 because there are two personal drives with the same name (alice). And there are some files missing for the old user in the storage/users/spaces folder.
I believe, the old one is from the previous test scenario.
Ednpoint: /graph/v1.0/drives
{
"value": [
{
"driveAlias": "personal/alice",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$c3326e64-7aa8-4d59-88ed-0da631963402",
"lastModifiedDateTime": "2025-02-14T06:39:05.440059135Z",
"name": "Alice Hansen",
"owner": {
"user": {
"displayName": "",
"id": "c3326e64-7aa8-4d59-88ed-0da631963402"
}
},
"root": {
"deleted": {
"state": "trashed"
},
"eTag": "\"ee10f95ca197e98906aad303ac00d5b9\"",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$c3326e64-7aa8-4d59-88ed-0da631963402",
"webDavUrl": "https://ocis-server:9200/dav/spaces/fe49f560-62f3-41fd-8364-fbcfee25e7af$c3326e64-7aa8-4d59-88ed-0da631963402"
},
},
{
"driveAlias": "personal/alice",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$3af10eab-f43d-42f3-92f0-d6b2d30262c5",
"lastModifiedDateTime": "2025-02-14T06:40:04.180433623Z",
"name": "Alice Hansen",
"owner": {
"user": {
"displayName": "",
"id": "3af10eab-f43d-42f3-92f0-d6b2d30262c5"
}
},
"root": {
"eTag": "\"75106885b3161a9c779c714f97d60bc4\"",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$3af10eab-f43d-42f3-92f0-d6b2d30262c5",
"webDavUrl": "https://ocis-server:9200/dav/spaces/fe49f560-62f3-41fd-8364-fbcfee25e7af$3af10eab-f43d-42f3-92f0-d6b2d30262c5"
},
},
{
"driveAlias": "personal/admin",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$447d3d5b-0919-4496-834a-53427cd8890a",
"lastModifiedDateTime": "2025-02-14T06:33:46.891064395Z",
"name": "Admin",
"owner": {
"user": {
"displayName": "",
"id": "447d3d5b-0919-4496-834a-53427cd8890a"
}
}
}
]
}
storage/users/spaces/**:
# admin
-0919-4496-834a-53427cd8890a
-0919-4496-834a-53427cd8890a.mlock
-0919-4496-834a-53427cd8890a.mpk
# new alice
-f43d-42f3-92f0-d6b2d30262c5
-f43d-42f3-92f0-d6b2d30262c5.mlock
-f43d-42f3-92f0-d6b2d30262c5.mpk
# old alice
-7aa8-4d59-88ed-0da631963402
# missing mlock file
-7aa8-4d59-88ed-0da631963402.mpk
This is fixable from the test code. We are using the wrong spaceId in the quota update request. But I don't know if this (showing broken/supposed to be deleted space in the drives list) needs fix from the server side
But I don't know if this (showing broken/supposed to be deleted space in the drives list) needs fix from the server side
During the test execution, sometimes the deleted user space is also shown in the /graph/v1.0/drives (See https://github.com/owncloud/ocis/issues/10603#issuecomment-2658945047) list and the supposed to be deleted space doesn't have .mlock file in the storage/users/spaces/... path.
If this is a fixable issue in the server
CC @2403905
@saw-jan do we know why the deletion of the user fails in the first place? I think the only chance we have is to avoid that original issue. I'm not sure how we can recover from a failed user deletion automatically.
@saw-jan do we know why the deletion of the user fails in the first place? I think the only chance we have is to avoid that original issue. I'm not sure how we can recover from a failed user deletion automatically.
From the logs, I can say that the user deletion was successful - at least the server sent successful response
{
"value": [
{
"driveAlias": "personal/alice",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$c3326e64-7aa8-4d59-88ed-0da631963402",
"lastModifiedDateTime": "2025-02-14T06:39:05.440059135Z",
"name": "Alice Hansen",
"root": {
"deleted": {
"state": "trashed" 👈
},
"eTag": "\"ee10f95ca197e98906aad303ac00d5b9\"",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$c3326e64-7aa8-4d59-88ed-0da631963402",
"webDavUrl": "https://ocis-server:9200/dav/spaces/fe49f560-62f3-41fd-8364-fbcfee25e7af$c3326e64-7aa8-4d59-88ed-0da631963402"
},
},
{
"driveAlias": "personal/alice",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$3af10eab-f43d-42f3-92f0-d6b2d30262c5",
"lastModifiedDateTime": "2025-02-14T06:40:04.180433623Z",
"name": "Alice Hansen",
"root": {
"eTag": "\"75106885b3161a9c779c714f97d60bc4\"",
"id": "fe49f560-62f3-41fd-8364-fbcfee25e7af$3af10eab-f43d-42f3-92f0-d6b2d30262c5",
"webDavUrl": "https://ocis-server:9200/dav/spaces/fe49f560-62f3-41fd-8364-fbcfee25e7af$3af10eab-f43d-42f3-92f0-d6b2d30262c5"
},
}
...
]
}
The old (deleted) account has this property:
"deleted": {
"state": "trashed"
},
^ @kobergj let us know if this is fixable or correct by design in the server side. We are hardening the API tests in https://github.com/owncloud/ocis/pull/11000
We need to find out why the delete of the personal space fails. But the logging in this part of the code is insufficient. I created a PR that adds some logging. We should be able to tackle the error once this is merged and it reoccurs. https://github.com/owncloud/ocis/pull/11037
We need to find out why the delete of the personal space fails. But the logging in this part of the code is insufficient. I created a PR that adds some logging. We should be able to tackle the error once this is merged and it reoccurs. #11037
OKay, thanks. We will wait for https://github.com/owncloud/ocis/pull/11037 to be merged and for test failure to reoccur before merging our test fix PR https://github.com/owncloud/ocis/pull/11000
@kobergj I made a PR from your PR and ran only the problematic test scenarios. here are the logs when this issue occurs: https://drone.owncloud.com/owncloud/ocis/43669/5/3, let's hope there're some useful info
PS: I have attached the following logs in case the drone logs disappear