Treesize propagation seems to be broken
This is on Kubernetes with decomposeds3.
While testing the Kubernetes deployment I noticed that deleting my test folder of images produced this log entry:
{
  "level": "error",
  "service": "storage-users",
  "host.name": "opencloud-api-64cc5fc565-hs99s",
  "pkg": "rgrpc",
  "traceid": "a2ed8a59b80ef96f125ebfae8675467e",
  "method": "sync.Propagate",
  "spaceid": "45a55527-b577-424a-9683-6e563a99e1dc",
  "nodeid": "868ebeb6-0e4b-4f08-8055-21ff10314499",
  "sizeDiff": -136331874,
  "spaceid": "45a55527-b577-424a-9683-6e563a99e1dc",
  "nodeid": "45a55527-b577-424a-9683-6e563a99e1dc",
  "treeSize": 132706773,
  "sizeDiff": -136331874,
  "time": "2025-11-06T09:59:23Z",
  "line": "github.com/opencloud-eu/reva/[email protected]/pkg/storage/pkg/decomposedfs/tree/propagator/sync.go:181",
  "message": "Error when updating treesize of parent node. Updated treesize < 0. Reestting to 0"
}
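The arithmetic in that entry shows how far off the parent already is: its recorded treeSize of 132,706,773 bytes is smaller than the 136,331,874 bytes being subtracted, so the update would come out at 132,706,773 - 136,331,874 = -3,625,101, and the propagator resets the result to 0.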
So I uploaded the folder again and deleted it, reproducing the same error log.
Then I noticed a mismatch in the size the folder shows in the file listing after the upload finished:
There are no hidden files in the space, and while the listing shows 111.7 MB, the folder actually contains 156,059,341 bytes.
After deleting the folder the space is empty, but the quota says 42 MB are used 🤔:
The treesize is obviously off; the 42,020,494 bytes stored in user.oc.treesize below match the 42 MB the quota reports:
~/storage/users/spaces/45/a55527-b577-424a-9683-6e563a99e1dc/nodes/45/a5/55/27$ strings ./-b577-424a-9683-6e563a99e1dc.mpk
user.oc.tmtime
2025-11-06T10:17:49.453225698Z
user.oc.owner.idp
0https://keycloak.opencloud.test/realms/openCloud
user.oc.owner.id
$45a55527-b577-424a-9683-6e563a99e1dc
user.oc.owner.type
primary
user.oc.name
Admin Admin
user.oc.tmp.etag
user.oc.propagation
user.oc.space.id
$45a55527-b577-424a-9683-6e563a99e1dc
user.oc.treesize
42020494
user.oc.id
$45a55527-b577-424a-9683-6e563a99e1dc
user.oc.space.alias
-personal/10000000-0000-0000-0000-000000000000
user.oc.space.name
Admin Admin
user.oc.space.type
personal
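For inspection without strings, the attributes can also be decoded directly. A minimal sketch, assuming the .mpk file is a MessagePack-encoded map of attribute names to byte values (which matches the dump above); the decoder library used here is an assumption, not necessarily the one reva uses:

```go
// Hypothetical helper: dump all attributes of a decomposedfs .mpk file,
// assuming it is a MessagePack-encoded map[string][]byte.
package main

import (
	"fmt"
	"os"

	"github.com/shamaton/msgpack/v2"
)

func main() {
	raw, err := os.ReadFile(os.Args[1]) // path to the .mpk file
	if err != nil {
		panic(err)
	}
	attrs := map[string][]byte{}
	if err := msgpack.Unmarshal(raw, &attrs); err != nil {
		panic(err)
	}
	for k, v := range attrs {
		fmt.Printf("%s = %q\n", k, v) // e.g. user.oc.treesize = "42020494"
	}
}
```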
Using a new space:
We start 'empty':
Finished upload:
Refresh page:
Move to trash:
OK, web could not update the folder size right away, but refreshing fixes this:
Upload again ...
Now we are at 139 MB... delete and upload again:
118.9 MB ?!?!
Ah, that would explain how we can run into a negative treesize... well, maybe...
In any case, all files were uploaded correctly; just the size propagation was off.
The number of pods varied between 5 and 10 due to load.
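With several pods propagating size changes against the same metadata, a lost update on the treesize read-modify-write would produce exactly this kind of drift. A hypothetical sketch to illustrate the mechanism (the byte counts are taken from this report; this is not actual reva code):

```go
// Hypothetical illustration (not reva code) of a lost update on the
// treesize read-modify-write.
package main

import "fmt"

func main() {
	treesize := int64(156059341) // folder fully uploaded

	// Two propagations run concurrently and read the same starting value.
	readA, readB := treesize, treesize

	treesize = readA - 136331874 // writer A subtracts its diff
	treesize = readB - 19727467  // writer B overwrites A's update

	// 136331874 bytes of deletions were lost; a later propagation that
	// subtracts them again drives the stored treesize below zero.
	fmt.Println(treesize) // 136331874 instead of 0
}
```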
The underlying fs is a rwx volume with cephfs:
[email protected]=/volumes/csi/csi-vol-f86052c4-a4be-4691-aa1b-c590f0f64a4d/1d535cc2-2b82-4ace-b297-2b8fa5b5b857 on /var/lib/opencloud type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,acl,mon_addr=10.43.107.178:6789)
Testing on localhost:
Delete and upload.
Rinse and repeat.
😢
Related issues
https://github.com/opencloud-eu/opencloud/issues/1157
https://github.com/opencloud-eu/opencloud/issues/1389 - but this is decomposedfs, so no filesystem watcher is involved
I only set "STORAGE_USERS_DRIVER": "decomposed", which means the synchronous propagation provider is being used.
I see double locks on the metadata happening. I added this diff to log something:
diff --git a/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go b/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go
index 6bbdfa55e..d16d226c3 100644
--- a/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go
+++ b/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go
@@ -21,12 +21,14 @@ package metadata
 import (
 	"context"
 	"errors"
+	"fmt"
 	"io"
 	"io/fs"
 	"os"
 	"path/filepath"
 	"strconv"
 	"strings"
+	"sync"
 	"time"
 
 	"github.com/google/renameio/v2"
@@ -40,13 +42,15 @@ import (
 
 // MessagePackBackend persists the attributes in messagepack format inside the file
 type MessagePackBackend struct {
-	metaCache cache.FileMetadataCache
+	metaCache   cache.FileMetadataCache
+	lockCounter *LockCounter
 }
 
 // NewMessagePackBackend returns a new MessagePackBackend instance
 func NewMessagePackBackend(o cache.Config) MessagePackBackend {
 	return MessagePackBackend{
-		metaCache: cache.GetFileMetadataCache(o),
+		metaCache:   cache.GetFileMetadataCache(o),
+		lockCounter: NewLockCounter(),
 	}
 }
 
@@ -318,7 +322,13 @@ func (b MessagePackBackend) Lock(n MetadataNode) (UnlockFunc, error) {
 	if err != nil {
 		return nil, err
 	}
+	b.lockCounter.Increment(metaLockPath)
+	currentLocks := b.lockCounter.Get(metaLockPath)
+	if currentLocks > 1 {
+		fmt.Printf("%d locks detected on node %s, path %s\n", currentLocks, n.GetID(), metaLockPath)
+	}
 	return func() error {
+		b.lockCounter.Decrement(metaLockPath)
 		err := mlock.Close()
 		if err != nil {
 			return err
@@ -330,3 +340,37 @@ func (b MessagePackBackend) Lock(n MetadataNode) (UnlockFunc, error) {
 func (b MessagePackBackend) cacheKey(n MetadataNode) string {
 	return n.GetSpaceID() + "/" + n.GetID()
 }
+
+// ------------------------------------------
+
+type LockCounter struct {
+	mu     sync.Mutex
+	counts map[string]int
+}
+
+func NewLockCounter() *LockCounter {
+	return &LockCounter{
+		counts: make(map[string]int),
+	}
+}
+
+func (lc *LockCounter) Increment(path string) {
+	lc.mu.Lock()
+	defer lc.mu.Unlock()
+	lc.counts[path]++
+}
+
+func (lc *LockCounter) Decrement(path string) {
+	lc.mu.Lock()
+	defer lc.mu.Unlock()
+	lc.counts[path]--
+	if lc.counts[path] <= 0 {
+		delete(lc.counts, path)
+	}
+}
+
+func (lc *LockCounter) Get(path string) int {
+	lc.mu.Lock()
+	defer lc.mu.Unlock()
+	return lc.counts[path]
+}
We are deleting the lockfile after unlocking it, which prevents the filelock from blocking: once the file at the lock path has been removed, the next locker re-creates it, and the lock it acquires is on a different underlying file, so two processes can hold the "same" lock at once :(
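A minimal standalone sketch of that failure mode (the path and program are hypothetical, not reva code): once the lockfile is removed, the next locker re-creates it and takes an exclusive flock on a different inode, so it does not block even though another process still holds the old lock:

```go
// Hypothetical demonstration (not reva code): deleting a lockfile breaks
// flock-based mutual exclusion, because a lock taken on a re-created file
// at the same path locks a different inode.
package main

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// lock creates (if needed) and exclusively flocks the file at path.
func lock(path string) *os.File {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		panic(err)
	}
	if err := unix.Flock(int(f.Fd()), unix.LOCK_EX); err != nil {
		panic(err)
	}
	return f
}

func main() {
	const path = "/tmp/meta.mlock" // hypothetical lock path

	a := lock(path) // holder A acquires the exclusive lock
	os.Remove(path) // the lockfile is deleted while A still holds it

	// Holder B re-creates the file and locks it: different inode, no conflict.
	b := lock(path) // does NOT block, even though A still holds "the" lock

	fmt.Println("both hold an exclusive lock:", a.Name(), b.Name())
}
```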
posixfs with the hybrid metadata backend is also affected: