
Treesize propagation seems to be broken

butonic opened this issue 1 month ago · 4 comments

This is on Kubernetes with decomposeds3.

While testing the Kubernetes deployment I noticed that deleting my test folder of images produced this log entry:

{
    "level": "error",
    "service": "storage-users",
    "host.name": "opencloud-api-64cc5fc565-hs99s",
    "pkg": "rgrpc",
    "traceid": "a2ed8a59b80ef96f125ebfae8675467e",
    "method": "sync.Propagate",
    "spaceid": "45a55527-b577-424a-9683-6e563a99e1dc",
    "nodeid": "868ebeb6-0e4b-4f08-8055-21ff10314499",
    "sizeDiff": -136331874,
    "spaceid": "45a55527-b577-424a-9683-6e563a99e1dc",
    "nodeid": "45a55527-b577-424a-9683-6e563a99e1dc",
    "treeSize": 132706773,
    "sizeDiff": -136331874,
    "time": "2025-11-06T09:59:23Z",
    "line": "github.com/opencloud-eu/reva/[email protected]/pkg/storage/pkg/decomposedfs/tree/propagator/sync.go:181",
    "message": "Error when updating treesize of parent node. Updated treesize < 0. Reestting to 0"
}

So I uploaded the folder again and deleted it, reproducing the same error log.

Then I noticed a difference in the size that the folder would show in the file listing after the upload finished:

Image

There are no hidden files in the space. And while it shows 111.7 MB, the folder actually contains 156,059,341 bytes.

After deleting the folder the space is empty, but the quota says 42 MB are used 🤔:

Image

The treesize is obviously off:

~/storage/users/spaces/45/a55527-b577-424a-9683-6e563a99e1dc/nodes/45/a5/55/27$ strings  ./-b577-424a-9683-6e563a99e1dc.mpk 
user.oc.tmtime
2025-11-06T10:17:49.453225698Z
user.oc.owner.idp
0https://keycloak.opencloud.test/realms/openCloud
user.oc.owner.id
$45a55527-b577-424a-9683-6e563a99e1dc
user.oc.owner.type
primary
user.oc.name
Admin Admin
user.oc.tmp.etag
user.oc.propagation
user.oc.space.id
$45a55527-b577-424a-9683-6e563a99e1dc
user.oc.treesize
42020494
user.oc.id
$45a55527-b577-424a-9683-6e563a99e1dc
user.oc.space.alias
-personal/10000000-0000-0000-0000-000000000000
user.oc.space.name
Admin Admin
user.oc.space.type
personal

Using a new space:

We start 'empty':

Image

finished upload

Image

refresh page

Image

move to trash

Image

OK, so Web did not update the folder size here, but refreshing fixes this:

Image

upload again ...

Image

now we are at 139 MB... delete and upload again

Image

118.9 MB ?!?!

Ah, that would explain how we can run into a negative treesize... well, maybe...

In any case, all files have been uploaded correctly; just the size propagation was off.

The number of pods varied between 5 and 10 due to load.

The underlying fs is a rwx volume with cephfs:

[email protected]=/volumes/csi/csi-vol-f86052c4-a4be-4691-aa1b-c590f0f64a4d/1d535cc2-2b82-4ace-b297-2b8fa5b5b857 on /var/lib/opencloud type ceph (rw,relatime,name=csi-cephfs-node,secret=<hidden>,acl,mon_addr=10.43.107.178:6789)

testing on localhost

Image

delete and upload

Image

rinse and repeat

Image

😢

Related issues

https://github.com/opencloud-eu/opencloud/issues/1157 https://github.com/opencloud-eu/opencloud/issues/1389 - but this is decomposedfs, no watch is involved

butonic · Nov 06 '25 10:11

I only set `"STORAGE_USERS_DRIVER": "decomposed"`, which means it is using the synchronous propagation provider.

butonic · Nov 10 '25 09:11

I see double locks on the metadata happening. I added this diff to log them:

diff --git a/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go b/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go
index 6bbdfa55e..d16d226c3 100644
--- a/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go
+++ b/pkg/storage/pkg/decomposedfs/metadata/messagepack_backend.go
@@ -21,12 +21,14 @@ package metadata
 import (
 	"context"
 	"errors"
+	"fmt"
 	"io"
 	"io/fs"
 	"os"
 	"path/filepath"
 	"strconv"
 	"strings"
+	"sync"
 	"time"
 
 	"github.com/google/renameio/v2"
@@ -40,13 +42,15 @@ import (
 
 // MessagePackBackend persists the attributes in messagepack format inside the file
 type MessagePackBackend struct {
-	metaCache cache.FileMetadataCache
+	metaCache   cache.FileMetadataCache
+	lockCounter *LockCounter
 }
 
 // NewMessagePackBackend returns a new MessagePackBackend instance
 func NewMessagePackBackend(o cache.Config) MessagePackBackend {
 	return MessagePackBackend{
-		metaCache: cache.GetFileMetadataCache(o),
+		metaCache:   cache.GetFileMetadataCache(o),
+		lockCounter: NewLockCounter(),
 	}
 }
 
@@ -318,7 +322,13 @@ func (b MessagePackBackend) Lock(n MetadataNode) (UnlockFunc, error) {
 	if err != nil {
 		return nil, err
 	}
+	b.lockCounter.Increment(metaLockPath)
+	currentLocks := b.lockCounter.Get(metaLockPath)
+	if currentLocks > 1 {
+		fmt.Printf("%d locks detected on node %s, path %s\n", currentLocks, n.GetID(), metaLockPath)
+	}
 	return func() error {
+		b.lockCounter.Decrement(metaLockPath)
 		err := mlock.Close()
 		if err != nil {
 			return err
@@ -330,3 +340,37 @@ func (b MessagePackBackend) Lock(n MetadataNode) (UnlockFunc, error) {
 func (b MessagePackBackend) cacheKey(n MetadataNode) string {
 	return n.GetSpaceID() + "/" + n.GetID()
 }
+
+// ------------------------------------------
+
+type LockCounter struct {
+	mu     sync.Mutex
+	counts map[string]int
+}
+
+func NewLockCounter() *LockCounter {
+	return &LockCounter{
+		counts: make(map[string]int),
+	}
+}
+
+func (lc *LockCounter) Increment(path string) {
+	lc.mu.Lock()
+	defer lc.mu.Unlock()
+	lc.counts[path]++
+}
+
+func (lc *LockCounter) Decrement(path string) {
+	lc.mu.Lock()
+	defer lc.mu.Unlock()
+	lc.counts[path]--
+	if lc.counts[path] <= 0 {
+		delete(lc.counts, path)
+	}
+}
+
+func (lc *LockCounter) Get(path string) int {
+	lc.mu.Lock()
+	defer lc.mu.Unlock()
+	return lc.counts[path]
+}

butonic · Nov 10 '25 11:11

We are deleting the lockfile after unlocking it, which prevents the file lock from ever blocking other writers: the next locker recreates the path and locks a freshly created file, so the underlying file ... is a different file :(

butonic · Nov 10 '25 11:11

posixfs with the hybrid metadata backend is also affected:

Image

butonic · Nov 10 '25 12:11