Reclaiming more storage after delete+compact+vacuum

Open philrz opened this issue 2 years ago • 1 comments

The Zed lake maintains history about stored data on the assumption users may want to time travel to revisit the data as it looked in a prior state. The amount of storage consumed to maintain that history may become significant in some circumstances, such as when there have been lots of commits (small data loads, granular deletes, etc.). When users decide they only care about data in its most recent state, commands like zed manage and zed vacuum can reclaim lots of storage by compacting/deleting underlying data objects that will no longer be referenced. However, some commit/snapshot history is still currently left behind. There should be commands that can help users safely reclaim that storage as well.

Details

At the time this issue is being opened, Zed is at commit afba255.

A community zync user recently asked the following question:

HI, in some ZED server, we have some pool with high volume of data on disk but only a few row (under 5) in it. We pass the vacuum command and manage command to clean/compact it but still consume space on disk. All the data consumed on disk are in the commits folder, Inside the folder, it have a lots of .zng and .snap.zng files. How can I clean data on disk unused but the pool ?

Here's an in-house repro that shows effects similar to what's being described here. To simulate the extreme example of single-row commits, we've got a "record poster" tool that generates synthetic data by posting one JSON object at a time. To create an environment with lots of .zng and .snap.zng files inside the commits/ folder like the user is describing, after starting a zed serve, I run the tool like this to post 10k records:

$ ./record-poster.py --exitafter 10000 --count
Posting: {"ts": "2023-12-14T23:52:48.624167Z", "message": "oshac menu Drokpa precooker yamaskite"}
Posting: {"ts": "2023-12-14T23:52:48.634553Z", "message": "Estotiland nanocephalia hapteron forthcoming nonmammalian billsticking actuaryship"}
Posting: {"ts": "2023-12-14T23:52:48.645388Z", "message": "hexastylar"}
...

The --count option makes the tool run a count() query after each record of data is posted, since this triggers the creation of a snapshot (i.e., a .snap.zng file).

At the end of that, our lake directory is ~2.1 GB in size. Broken down by what's under the directories that make up the specific pool:

$ du -sh *
 39M	branches
2.0G	commits
 78M	data

$ for dir in *; do   echo -n "$dir:"; find $dir | wc -l; done
branches:   10005
commits:   20001
data:   20001

To reproduce what the user is describing, I delete all but the five most recently-posted records, then compact and vacuum in an attempt to drop stored remnants of what came before. At the end that leaves me with just the 5 records I care about.

$ zed -version
Version: v1.11.1-27-g59b72d88

$ zed delete -where "ts < $(zed query -z 'from TestPool | head 5 | tail 1 | yield ts')"
2ZYbL2GB9IqGUs3rOs6zQPU8weI delete committed

$ zed manage
{"level":"info","ts":1702601939.068759,"logger":"pool","msg":"updating pool","name":"TestPool","id":"2ZYTMzeKYq7B8v0O7qhgPHAqtl3","branch":"main","interval":60,"vectors":false}
{"level":"info","ts":1702601939.1077251,"logger":"pool","msg":"compaction completed","name":"TestPool","id":"2ZYTMzeKYq7B8v0O7qhgPHAqtl3","branch":"main","interval":60,"vectors":false,"runs_found":1,"objects_compacted":5,"vectors_created":0}

$ zed vacuum -f
vacuumed 10000 objects

$ zed query 'from TestPool | count()'
5(uint64)

We can now see that nearly all the storage has been reclaimed from data/, but plenty is still consumed by branches/ and commits/.

$ du -sh *
 39M	branches
2.0G	commits
8.0K	data

$ for dir in *; do   echo -n "$dir:"; find $dir | wc -l; done
branches:   10007
commits:   20005
data:       3

In terms of what the user sees if they go looking for what's consuming it:

$ ls -l branches/
-rw-r--r--  1 phil  staff  118 Dec 14 15:52 1.zng
-rw-r--r--  1 phil  staff  121 Dec 14 15:52 10.zng
-rw-r--r--  1 phil  staff  121 Dec 14 15:52 100.zng
-rw-r--r--  1 phil  staff  121 Dec 14 15:53 1000.zng
-rw-r--r--  1 phil  staff  121 Dec 14 16:56 10000.zng
-rw-r--r--  1 phil  staff  121 Dec 14 16:56 10001.zng
-rw-r--r--  1 phil  staff  121 Dec 14 16:58 10002.zng
...

$ ls -l commits/
total 4251368
-rw-r--r--  1 phil  staff     629 Dec 14 15:52 2ZYTMuHjTxuJdChO3vWhibrQloX.snap.zng
-rw-r--r--  1 phil  staff     385 Dec 14 15:52 2ZYTMuHjTxuJdChO3vWhibrQloX.zng
-rw-r--r--  1 phil  staff     705 Dec 14 15:52 2ZYTMudagu62umCzVHXSYaOI7rc.snap.zng
-rw-r--r--  1 phil  staff     385 Dec 14 15:52 2ZYTMudagu62umCzVHXSYaOI7rc.zng
...
-rw-r--r--  1 phil  staff  426654 Dec 14 16:56 2ZYb3esOldUmwtlDQsRONjKbXMc.snap.zng
-rw-r--r--  1 phil  staff     384 Dec 14 16:56 2ZYb3esOldUmwtlDQsRONjKbXMc.zng
-rw-r--r--  1 phil  staff  426861 Dec 14 16:56 2ZYb3iArXDt4e8ct8hhwDOEqPLc.snap.zng
-rw-r--r--  1 phil  staff     386 Dec 14 16:56 2ZYb3iArXDt4e8ct8hhwDOEqPLc.zng
-rw-r--r--  1 phil  staff  426632 Dec 14 16:56 2ZYb3il6fxj78yXAYB0SJxHK3qg.snap.zng
-rw-r--r--  1 phil  staff     384 Dec 14 16:56 2ZYb3il6fxj78yXAYB0SJxHK3qg.zng
-rw-r--r--  1 phil  staff     355 Dec 14 16:58 2ZYbL2GB9IqGUs3rOs6zQPU8weI.snap.zng
-rw-r--r--  1 phil  staff  533210 Dec 14 16:58 2ZYbL2GB9IqGUs3rOs6zQPU8weI.zng
-rw-r--r--  1 phil  staff     181 Dec 14 16:59 2ZYbQ0Ble92zlYsH5UyxWI05fc9.snap.zng
-rw-r--r--  1 phil  staff     703 Dec 14 16:58 2ZYbQ0Ble92zlYsH5UyxWI05fc9.zng

This looks roughly similar to what was shown to us by the community zync user.

We can use the following Zed program to summarize the count and size of the three different kinds of files that are present across the branches/ and commits/ directories.

$ cat report.zed 
op ls_bytes(): (
  grok("%{NUMBER:bytes} %{MONTH}", this)
  | bytes:=uint64(bytes)
)

fork (
  => / [0-9]+\.zng$/ | ls_bytes() | put filetype:="branch"
  => / ...........................\.zng$/ | ls_bytes() | put filetype:="commit"
  => / ...........................\.snap\.zng$/ | ls_bytes() | put filetype:="snapshot"
)
| count(),sum_bytes:=sum(bytes),avg_bytes:=avg(bytes) by filetype
| put sum_MB:=sum_bytes/1024.0/1024.0,avg_MB:=avg_bytes/1024.0/1024.0

$ ls -lR branches/ commits/ | zq -i line -Z -I report.zed -
{
    filetype: "commit",
    count: 10992 (uint64),
    sum_bytes: 4502766 (uint64),
    avg_bytes: 409.64028384279476,
    sum_MB: 4.294172286987305,
    avg_MB: 0.0003906634176662395
}
{
    filetype: "snapshot",
    count: 10002 (uint64),
    sum_bytes: 2114648513 (uint64),
    avg_bytes: 211422.56678664268,
    sum_MB: 2016.685975074768,
    avg_MB: 0.2016282718531062
}
{
    filetype: "branch",
    count: 10003 (uint64),
    sum_bytes: 1210360 (uint64),
    avg_bytes: 120.999700089973,
    sum_MB: 1.1542892456054688,
    avg_MB: 0.00011539430626866627
}

In conclusion:

The contents of the branches/ directory and the commit files in commits/ make up ~20k files, but since they only average a few hundred bytes apiece, they take up very little space.
The snapshot files are indeed large, taking up a total of ~2 GB.
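
For anyone without zq at hand, roughly the same breakdown can be approximated with standard shell tools. This is only a minimal sketch and not part of Zed: the function name summarize_kind is hypothetical, and it assumes the file naming seen in the listings above (*.snap.zng for snapshots, the remaining *.zng files in commits/ for commits) and that it is run from inside the pool's directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zed): print "<label> <file count> <total bytes>"
# for the regular files under directory $2 that match the find(1) expression
# given in the remaining arguments.
summarize_kind() {
  label=$1; dir=$2; shift 2
  # Count matching files (assumes no newlines in file names).
  count=$(find "$dir" -type f "$@" | wc -l | tr -d ' ')
  # Sum their sizes by concatenating contents; yields 0 when nothing matches.
  bytes=$(find "$dir" -type f "$@" -exec cat {} + | wc -c | tr -d ' ')
  echo "$label $count $bytes"
}

# Assumed layout: the current directory is the pool's directory in the lake.
if [ -d branches ] && [ -d commits ]; then
  summarize_kind branch   branches -name '*.zng'
  summarize_kind commit   commits  -name '*.zng' ! -name '*.snap.zng'
  summarize_kind snapshot commits  -name '*.snap.zng'
fi
```

Reading every file to total its size is slow on a large lake, but it avoids relying on non-portable options such as GNU find's -printf.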

The report looks similar for the community zync user that sent in similar ls -lR output from their environment.

$ cat out.txt | zq -i line -Z -I report.zed -
{
    filetype: "branch",
    count: 12522 (uint64),
    sum_bytes: 1515159 (uint64),
    avg_bytes: 120.99976042165788,
    sum_MB: 1.4449682235717773,
    avg_MB: 0.0001153943638054446
}
{
    filetype: "commit",
    count: 13421 (uint64),
    sum_bytes: 5128914 (uint64),
    avg_bytes: 382.1558751210789,
    sum_MB: 4.891313552856445,
    avg_MB: 0.00036445224296672717
}
{
    filetype: "snapshot",
    count: 11439 (uint64),
    sum_bytes: 2196035616 (uint64),
    avg_bytes: 191977.9365329137,
    sum_MB: 2094.302764892578,
    avg_MB: 0.18308442738810893
}

The Zed Lake Format doc goes into more detail on the role of these different files. The bottom line is that if the user in this example truly only cares about the 5 most recent records and is ok losing the other history, there's not currently a zed command that does that automatically. And when that's implemented, it's TBD whether that would be done as a side effect of the vacuum command or a separate command.

In the meantime, after getting a refresher from the Dev team about the role of these different files, I was reminded of one short-term measure that users could safely take on their own to recoup some of this storage. Specifically, the snapshot files commits/*.snap.zng can be deleted by hand. This is because each snapshot just represents a point-in-time summary of history that would otherwise have to be constructed by walking the many non-snapshot files in that commits/ directory. Therefore if there's no need to revisit that history at peak performance, these files are no longer needed.
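
The manual workaround could be scripted along these lines. This is a minimal sketch rather than a supported Zed tool: the function name remove_snapshots, the DRY_RUN variable, and the example pool path are all hypothetical, and it assumes nothing is writing to the pool while files are being removed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zed): delete only the snapshot files
# (commits/*.snap.zng) in one pool directory, leaving commit files in place.
# With DRY_RUN=1 (the default) it only lists what would be removed.
remove_snapshots() {
  pool_dir=$1
  if [ ! -d "$pool_dir/commits" ]; then
    echo "no commits/ directory under $pool_dir" >&2
    return 1
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: print the snapshot files that would be deleted.
    find "$pool_dir/commits" -type f -name '*.snap.zng' -print
  else
    # Real run: remove the snapshot files; plain *.zng commit files do not match.
    find "$pool_dir/commits" -type f -name '*.snap.zng' -exec rm -f {} +
  fi
}

# Example (the path is an assumption; substitute the pool's directory in your lake):
#   remove_snapshots /path/to/lake/<pool-id>              # list only
#   DRY_RUN=0 remove_snapshots /path/to/lake/<pool-id>    # actually delete
```

Defaulting to a dry run is a deliberate choice here, since the deletion cannot be undone and the listing gives a chance to confirm that only .snap.zng files are matched.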

Dec 13 '23 19:12 philrz

We had a group discussion about this one. More specifics TBD, but some notes of thoughts thus far:

@mattnibs thought this kind of file cleanup should probably be handled by zed manage
There was consensus that the snapshot files could be moved to their own directory
1. Since finding snapshot files currently requires directory scanning, this should simplify that
2. It would also make it easier to communicate the short-term workaround to users (i.e., "it's ok to delete everything under snapshots/")
Think more about what's in branches/

Dec 15 '23 01:12 philrz