
Keep all the content related to the user when purging the DB

Open dtonon opened this issue 9 months ago • 13 comments

Advanced after implementing https://github.com/mikedilger/gossip/issues/986

When purging the DB, it would be nice to keep all of the user's content related to discussions they have participated in and notifications, as I do in https://github.com/dtonon/chronicle

dtonon avatar Apr 01 '25 19:04 dtonon

This is on unstable. See the new docs/PRUNING.md for pruning instructions.

mikedilger avatar Apr 02 '25 23:04 mikedilger

Great, I will try it today.

dtonon avatar Apr 03 '25 05:04 dtonon

See the new docs/PRUNING.md 3. Run gossip prune_old_events

Damn, why didn't I RTFM?! I used the button in the settings, isn't it the same thing?

When I reopened Gossip, it said that the password was wrong, and after a few attempts it presented me with the wizard.

dtonon avatar Apr 03 '25 08:04 dtonon

I tried again. This time it accepted the password but showed me the wizard again, and I'm stuck without any continue button at this step:

Image

Maybe I should try to recover from an old backup.

Edit: By pressing Enter I was able to continue. I suspect that the continue button was off-screen to the right, because I noticed the same thing in the "follow users" step.

dtonon avatar Apr 03 '25 08:04 dtonon

Now the account loads fine.

Some stats. Before pruning (using the settings button, with 90 days setting):

9580756992 Apr 3 09:59 data.mdb

After pruning:

Database has been pruned. 2782964 events removed.

12647088128 Apr 3 10:08 data.mdb

Something is wrong here, the space increased.

Running gossip prune_old_events afterwards removed only 602 events, so the first pruning did mostly work.

So I ran mdb_copy to compress the DB and recovered some space:

6686621696 Apr 3 10:42 data.mdb

These are the stats:

General: 49152 bytes
Events: 1214365696 bytes, 470190 events
Event Index (Author + Kind): 459423744 bytes
Event Index (Kind): 10928128 bytes
Event Index (Tags): 3081912320 bytes
Event Seen on Relay: 1266696192 bytes
Event Viewed: 851968 bytes
Hashtags: 2703360 bytes
Relays: 2539520 bytes
People: 33243136 bytes
Person-Relays: 123797504 bytes
Person-Lists: 114688 bytes
Event Relationships By Id: 149831680 bytes
Event Relationships By Addr: 31784960 bytes
Nip46 Servers: 32768 bytes
Followings: 9666560 bytes
FoF: 2179072 bytes
Handlers: 49152 bytes
Configured Handlers: 49152 bytes

The total (6,390,218,752 bytes) is similar to the ls output.

The raw events occupy ~20% of the space; all the rest is used by indexes and relations.

Are we sure that there is no bug in purging Event Index (Tags)? 2939 MB seems like too much. I would also check Event Seen on Relay (1208 MB) and Event Index (Author + Kind) (438 MB).
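A quick way to cross-check the total and see which tables dominate is to sum the byte counts from the stats listing. This is only a sketch: the numbers are copied by hand from the output in this comment, and the dictionary and variable names are made up for illustration, not anything gossip itself provides.

```python
# Table sizes in bytes, copied from the stats output in this comment.
sizes = {
    "General": 49152,
    "Events": 1214365696,
    "Event Index (Author + Kind)": 459423744,
    "Event Index (Kind)": 10928128,
    "Event Index (Tags)": 3081912320,
    "Event Seen on Relay": 1266696192,
    "Event Viewed": 851968,
    "Hashtags": 2703360,
    "Relays": 2539520,
    "People": 33243136,
    "Person-Relays": 123797504,
    "Person-Lists": 114688,
    "Event Relationships By Id": 149831680,
    "Event Relationships By Addr": 31784960,
    "Nip46 Servers": 32768,
    "Followings": 9666560,
    "FoF": 2179072,
    "Handlers": 49152,
    "Configured Handlers": 49152,
}

total = sum(sizes.values())

# Share of the whole database taken by each table, largest first.
shares = sorted(
    ((name, size / total) for name, size in sizes.items()),
    key=lambda item: item[1],
    reverse=True,
)

print(f"Total: {total:,} bytes")
for name, share in shares[:4]:
    print(f"{name}: {share:.1%}")
```

Sorting by share makes it easy to see that the tag index and the seen-on-relay table, not the events themselves, account for most of the file.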

dtonon avatar Apr 03 '25 08:04 dtonon

So I lowered the setting to 30 days and ran gossip prune_old_events, and the db grew again:

7846035456 Apr 3 10:55 data.mdb

So I tried to compress the db again and this helped:

5855723520 Apr 3 10:58 data.mdb

Stats:

General: 49152 bytes
Events: 470106112 bytes, 197508 events
Event Index (Author + Kind): 459423744 bytes
Event Index (Kind): 10928128 bytes
Event Index (Tags): 3081977856 bytes
Event Seen on Relay: 1266696192 bytes
Event Viewed: 360448 bytes
Hashtags: 688128 bytes
Relays: 2539520 bytes
People: 33243136 bytes
Person-Relays: 123813888 bytes
Person-Lists: 114688 bytes
Event Relationships By Id: 66797568 bytes
Event Relationships By Addr: 31784960 bytes
Nip46 Servers: 32768 bytes
Followings: 9650176 bytes
FoF: 2179072 bytes
Handlers: 49152 bytes
Configured Handlers: 49152 bytes

Event Index (Tags) (2939.2 MB) and Event Seen on Relay (1208.0 MB) are essentially the same size as before this prune, even though more than half of the events were removed. So there is definitely a bug.

Some other suggestions:

  • I would output the stats in MB, with separators, so they are easier to read
  • I would immediately compress the db after the prune
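As a rough illustration of the first suggestion, a byte count can be rendered with thousands separators and one decimal place. This is a sketch in Python rather than gossip's actual code; `format_size` is a hypothetical helper, and it assumes "MB" here means 2^20 bytes, which is what the figures quoted in this thread appear to use.

```python
def format_size(num_bytes: int) -> str:
    """Render a byte count in units of 2**20 bytes with thousands separators."""
    return f"{num_bytes / 2**20:,.1f} MiB"


# Two of the table sizes reported in the stats listing.
print("Event Index (Tags):", format_size(3081977856))
print("Event Seen on Relay:", format_size(1266696192))
```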

dtonon avatar Apr 03 '25 09:04 dtonon

Another note: all my DMs vanished. I suppose there is a bug here, too.

dtonon avatar Apr 03 '25 09:04 dtonon

I don't know what happened! How did the password not work? Why is it showing you the wizard? How did your DMs vanish? None of that makes any sense to me. Did you save the original?

The button in the settings does the same thing, but it competes with active gossip processes, so it is better to do it from the command line. The button should still work, just more slowly. I'll go ahead and remove them though.

Also you'll get another shrink after you rebuild relationships.

mikedilger avatar Apr 03 '25 21:04 mikedilger

LMDB space always increases, even when deleting things. It was a design tradeoff to be super fast. You have to mdb_copy -c to reclaim space.
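For anyone scripting this, the compacting copy could be wrapped roughly as below. It is a sketch under assumptions: `compact_command` and `compact_lmdb` are hypothetical helper names, the paths are placeholders (the real location of gossip's LMDB directory is not given here), and mdb_copy must be installed and on the PATH. mdb_copy expects the destination to be an empty directory.

```python
import subprocess
from pathlib import Path


def compact_command(src_dir, dst_dir):
    """Build the argv for a compacting copy: mdb_copy -c SRC DST."""
    return ["mdb_copy", "-c", str(src_dir), str(dst_dir)]


def compact_lmdb(src_dir, dst_dir):
    """Write a compacted copy of the LMDB environment in src_dir into dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    # mdb_copy wants an empty destination directory.
    if any(dst.iterdir()):
        raise ValueError(f"{dst} must be empty")
    subprocess.run(compact_command(src_dir, dst), check=True)
    return dst / "data.mdb"


# Example call with placeholder paths (not executed here):
# compact_lmdb("path/to/gossip/lmdb", "path/to/compacted")
```

The compacted data.mdb can then replace the original, ideally while gossip is not running and with the original kept as a backup.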

mikedilger avatar Apr 03 '25 21:04 mikedilger

Ok wow my prune code is bad. It has been bad for a long time. It really should not prune certain important kinds of events. I guess I always have newish ones so I never noticed. I'm making fixes.

mikedilger avatar Apr 03 '25 21:04 mikedilger

Ok I think pruning is fixed on unstable. If you still have your pre-pruned database maybe try again if you dare.

mikedilger avatar Apr 04 '25 00:04 mikedilger

I don't know what happened! How did the password not work? Why is it showing you the wizard? How did your DMs vanish? None of that makes any sense to me.

I really don't know! PS: Only DMs older than 3-4 weeks vanished, so I suppose they are not preserved in the prune.

Did you save the original?

Stupidly, I did not back up in advance, but I have a one-month-old backup. I will do a test with that one.

dtonon avatar Apr 04 '25 08:04 dtonon

I'm building a command to import the events from a different (backup) LMDB. It works but I'm just refining the commit now.

mikedilger avatar Apr 04 '25 20:04 mikedilger