
Sentry-data volume cleanup

Open kmentch opened this issue 1 year ago • 15 comments

Self-Hosted Version

22.11.0

CPU Architecture

x86_64

Docker Version

20.10.8

Docker Compose Version

1.29.2

Steps to Reproduce

  1. Check docker volume disk usage with du (695G ./sentry-data)
  2. Run cleanup with fewer days than the daily cron job uses (SENTRY_EVENT_RETENTION_DAYS=15), for example 12 or 10, to clean up more data.
  3. Expect sentry-data disk usage to go down.

Expected Result

Sentry-data disk usage is lower than previous

Actual Result

Sentry-data disk usage is exactly the same as before running cleanup.

I would like to know what's actually stored in sentry-data/_data and whether there's a way to clean it up. The same goes for the kafka volume.

Event ID

No response

kmentch avatar Jul 31 '23 19:07 kmentch

Is it possible that you have a lot of attachments? The sentry cleanup command that this is executing only deletes rows in Postgres, it doesn't touch attached files.

One thing you could try (after backing up ./sentry-data, of course!) is manually deleting everything in ./sentry-data that is older than 15 days. If that solves your problem (e.g., you see that folder drop by hundreds of GB), could you please report back? We can look into modifying the cleanup script to purge these old attachment files as well.
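Before deleting anything, a dry run along these lines can show how much space the old files occupy. This is only a sketch: it assumes GNU find (for -printf), and report_old_files is a made-up helper name, not part of Sentry or self-hosted.

```shell
# Dry run: count files older than N days and sum their sizes. Nothing is deleted.
# report_old_files is a hypothetical helper; it needs GNU find for -printf.
report_old_files() {
  local dir="$1" days="$2"
  find "$dir" -type f -mtime +"$days" -printf '%s\n' |
    awk '{ bytes += $1; count++ } END { printf "%d files, %d bytes\n", count, bytes }'
}
```

Calling `report_old_files ./sentry-data 15` would report on the path and retention mentioned above; swapping -printf for -delete is the destructive step, so it is worth checking the numbers first.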

azaslavsky avatar Aug 02 '23 18:08 azaslavsky

So far, deleting every file created more than 100 days ago has freed almost 500GB. I am reviewing with my team what these files could be so we can better understand what is being saved here. It would be nice if this were included in the cleanup process as well.

kmentch avatar Aug 07 '23 16:08 kmentch

Cleanup for us is more about data retention for compliance in prod. If you'd like a job to handle the cleanup of disk space as well, you can create a job similar to vroom-cleanup that is defined in docker-compose.yml. It should look something like this:

  disk-cleanup:
    <<: *restart_policy
    image: sentry-cleanup-self-hosted-local
    build:
      context: ./cron
      args:
        BASE_IMAGE: sentry-self-hosted-local
    entrypoint: "/entrypoint.sh"
    command: '"0 0 * * * find /_data -type f -mtime +$SENTRY_EVENT_RETENTION_DAYS -delete"'
    volumes:
      - sentry-data:/_data

hubertdeng123 avatar Aug 07 '23 18:08 hubertdeng123

What about changing -mtime to -atime? Maybe there are files which have been accessed frequently in this period, but they have not changed since +$SENTRY_EVENT_RETENTION_DAYS.
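To see whether the two tests would actually select different files, the counts can be compared before deleting anything. A rough sketch, with count_by_timestamp as a hypothetical helper name. One general caveat: on filesystems mounted with relatime or noatime, the access time is not updated on every read, so -atime may not reflect real usage.

```shell
# Compare how many files -mtime and -atime would each select for the same age.
# count_by_timestamp is a hypothetical helper; it only counts, it deletes nothing.
count_by_timestamp() {
  local dir="$1" days="$2" m a
  m=$(( $(find "$dir" -type f -mtime +"$days" | wc -l) ))
  a=$(( $(find "$dir" -type f -atime +"$days" | wc -l) ))
  echo "mtime=$m atime=$a"
}
```

If the atime count is much lower than the mtime count, some old files are still being read and -atime is the more conservative choice.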

aminvakil avatar Aug 07 '23 18:08 aminvakil

Could these files be linked to source maps? We upload them during every release build for both iOS and Android. Where are source maps stored on the server?

techgerm avatar Aug 11 '23 18:08 techgerm

Definitely could use -atime instead for the cleanup job.

Could these files be linked to source maps?

Maybe! Not sure exactly where source maps are stored on the server, but that may be in sentry-data along with attachments

hubertdeng123 avatar Aug 14 '23 19:08 hubertdeng123

I executed find /var/lib/docker/volumes/sentry-data/_data -type f -atime +75 -delete (with atime). The command took a week to finish, and nothing was removed.

I'm trying with mtime now, to see if that removes anything.

aminvakil avatar Aug 16 '23 06:08 aminvakil

On our instance, running find /var/lib/docker/volumes/sentry-data/_data -type f -atime +90 -delete saved ~8gb, about 12% of total disk space, 32% of the sentry-data volume. 90 matches our configured retention from .env.

This is after about 2 years of self-hosting sentry. The remaining 16gb was accessed/modified within the specified retention. If none of the files were properly deleted, the volume would be much bigger by now, so my guess is that at some point it did leave a few files behind, or that only some specific kinds of files are not cleaned up correctly.
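One way to narrow down which kinds of files are left behind is to group the old files by the top-level directory they live in. A sketch under assumptions: it needs GNU find, old_bytes_by_subdir is a hypothetical helper name, and file names containing tabs or newlines would confuse it.

```shell
# Sum the size of files older than N days, grouped by top-level subdirectory.
# old_bytes_by_subdir is a hypothetical helper; it needs GNU find for -printf.
old_bytes_by_subdir() {
  local dir="$1" days="$2"
  # %s is the size in bytes, %P the path relative to the starting directory.
  find "$dir" -mindepth 2 -type f -mtime +"$days" -printf '%s\t%P\n' |
    awk -F'\t' '{ split($2, parts, "/"); bytes[parts[1]] += $1 }
                END { for (d in bytes) printf "%s %d\n", d, bytes[d] }' |
    sort
}
```

Running it as `old_bytes_by_subdir /var/lib/docker/volumes/sentry-data/_data 90` prints one line per subdirectory with the bytes held by files past retention.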

wodCZ avatar Aug 21 '23 11:08 wodCZ

Thanks for reporting back folks

hubertdeng123 avatar Aug 23 '23 16:08 hubertdeng123

I forgot to report back, mtime didn't do anything either. We decreased retention days at last.

aminvakil avatar Sep 16 '23 21:09 aminvakil

Is it possible that you have a lot of attachments? The sentry cleanup command that this is executing only deletes rows in Postgres, it doesn't touch attached files.

I'm not a Sentry administrator or developer, so my knowledge of Sentry is very limited. However, I found that the cleanup command has also been deleting files since 2016 (getsentry/sentry#2504).

flora-five avatar Jan 27 '24 12:01 flora-five

While working to investigate a performance issue on a 23.9.1 self-hosted Sentry server, using the filesystem as back-end for the File Store, I have found that there were a lot of empty directories in the sentry-data volume, under /files. On this server, there were no files older than SENTRY_EVENT_RETENTION_DAYS, just empty directories. The cleanup command was deleting older files, but not directories.

These were the filesystem stats when I started to investigate:

Filesystem      Inodes   IUsed   IFree IUse% Mounted on
/dev/nvme0n1   6553600 6121416  432184   94% /var/lib/docker

Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/nvme0n1      100478 65451     30383  69% /var/lib/docker

After running find /var/lib/docker/volumes/sentry-data/_data/files -type d -empty | xargs -r rmdir, the filesystem stats were:

Filesystem      Inodes   IUsed   IFree IUse% Mounted on
/dev/nvme0n1   6553600 2684783 3868817   41% /var/lib/docker

Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/nvme0n1      100478 51293     44541  54% /var/lib/docker

2.5M empty directories have been removed and 14GB of storage space has been freed.
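A variant of the same cleanup, sketched with find's own -delete instead of a pipe to rmdir. Because -delete processes a directory's contents before the directory itself, parents that become empty are removed in the same pass, and directory names with spaces are not an issue. prune_empty_dirs is a hypothetical helper name, and the starting directory itself is kept.

```shell
# Remove empty directories below a root in one pass and print how many were removed.
# prune_empty_dirs is a hypothetical helper; the root directory itself is never removed.
prune_empty_dirs() {
  local dir="$1"
  # -delete implies depth-first traversal, so nested empty trees collapse bottom-up.
  echo $(( $(find "$dir" -mindepth 1 -type d -empty -delete -print | wc -l) ))
}
```

For example, `prune_empty_dirs /var/lib/docker/volumes/sentry-data/_data/files` targets the same path as the command above; comparing `df -i` before and after shows the inodes recovered.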

Even if the filesystem backend is not suitable for production use, does it make sense to consider removing the empty directories, either with a cleanup Docker task in the self-hosted setup or directly in the filesystem back-end implementation of Sentry?

If this info might help other Sentry users, the filesystem mounted on /var/lib/docker was a regular ext4 filesystem. After a server reboot, initially it took about 40-50 minutes until all containers were running and Sentry was ready. After removing the empty directories, the server still needed about 20-30 minutes to be ready. I have replaced the filesystem with an xfs filesystem and the server now needs only 3-5 minutes to be ready. The server is running an EL 9 distribution, with Docker 25.0.1.

flora-five avatar Jan 27 '24 12:01 flora-five

@flora-five Wow, great investigation. It does make sense to do this directly in the filesystem back-end implementation, because it seems like that should be handled automatically. To get this working right away, you could add

find /var/lib/docker/volumes/sentry-data/_data/files -type d -empty | xargs -r rmdir

to the docker compose file: https://github.com/getsentry/self-hosted/blob/fbf17503947371fcd153de15ca5a8dc7a9ad33f7/docker-compose.yml#L403
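If this runs inside a container rather than on the host, the path would be the mount point inside the container, not /var/lib/docker/volumes/...; the disk-cleanup example earlier in this thread mounts the volume at /_data. A sketch of a combined step under that assumption, with cleanup_sentry_data as a hypothetical helper name and both arguments supplied by the caller:

```shell
# Delete files older than N days, then remove any directories left empty.
# cleanup_sentry_data is a hypothetical helper; the directory and day count
# are arguments, so nothing about the real volume layout is hard-coded here.
cleanup_sentry_data() {
  local files_dir="$1" days="$2"
  if [ ! -d "$files_dir" ] || [ -z "$days" ]; then
    echo "usage: cleanup_sentry_data DIR DAYS" >&2
    return 1
  fi
  find "$files_dir" -type f -mtime +"$days" -delete
  find "$files_dir" -mindepth 1 -type d -empty -delete
}
```

Inside such a container this might be called as `cleanup_sentry_data /_data/files "$SENTRY_EVENT_RETENTION_DAYS"`. That path is an assumption based on the /files directory and the /_data mount mentioned in this thread, so it is worth verifying on your own instance first.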

hubertdeng123 avatar Jan 30 '24 22:01 hubertdeng123

@flora-five Thank you very much for this!

Would you like to open a PR adding this, as @hubertdeng123 said? We should think more, though, about the performance impact of running it every night on large instances.

aminvakil avatar Jan 31 '24 15:01 aminvakil

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you remove the label Waiting for: Community, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

getsantry[bot] avatar Feb 23 '24 08:02 getsantry[bot]