self-hosted
Add more details on full backup
Problem Statement
Hi,
We're interested in full backups of Sentry. While the docs list the volumes that should be backed up, I feel they lack some details. A high-level workflow would be appreciated.
Solution Brainstorm
Can the volumes be backed up while Sentry is running? (That is probably not the case for sentry-postgres, and maybe sentry-redis?)
Which containers need to be down while each volume is being backed up?
Example (but probably incorrect) workflow that I am expecting:
The following volumes hold critical long-term data and are prefixed with sentry-:
sentry-data
sentry-postgres
sentry-redis
sentry-zookeeper
sentry-kafka
sentry-clickhouse
sentry-symbolicator
Of these, the following volumes can be backed up without downtime:
sentry-data
sentry-zookeeper
sentry-clickhouse
sentry-symbolicator
For these, you can mount the volume in another container and read the data from it. For example, to back up the sentry-data volume:
$ docker run --rm -v sentry-data:/data -v $(pwd):/dest alpine:latest tar -cpzf /dest/sentry-data.tar.gz /data
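A minimal sketch of how the same command could be repeated for every volume in the "without downtime" list. The `run` helper, the `DRY_RUN` and `DEST` variables, and the function names are hypothetical, and the `:ro` (read-only) mount is an addition that is not in the command above. With `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: archive each volume the proposed workflow lists as safe to
# copy while Sentry is running. Whether that list is correct is unconfirmed.
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print commands, 0 = run them
DEST="${DEST:-$(pwd)}"    # hypothetical: host directory that receives the archives

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Archive one named volume into $DEST/<volume>.tar.gz.
backup_volume() {
  run docker run --rm -v "$1:/data:ro" -v "$DEST:/dest" alpine:latest \
    tar -cpzf "/dest/$1.tar.gz" /data
}

# Loop over the volumes from the list above; stop at the first failure.
backup_hot_volumes() {
  for volume in sentry-data sentry-zookeeper sentry-clickhouse sentry-symbolicator; do
    backup_volume "$volume" || return 1
  done
}

backup_hot_volumes
```

Running it with `DRY_RUN=0 DEST=/some/backup/dir` would execute the four `docker run` commands instead of printing them.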
PostgreSQL data can be backed up while Sentry is running using pg_dump:
$ docker compose exec postgres pg_dump -U postgres --clean --if-exists postgres | bzip2 - > sentry-postgres.sql.bz2
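One caveat with a `pg_dump | bzip2 > file` pipeline is that, without `pipefail`, the shell reports the exit status of `bzip2`, so a failed dump can still leave a valid-looking archive behind. A minimal sketch of one way around that, writing the dump to a temporary file and keeping it only if `pg_dump` succeeded. The `run_dump` and `dump_postgres` names are hypothetical; `-T` is the `docker compose exec` flag that disables TTY allocation, added here because the output is redirected.

```shell
#!/bin/sh
# Sketch only: keep the dump file only when pg_dump itself exits successfully.

# Hypothetical wrapper around the dump command from the workflow above.
run_dump() {
  docker compose exec -T postgres pg_dump -U postgres --clean --if-exists postgres
}

# Write the dump to <target>.partial first, then move it into place on success.
dump_postgres() {
  target="$1"
  partial="$target.partial"
  if run_dump > "$partial"; then
    mv "$partial" "$target"
  else
    rm -f "$partial"
    echo "pg_dump failed; nothing written to $target" >&2
    return 1
  fi
}
```

Usage would look like `dump_postgres sentry-postgres.sql && bzip2 sentry-postgres.sql`, which compresses the file only after a successful dump.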
Redis and Kafka need to be down before they can be backed up. Doing so will make Sentry inaccessible:
$ docker compose down kafka clickhouse symbolicator
$ docker run --rm -v sentry-kafka:/data -v $(pwd):/dest alpine:latest tar -cpzf /dest/sentry-kafka.tar.gz /data
$ docker run --rm -v sentry-redis:/data -v $(pwd):/dest alpine:latest tar -cpzf /dest/sentry-redis.tar.gz /data
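A minimal sketch of the stop, archive, restart sequence as one script. It assumes the services to stop are the two named in the sentence (`kafka` and `redis`, taken here to be the Compose service names), which differs from the services in the command above; which set is actually required is not confirmed in this thread. The `run` helper and the `DRY_RUN` and `DEST` variables are hypothetical, and the `:ro` mount is an addition. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: stop the services, archive their volumes, then start them again.
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print commands, 0 = run them
DEST="${DEST:-$(pwd)}"    # hypothetical: host directory that receives the archives

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

backup_cold_volumes() {
  # Assumed service names; give up early if they cannot be stopped.
  run docker compose stop kafka redis || return 1
  status=0
  for volume in sentry-kafka sentry-redis; do
    run docker run --rm -v "$volume:/data:ro" -v "$DEST:/dest" alpine:latest \
      tar -cpzf "/dest/$volume.tar.gz" /data || status=1
  done
  # Restart even if an archive step failed, to keep the downtime short.
  run docker compose start kafka redis || status=1
  return "$status"
}

backup_cold_volumes
```

Restarting unconditionally is a design choice: a failed archive is reported through the exit status, but Sentry is not left down because of it.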
The reason we don't have many details about a full backup of Sentry is that it's not something we've been able to fully support in the past. It's a risky procedure, and it involves backing up Docker volumes instead of using a built-in tool within Sentry to export all the metadata needed. Happy to add some details on full backups to our docs if you're able to provide some, but the section is sparse because we ourselves have not gone through the process.
That said, happy to leave this issue open to prioritize at a later time.
Are you not doing full backups of the SaaS env? If so, I'd understand if you don't want to support full backups.
I guess the documentation could be updated with a disclaimer in an alert box, something like:
While it may seem natural to want to back up all data, we believe that, in a disaster recovery scenario, partial backups are enough. While there is some insight on how to do full backups in this document, we highly recommend that you implement partial backups, as this is the supported way. We will not help users recover from a disaster with full backups.
Are you not doing full backups of the SaaS env
Data that is backed up there is handled quite a bit differently. Thanks for the suggestion, we can add a disclaimer like this.