Add more details on full backup

Open Mika56 opened this issue 11 months ago • 3 comments

Problem Statement

Hi,

We're interested in full backups of Sentry. While the docs list the volumes that should be backed up, I feel they lack some details. A high-level workflow would be appreciated.

Solution Brainstorm

Can the volumes be backed up while Sentry is running? (That is probably not the case for sentry-postgres, and maybe sentry-redis?) Which containers need to be down while each volume is being backed up?

Example (but probably incorrect) workflow that I am expecting:

The following volumes hold critical long-term data and are prefixed with sentry-:

  • sentry-data
  • sentry-postgres
  • sentry-redis
  • sentry-zookeeper
  • sentry-kafka
  • sentry-clickhouse
  • sentry-symbolicator

Of these, the following volumes can be backed up without downtime:

  • sentry-data
  • sentry-zookeeper
  • sentry-clickhouse
  • sentry-symbolicator

For these, you can mount the volume in another container and read the data from it. For example, to back up the sentry-data volume:

$ docker run --rm -v sentry-data:/data -v $(pwd):/dest alpine:latest tar -cpzf /dest/sentry-data.tar.gz /data
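The same command could be repeated for each of the four volumes listed above. A minimal sketch, assuming the volume names are exactly as listed: `backup_volume` and `backup_online_volumes` are hypothetical helper names, not part of Sentry's tooling, and the `:ro` (read-only) mount is an addition that was not in the original command.

```shell
# Hypothetical helper: archive one named Docker volume into a directory,
# reusing the tar invocation shown above.
backup_volume() {
    vol="$1"
    out_dir="${2:-$(pwd)}"
    # ":ro" mounts the volume read-only (an assumption added for safety).
    docker run --rm \
        -v "${vol}:/data:ro" \
        -v "${out_dir}:/dest" \
        alpine:latest tar -cpzf "/dest/${vol}.tar.gz" /data
}

# Hypothetical helper: back up the volumes described as safe to copy
# while Sentry is running; stops at the first failure.
backup_online_volumes() {
    dest="${1:-$(pwd)}"
    for name in sentry-data sentry-zookeeper sentry-clickhouse sentry-symbolicator; do
        backup_volume "$name" "$dest" || return 1
    done
}
```

Calling `backup_online_volumes /path/to/backups` would then write one `.tar.gz` per volume into that directory. Whether these volumes are really consistent when copied live is exactly the open question of this issue.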

PostgreSQL data can be backed up while Sentry is running using pg_dump:

$ docker compose exec -T postgres pg_dump -U postgres --clean --if-exists postgres | bzip2 - > sentry-postgres.sql.bz2

Redis and Kafka need to be down before they can be backed up. Doing so will make Sentry inaccessible:

$ docker compose stop redis kafka
$ docker run --rm -v sentry-kafka:/data -v $(pwd):/dest alpine:latest tar -cpzf /dest/sentry-kafka.tar.gz /data
$ docker run --rm -v sentry-redis:/data -v $(pwd):/dest alpine:latest tar -cpzf /dest/sentry-redis.tar.gz /data
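Since this step takes Sentry offline, it may help to wrap it so the services are always started again afterwards. A rough sketch, assuming the Compose service names are `redis` and `kafka` and that it runs from the directory holding Sentry's Compose file: `backup_offline_volumes` is a hypothetical name, and the `:ro` mount is an addition.

```shell
# Hypothetical helper: stop Redis and Kafka, archive their volumes,
# then start both services again even if an archive step failed.
backup_offline_volumes() {
    dest="${1:-$(pwd)}"
    status=0

    # Assumed service names; abort before touching anything if stop fails.
    docker compose stop redis kafka || return 1

    for vol in sentry-redis sentry-kafka; do
        docker run --rm \
            -v "${vol}:/data:ro" \
            -v "${dest}:/dest" \
            alpine:latest tar -cpzf "/dest/${vol}.tar.gz" /data || status=1
    done

    # Always attempt the restart so the outage stays as short as possible.
    docker compose start redis kafka || status=1
    return "$status"
}
```

A non-zero return value means at least one archive or the restart failed, so the resulting files should not be trusted without checking.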

Mika56, Mar 12 '24 13:03

We don't have many details on full backups of Sentry because that is not something we've been able to fully support in the past. It's a risky procedure, and it involves backing up Docker volumes instead of using a built-in tool within Sentry to export all the metadata needed. Happy to add some details on full backups to our docs if you're able to provide some, but the docs are sparse because we ourselves have not gone through the process.

That said, happy to leave this issue open to prioritize at a later time.

hubertdeng123, Mar 14 '24 22:03

Are you not doing full backups of the SaaS env? In that case, I'd understand if you don't want to support full backups.

I guess the documentation could be updated with a disclaimer in an alert box, something like:

While it may seem natural to want to back up all data, we believe that, in a disaster recovery scenario, partial backups are enough. While this document gives some insight into how to do full backups, we highly recommend that you implement partial backups, as this is the supported way. We will not help users recover from a disaster with full backups.

Mika56, Mar 15 '24 14:03

"Are you not doing full backups of the SaaS env?"

Data that is backed up there is handled quite a bit differently. Thanks for the suggestion; we can add a disclaimer like this.

hubertdeng123, Mar 18 '24 22:03