feat: Use S3 node store with Garage
> [!NOTE]
> This patch may or may not make it to the main branch, so please do not rely on it yet. You are, however, free to use it as a blueprint for your own custom S3 or S3-like variations.
Enables an S3 node store using Garage and sentry-nodestore-s3 by @stayallive.
This should alleviate all the issues stemming from (ab)using PostgreSQL as the node store.
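To make the intent concrete, here is a rough sketch of what the `sentry.conf.py` wiring could look like. This is illustrative only, not the actual patch: the class path, option names, endpoint, and credentials are assumptions to check against the sentry-nodestore-s3 README and your own deployment.

```python
# sentry.conf.py -- hypothetical sketch, not the exact configuration in this PR.
# The class path and option names are assumptions; check the
# sentry-nodestore-s3 README for the authoritative settings.
SENTRY_NODESTORE = "sentry_nodestore_s3.S3PassthroughDjangoNodeStorage"
SENTRY_NODESTORE_OPTIONS = {
    # S3-compatible endpoint exposed by the Garage container (assumed name/port)
    "endpoint_url": "http://garage:3900",
    "bucket_name": "nodestore",
    "region_name": "garage",
    "aws_access_key_id": "<access-key>",
    "aws_secret_access_key": "<secret-key>",
}
```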
- [ ] We should implement the 90-day retention through S3 lifecycle options (see the sketch after this list): https://garagehq.deuxfleurs.fr/
- [ ] We should find a good default for the node store size and make it configurable (currently hard-coded at 100G)
- [x] We should have a proper migration path for existing installs
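As a rough illustration of the lifecycle option, the boto3 call below would attach a 90-day expiration rule to the bucket. The endpoint, credentials, and bucket name are placeholders, and whether Garage honors this particular lifecycle API is precisely what the first task above needs to verify.

```python
# Hypothetical sketch: apply a 90-day expiration rule to the nodestore bucket.
# Endpoint, credentials, and bucket name are placeholders; Garage's support
# for this lifecycle API is an assumption to verify.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://garage:3900",
    aws_access_key_id="<access-key>",
    aws_secret_access_key="<secret-key>",
)
s3.put_bucket_lifecycle_configuration(
    Bucket="nodestore",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "nodestore-90-day-retention",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to every object in the bucket
                "Expiration": {"Days": 90},
            }
        ]
    },
)
```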
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 99.49%. Comparing base (440b658) to head (95a4de2).
:warning: Report is 1 commit behind head on master.
:white_check_mark: All tests successful. No failed tests found.
```
@@           Coverage Diff           @@
##           master    #3498   +/-   ##
=======================================
  Coverage   99.49%   99.49%
=======================================
  Files           3        3
  Lines         197      197
=======================================
  Hits          196      196
  Misses          1        1
```
Any reason why you didn't use SeaweedFS per what you said yesterday?
@aldy505
> Any reason why you didn't use SeaweedFS per what you said yesterday?
Well I started with that and realized 3 things:
- It really is not geared towards single-node setups, and it has nodes with different roles. This makes it more challenging to scale up or to set up in our environment.
- It has a paid admin interface. Not a deal breaker, but it is clear that it is geared towards more "professional" setups.
- Its S3 API support is not really great.
Garage fits the bill much better: it is explicitly designed for smaller setups like this, is easy to expand without specialized node roles, has no paid components, and offers much more decent and familiar S3 API support.
> It really is not geared towards single-node setups, and it has nodes with different roles. This makes it more challenging to scale up or to set up in our environment.
When I tried SeaweedFS last time (and I still use it for sourcemap/profile storage, tbh), it had single-node ability via the weed server command.
Like:

```shell
# single-node, all-in-one mode: master, volume server, filer and S3 gateway
weed server -filer=true -s3=true -master=true -volume=true
```

Some of them are enabled by default.
I think Garage/MinIO are simpler for small setups; SeaweedFS looks necessary for mid-to-high-end setups, because all the other services I know keep files as-is.
And thousands upon thousands of small files like profiles are not ideal to store on most popular filesystems, I guess.
> I think Garage/MinIO are simpler for small setups; SeaweedFS looks necessary for mid-to-high-end setups, because all the other services I know keep files as-is.
@doc-sheet Hey, I'm going to work on this PR. I think SeaweedFS is better for self-hosted Sentry. One thing I don't like about Garage is that we need to specify the storage allocation beforehand: if we set it to 100GB, there might be some people that have more data than 100GB, and I don't want that to cause any issues.
That said, since you said you've used SeaweedFS before: How was your experience? How does it compare to MinIO or Ceph?
> And thousands upon thousands of small files like profiles are not ideal to store on most popular filesystems, I guess.
Yeah, if we set up object storage, we might as well move filestore & profiles there too. But let's focus on nodestore first.
> How was your experience? How does it compare to MinIO or Ceph?
It is a bit strange sometimes. But it is fine.
It has multiple options for the filer store. I didn't try LevelDB storage, aiming for fault tolerance.
At first I tried Redis; it worked for several months and then... I just lost all data. It was there physically but wasn't available from the API (S3 or web) - each list call returned different results.
I don't know if the issue was in Redis or weed itself. I suspect a bug with TTL could be the reason too.
But after that incident I wiped the cluster and started a new one with Scylla as the filer backend, and it has worked fine for almost a year already, despite that TTL bug.
SeaweedFS has multiple build variants, like:
- 3.89
- 3.89_full
- 3.89_large_disk
- 3.89_large_disk_full
I suggest always using large_disk. The documentation is not clear, but it is easy to reach that limit: https://github.com/seaweedfs/seaweedfs/wiki/FAQ#how-to-configure-volumes-larger-than-30gb
I don't know the difference between full and normal, and just use _large_disk_full builds :)
Also, I don't use S3 auth - I was too lazy to set it up.
Other than all that, I have had no problems and have barely touched it after the initial setup. It just works. I added some volumes but haven't removed any yet.
As for MinIO and Ceph: I never used Ceph.
But MinIO was the reason to look for alternatives.
Tons of profiles from the JS SDK stored as individual files started to affect my monitoring script, and soon it might start to affect MinIO performance too.
And it is not that easy to scale MinIO. And it is probably impossible to optimize for small-file storage, at least in my low-cost setup.
> let's focus on nodestore first.
If SeaweedFS were to control the TTL, there is another catch: I'm not sure if it is possible to control TTL with the S3 API yet.
weed has its own settings for collections, and it creates a collection for each S3 bucket: https://github.com/seaweedfs/seaweedfs/wiki/S3-API-FAQ#setting-ttl
But if Sentry itself were to clean up old data, I guess there is no difference.
> It is a bit strange sometimes. But it is fine. [...] Other than all that, I have had no problems and have barely touched it after the initial setup. It just works.
Good to know about SeaweedFS.
> As for MinIO and Ceph: I never used Ceph. But MinIO was the reason to look for alternatives. [...]
Ah, so everyone has the same experience with MinIO.
> If SeaweedFS were to control the TTL, there is another catch: I'm not sure if it is possible to control TTL with the S3 API yet. [...] But if Sentry itself were to clean up old data, I guess there is no difference.
The Sentry cleanup job only cleans up the data on the filesystem. If we're using S3, it won't clean up anything; we need to configure S3 data cleanup on our own.
Looks like I missed that SeaweedFS now has the ability to control TTL via the S3 API. And I even linked to the correct section of the FAQ. :)
I'd like to look into a new integration with SeaweedFS.
And by the way, I like the idea of extending the Sentry images.
I myself install some extra packages and modules.
Like maybe an extra step in the install script to build user-provided Dockerfiles.
> And by the way, I like the idea of extending the Sentry images. I myself install some extra packages and modules. Like maybe an extra step in the install script to build user-provided Dockerfiles.
Yes, but I don't think people would go for a non-default setup if they don't need anything special.
This is interesting: very close to MinIO, yet far more lightweight. https://github.com/rustfs/rustfs
I just tried setting things up with RustFS. It didn't work: I can't configure the administrative side of S3. See my PR here: https://github.com/getsentry/self-hosted/pull/3821
@BYK Do you remember why we didn't use MinIO? Was it about the licensing issue?
> @BYK Do you remember why we didn't use MinIO? Was it about the licensing issue?
Both: licensing, and the performance and scalability issues reported by others.
Now, what's missing is the retention days. We can't change the retention days mid-installation; the bucket needs to be recreated.
Integration tests aren't passing. I think we should hold this off for a bit.
@aldy505 I also noticed that we've recently added objectstore. Perhaps it's related? https://github.com/getsentry/sentry/pull/97271
> @aldy505 I also noticed that we've recently added objectstore. Perhaps it's related? getsentry/sentry#97271
@hubertdeng123 I asked Jan last week; it's not being used on SaaS yet. Quoting him:
> Right, we're planning to make this an intermediary layer to some backend - we do not have a strong story for self-hosted yet. We wouldn't be using Postgres for sure; instead we'd offer two alternatives: any S3-compatible backend or raw disk. Our first use case is event attachments, followed by release files and debug files. We do consider replacing nodestore, but it's not on our roadmap yet. It will likely take months to get to the point where we can plan that.
Yay it's green.
One bad thing is that we need to tell people to set aside at least another 16GB for a swapfile.
Ah right, we're missing the lifecycle thing.
Previously, I've seen folks increase the number of retention days. So if we want to keep that behaviour, it wouldn't be possible by setting S3 retention days (and modifying them every once in a while during installation). I can think of adding a cron container, but would that be a sensible thing to do?
@aldy505
> We should have a proper migration path for existing installs
I don't think we have this either?
Config migration is done and is behind a prompt/flag. Next up is to think about how to manage the retention. Should we do it with a cron/scheduled job (cons: probably a heavy process; pros: can be configured dynamically), or an S3 lifecycle (cons: can't be configured dynamically; pros: not a heavy process)?
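For comparison, here is a minimal, hypothetical sketch of the cron/scheduled-job option: scan the bucket and delete objects older than the configured retention. The endpoint, credentials, bucket name, and the SENTRY_RETENTION_DAYS wiring are all assumptions; the full scan it performs is exactly the "heavy process" downside mentioned above.

```python
# Hypothetical sketch of the cron/scheduled-job option; not part of this PR.
# Endpoint, credentials, bucket name, and env var are placeholder assumptions.
import os
from datetime import datetime, timedelta, timezone

import boto3

RETENTION_DAYS = int(os.environ.get("SENTRY_RETENTION_DAYS", "90"))
cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)

s3 = boto3.client(
    "s3",
    endpoint_url="http://garage:3900",
    aws_access_key_id="<access-key>",
    aws_secret_access_key="<secret-key>",
)

# Scan the whole bucket and delete anything older than the cutoff. Each page
# holds at most 1000 keys, which is also the delete_objects batch limit.
# This full scan is what makes the cron approach heavy compared to a
# lifecycle rule evaluated by the storage backend itself.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="nodestore"):
    expired = [
        {"Key": obj["Key"]}
        for obj in page.get("Contents", [])
        if obj["LastModified"] < cutoff
    ]
    if expired:
        s3.delete_objects(Bucket="nodestore", Delete={"Objects": expired})
```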
I'm just gonna go forward with this: https://github.com/seaweedfs/seaweedfs/wiki/S3-API-FAQ#setting-ttl
Great, I believe that's all.
@aminvakil @doc-sheet Hi, would you mind reviewing this PR?
Can we make it opt-in for a release? And get some feedback regarding this change?
@aminvakil You're saying this shouldn't be the default? And we'd probably switch it to the default at a later stage? That works for me.
I'd say it should be the default for fresh installs. And since we prompt people on existing installs, it should be fine?
@aldy505 Yes.
@BYK Yes.
I have zero knowledge about SeaweedFS and very limited knowledge about S3 storage, therefore I'd suggest merging this after the 15th, so people who use the main branch can test it in the different environments they have.
But again, that's because of my limited knowledge of S3 storage, and I do not know how big this change is.
> @aldy505 Yes.
> @BYK Yes.
> I have zero knowledge about SeaweedFS and very limited knowledge about S3 storage, therefore I'd suggest merging this after the 15th, so people who use the main branch can test it in the different environments they have. But again, that's because of my limited knowledge of S3 storage, and I do not know how big this change is.
Great! I'll change it later tonight.