SeaweedFS doesn't seem production ready
### Self-Hosted Version

25.10

### CPU Architecture

x86_64

### Docker Version

not relevant

### Docker Compose Version

not relevant

### Machine Specification

- [x] My system meets the minimum system requirements of Sentry

### Steps to Reproduce

The SeaweedFS docs do not include a full backup/restore strategy: https://github.com/seaweedfs/seaweedfs/wiki/Data-Backup

The existing solution requires downtime: the linked example requires your entire cluster to pause (stop receiving writes) while the volumes are being backed up/transferred.

### Expected Result

Stable local S3 provider

### Actual Result

Unstable local S3 provider

### Event ID

No response
Hi Max! This is a hard topic to discuss, mostly because of licensing issues, but other than that you can see it on this PR. This is roughly why we chose the provider we did, and why we didn't choose the others:
- MinIO
- They use AGPL, whereas we want to use something more permissive (see this philosophy page).
  - Closer to the time we wanted to merge the Nodestore S3 changes, there was some drama. So... obviously no.
- Honestly if it wasn't about the licensing, I would go forward with this.
- Ceph
- Very hard to manage, and I'm not sure whether you can run it on a single node machine (or a single container).
  - They use GPL; not really an issue now, but I don't know whether it could become one in the future.
- Garage (https://garagehq.deuxfleurs.fr/)
  - This was the runner-up to SeaweedFS. The annoying part was the configuration: it needs a config file, and the admin credentials have to be stored in that config file, which is not really secure and requires some Bash magic to work around.
- License is AGPL. Similar case to MinIO, but at least Deuxfleurs is not a company (I think?).
- SeaweedFS
- Licensing is simple, they use Apache-2.0, so no problems.
  - Previously we were skeptical about how we would manage this, since we thought it was similar to Ceph, where you need multiple nodes to get it up and running.
  - On the benchmarks, it looks like this one is better suited for us because it handles many small files better (Nodestore is many small files in a single bucket).
- RustFS (https://github.com/rustfs/rustfs)
- Apache-2.0
  - Still too early; I don't think it's feasible yet.
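To make the Garage gripe above concrete, here is a sketch of the "Bash magic": template the config file and inject the admin token from the environment at startup, so the secret never has to be committed alongside the config. The config keys are taken from the Garage documentation and may vary by version; treat this as an illustration, not a drop-in file.

```shell
#!/usr/bin/env sh
set -eu

# Minimal, hypothetical garage.toml template -- key names follow the Garage
# docs but should be verified against the version you deploy.
cat > garage.toml.tmpl <<'EOF'
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
db_engine = "sqlite"

[admin]
api_bind_addr = "[::]:3903"
admin_token = "__ADMIN_TOKEN__"
EOF

# The "Bash magic": substitute the secret from the environment at startup
# (falling back to a random token here), so credentials stay out of the
# checked-in config file.
ADMIN_TOKEN="${GARAGE_ADMIN_TOKEN:-$(head -c 32 /dev/urandom | base64)}"
sed "s|__ADMIN_TOKEN__|${ADMIN_TOKEN}|" garage.toml.tmpl > garage.toml
```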
Regarding backup/restore, documenting that is on my backlog.
@aldy505 Okay, I understand. Thanks for the detailed response; looking forward to the documentation. Right now, I'm trying to just `tar gz` the whole directory and put it into S3 as a dump (ironic, no? 😄)
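For reference, that whole-directory dump boils down to something like the sketch below. The data directory and bucket name are placeholders, and the S3 upload line is left commented since it assumes the AWS CLI and credentials are set up.

```shell
#!/usr/bin/env sh
set -eu

# Placeholder path -- point this at your actual SeaweedFS data directory.
DATA_DIR="${DATA_DIR:-./seaweedfs-data}"
BACKUP="seaweedfs-dump-$(date +%Y%m%d).tar.gz"

mkdir -p "$DATA_DIR"  # no-op if the directory already exists

# Archive the whole data directory as one compressed dump.
tar czf "$BACKUP" -C "$(dirname "$DATA_DIR")" "$(basename "$DATA_DIR")"

# Upload the dump to S3 (hypothetical bucket; requires the AWS CLI):
# aws s3 cp "$BACKUP" "s3://my-backup-bucket/$BACKUP"
```

Note this is still the downtime-flavored approach: the archive is only consistent if SeaweedFS is not receiving writes while `tar` runs.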
As a site administrator (KDE.org sysadmin) who has been looking into deploying self-hosted S3 services for us, I have a couple of additional items to note here:
- Garage:
  - Experiences scalability issues with a large number of files; see https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/851
  - When deployed as a single instance with its default database, there is a risk of corruption if Garage does not exit cleanly (system crash, etc.); it must be configured to use SQLite instead to protect against this, which reduces performance. See https://garagehq.deuxfleurs.fr/documentation/cookbook/real-world/
- MinIO:
  - Has essentially been declared unmaintained; see https://github.com/minio/minio
  - The owners first stripped the admin UI, and then the pre-built binaries, both within the last few months, so I wouldn't trust them to do any maintenance.
- Ceph:
- Can be run as a single instance, however it requires config changes to override default behavior.
> @aldy505 Okay I understand. Thanks for the detailed response and looking forward to the documentation. Right now, I'm trying to just `tar gz` the whole directory and putting it into S3 as a dump (ironic no? 😄)
@max-wittig Actually, it's not that ironic! At my company, I've been migrating from MinIO to SeaweedFS since last week, and discovered that backing up SeaweedFS to another S3-compatible API is the ideal backup solution: https://github.com/seaweedfs/seaweedfs/wiki/Async-Backup
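For anyone following along, the async backup boils down to configuring an S3 sink in `replication.toml` (scaffolded with `weed scaffold -config=replication`) and then running `weed filer.backup -filer=localhost:8888`. The field names below follow the scaffold output, but double-check them against the SeaweedFS version you run; the key, bucket, and region values are placeholders.

```toml
# replication.toml -- consumed by `weed filer.backup`.
# Field names per `weed scaffold -config=replication`; verify against your version.
[sink.s3]
enabled = true
aws_access_key_id = "YOUR_KEY"          # placeholder
aws_secret_access_key = "YOUR_SECRET"   # placeholder
region = "us-east-1"
bucket = "seaweedfs-backup"             # bucket on the *remote* S3 provider
directory = "/"
endpoint = ""                           # set for non-AWS S3-compatible targets
```

Unlike the tar-the-directory dump, this runs continuously against the filer, so the cluster keeps serving writes during backup.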