Reduce number of containers in use
This should make things faster to install (speeding up e2e tests) and reduce the overall resource usage (fewer instances of operating systems running).
Ideas on reducing container usage:
- Collapse all cron containers into one
- Collapse snuba containers into one (?)
Collapse all cron containers into one
I looked into this earlier, but since each cron needs to run inside the same container image as the service it is cleaning up, this doesn't seem possible. Also, what we do currently is a giant hack. The ideal way to do this is to rely on an external scheduler that just spins up the relevant containers with a different CMD. This is done in single-tenant through some k8s magic, and I think it can be done in Helm too, but not in docker-compose.
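For illustration, the external-scheduler approach would look roughly like this in Kubernetes terms: a CronJob reuses an existing service image and overrides the command, so no long-running cron container is needed. The schedule, image tag, and cleanup flags below are illustrative, not the actual single-tenant configuration.

```yaml
# Sketch only: a Kubernetes CronJob that runs a cleanup with a different
# command instead of keeping a dedicated cron container alive.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sentry-cleanup
spec:
  schedule: "0 0 * * *"            # illustrative schedule
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: cleanup
              image: getsentry/sentry:latest   # reuse the existing service image
              command: ["sentry", "cleanup", "--days", "90"]
```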
A possible alternative is to run a single scheduler cron container, but that requires access to /var/run/docker.sock, which is not always available and opens a can of other worms: it introduces "docker-in-docker", which in practice is just docker accessing the host system, so if you have volume mounts etc. you can get bitten very hard.
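Concretely, that single-scheduler idea would look something like the sketch below, and the caveat is exactly the socket mount. The image choice and crontab contents are assumptions, not an existing self-hosted service.

```yaml
# Sketch only: one cron container that drives sibling containers through the
# host's Docker socket. Requires /var/run/docker.sock, which is the catch.
services:
  cron-scheduler:
    image: docker:cli              # assumed: any image with the docker CLI (and compose plugin)
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock   # hand the host daemon's socket to the container
      - ./crontab:/etc/crontabs/root:ro             # hypothetical crontab, e.g.:
      #   0 0 * * *  docker compose run --rm web sentry cleanup --days 90
    command: ["crond", "-f"]       # run BusyBox cron in the foreground
```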
Collapse snuba containers into one (?)
Relevant Snuba ticket: getsentry/snuba#1670
> A possible alternative is to run a single scheduler cron container, but that requires access to /var/run/docker.sock, which is not always available and opens a can of other worms: it introduces "docker-in-docker", which in practice is just docker accessing the host system, so if you have volume mounts etc. you can get bitten very hard.
My understanding is that this is not docker-in-docker. We would be creating sibling containers, not child containers so we would be able to reference the named volumes we create, and we won't run into a lot of the "volumes are inside our container" problems because the daemon will still be running on the host. I'm not sure when the docker socket wouldn't be available, could you elaborate?
> Relevant Snuba ticket
Thanks this is very useful!
> My understanding is that this is not docker-in-docker. We would be creating sibling containers, not child containers so we would be able to reference the named volumes we create, and we won't run into a lot of the "volumes are inside our container" problems because the daemon will still be running on the host.
So the daemon would be running inside a docker container (not the host I'd say) but I think the rest of what you say is true. I guess it's worth a shot at least?
> I'm not sure when the docker socket wouldn't be available, could you elaborate?
The greatest example is Windows (without WSL). I don't think anyone in their right minds would use a Windows machine to run self-hosted for prod-like purposes but for testing/experimental/learning stuff it may be a hindrance. Might be avoided by using the new "profiles" feature of compose so crons are opt-in (or opt-out?) rather than mandatory.
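For reference, the profiles idea would look something like this (the service name and command are illustrative):

```yaml
# Sketch: tag cron services with a Compose profile so a plain
# `docker compose up` skips them and they become opt-in.
services:
  sentry-cleanup:
    image: getsentry/sentry:latest
    profiles: ["crons"]            # not started unless the profile is activated
    command: ["sentry", "cleanup", "--days", "90"]
# opt in with:  docker compose --profile crons up -d
```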
> So the daemon would be running inside a docker container (not the host I'd say) but I think the rest of what you say is true. I guess it's worth a shot at least?
The daemon would still be running on the host, we'd just be passing the daemon control socket inside a container.
> The greatest example is Windows (without WSL). I don't think anyone in their right minds would use a Windows machine to run self-hosted for prod-like purposes but for testing/experimental/learning stuff it may be a hindrance.
I'm pretty sure that Linux-on-Windows containers do have a docker socket, since it runs a Linux VM in Hyper-V. But self-hosted also relies on various GNU utilities that aren't available on Windows, so that would be a bad time anyway. (In other words I don't think we need to worry about Windows)
> The daemon would still be running on the host, we'd just be passing the daemon control socket inside a container.
Yes yes, you are 💯 right. Not sure what I was thinking while writing this 😅
> I'm pretty sure that Linux-on-Windows containers do have a docker socket, since it runs a Linux VM in Hyper-V. But self-hosted also relies on various GNU utilities that aren't available on Windows, so that would be a bad time anyway. (In other words I don't think we need to worry about Windows)
Thinking again, yes, I think you are right. My memory mixes up WSL1 and WSL2. Since WSL2 runs an actual Linux kernel, there is a real Linux socket that you can forward. That said, even then you need to make sure you turn on the "WSL integration": https://github.com/docker/for-win/issues/5096#issuecomment-551426331
Anyway, no matter what, I feel like what I expressed earlier is mostly FUD and is not really based on real or at least up-to-date testing. Thanks a lot for making me think through and hearing me out.
Happy to help implementing this!
> Anyway, no matter what, I feel like what I expressed earlier is mostly FUD and is not really based on real or at least up-to-date testing. Thanks a lot for making me think through and hearing me out.
No worries! I am all too familiar with the pains of WSL 1 and incompatibility, and have personally been bitten by DIND, so I understand the caution.
> Happy to help implementing this!
Thanks! I'll definitely keep that in mind. I'll try to make a game plan before I start on implementation.
Chatted with @ethanhs on this. Can we fold all snuba containers into one, including the cleanups? If so that would fold 10 down to one, at which point we probably don't care that there are two additional long-running cleanups (symbolicator, sentry).
Second thought would be that we should understand the resource usage profiles of the various containers (snuba and beyond) to make an informed decision about which ones can run on the same box (i.e., which complement each other).
It would be great to simplify everything and make it more standalone/stateless. For example, we shouldn't build/configure everything on startup - it should be configured at build time. And all services should be configured via ENV vars, with no need to change metadata (compose file, docker, etc.).
As it stands, I can't migrate everything into my own infrastructure (for ex. with Ansible, Traefik, Postgres and other services) - I have to describe everything from scratch.
So reducing the number of containers and runtime configuration would significantly simplify life for many people :)
Hrm, that sounds like a new ticket.
Any updates here?
> Any updates here?
@Actticus only the "errors only mode" (docs: https://develop.sentry.dev/self-hosted/#errors-only-self-hosted-sentry). Other than that, nope.
Hi team, if I plan to separate out the various containers from the self-hosted Sentry docker-compose, which stateful components should I separate out so that I don't have any data loss even if Sentry is scaled or restarted? I have already separated out Postgres and am using AWS RDS for it.
> Hi team, if I plan to separate out the various containers from the self-hosted Sentry docker-compose, which stateful components should I separate out so that I don't have any data loss even if Sentry is scaled or restarted? I have already separated out Postgres and am using AWS RDS for it.
@rahulbalubankar You can separate everything, but it's going to be complicated. For external Kafka, see https://github.com/getsentry/sentry-docs/pull/11847/files; for ClickHouse, see https://github.com/getsentry/self-hosted/issues/3339#issuecomment-2372853203
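As a rough sketch, an override file pointing Snuba at external services might look like the following. The variable names match what the stock compose file sets for the bundled services, but the exact settings differ per service and version, so treat this as a starting point and follow the linked docs.

```yaml
# docker-compose.override.yml sketch: point Snuba at external stateful
# services instead of the bundled containers. Hostnames are placeholders.
services:
  snuba-api:
    environment:
      CLICKHOUSE_HOST: clickhouse.internal.example.com   # external ClickHouse
      DEFAULT_BROKERS: kafka.internal.example.com:9092   # external Kafka
```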
I've been thinking maybe the simplest way to get the container count down is to just cut the ones that don't add value. If the end goal is a self-hosted install that isn't sprawling, we could talk about actually removing whole pieces:
- Snuba/Kafka can go if you don't need perf data or real-time queries. Plain Postgres covers error events fine for small-to-mid installs.
- The cron containers don't need to be a fleet; one lightweight scheduler could run periodic jobs directly (see the sketch after this list).
- Symbolicator and similar helpers could be optional instead of mandatory.
- With just Postgres + Redis as stateful stores, everything else becomes pluggable rather than always-on.
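A sketch of what that single scheduler could look like, assuming a hypothetical image that bundles every job's dependencies (the reply below explains why producing such an image is the hard part):

```yaml
# Hypothetical only: one scheduler service for all periodic jobs. The image
# name is invented; today each cron must run from its own service's image.
services:
  scheduler:
    image: sentry-omni:latest      # invented omni image with all cron binaries
    command: ["crond", "-f"]
    # illustrative crontab entries the image would carry:
    #   0 */6 * * *  sentry cleanup --days 90
    #   0 */6 * * *  snuba cleanup --dry-run False
```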
I've been running Bugsink this way. The install gives you a DB and a worker, nothing more, and it's been a relief not to carry the Kafka/Snuba baggage. Different product, sure, but it proves you can strip error tracking down if you start from that assumption. Any chance self-hosted Sentry will take a similar route? I would be willing to explore working towards it, but not if sentry-the-company will never accept the PR anyway.
@karelvandiepen thanks a lot for the thoughtful suggestions! The main issue with many containers is that each consumer requires its own container. This is an issue we are actively working on. Your suggestion regarding a single cron container is what I initially wanted, but unfortunately we cannot run "tasks" using other services in Docker Compose. Since each of the crons needs to run in its respective container, the only two options I'm aware of are as follows:
- Expose the `/var/run/docker.sock` socket to the cron image so it can run `docker compose <service>` commands itself. This limits the places and methods people can deploy self-hosted. It also requires elevated privileges if I'm not mistaken.
- Find a way to create an "omni image" which has the dependencies and binaries from all the services that have a cron. We actually tried to create an omnibus self-hosted image like this last year as part of a hackweek, and let's just say it was not a pleasant experience to build or try to change/maintain.
I think the main difference between Bugsink and Sentry is that one explicitly focuses on a much narrower use case, self-hosted error tracking, whereas Sentry is a SaaS performance platform first. That said, we have a similar offering called "errors-only": https://develop.sentry.dev/self-hosted/experimental/errors-only/