Implementation questions
Hi! I'm porting the syncing logic to Nix so that I can deploy it declaratively, which has led me to dig through your code. I have a few questions:
- When downloading the tar.gz, I think we could lower the required disk space by piping `curl` directly into `unpigz` instead of saving to the disk first. It should lower requirements by about 90GB. (Rough sketch of what I mean after this list.)
  - 1a. The 300GB requirement in `self-host.md` seems a bit low to me. The two copies of tiles we're saving are 300GB alone, but when we download a new copy it's going to take another 150GB (on top of the 90GB for the temp file).
- Why do we serve two copies of the tiles at a given time? Is it so that if someone leaves a client open, or the index JSON is cached, they won't hit now-broken links? Just want to make sure I understand this correctly.
  - 2a. If yes, maybe we could delete the oldest version before we download the new version? Seems like we could truly have a 300GB disk requirement at that point.
- Could you help me wrap my head around the assets and sprites a bit? Where do they come from? Are they specific to each weekly tile generation or could they be deployed once? Just wondering if I could wrap these up as part of my deployment process instead of downloading them. One of the things I like about self-hosting is that I don't have to worry too much about things breaking randomly, and downloading a bunch of stuff regularly increases this risk. If I could avoid it, that would be ideal.
- Is there a reason why we have Python calling Python here? Couldn't we just call it like a normal method?
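To make the first point concrete, this is roughly the pipeline I have in mind. It's an untested sketch; the URL and extraction directory are placeholders, not the repo's actual paths:

```python
import os
import subprocess

URL = "https://example.com/tiles/planet.tar.gz"  # placeholder, not the real bucket URL
DEST = "/srv/tiles/extract"                      # placeholder extraction directory

os.makedirs(DEST, exist_ok=True)

# Stream the archive: curl -> unpigz -> tar, so the .tar.gz never touches disk.
curl = subprocess.Popen(
    ["curl", "--fail", "--silent", "--show-error", "--location", URL],
    stdout=subprocess.PIPE,
)
unpigz = subprocess.Popen(["unpigz", "-c"], stdin=curl.stdout, stdout=subprocess.PIPE)
curl.stdout.close()  # so curl gets SIGPIPE if unpigz exits early
tar = subprocess.Popen(["tar", "-x", "-f", "-", "-C", DEST], stdin=unpigz.stdout)
unpigz.stdout.close()

tar.wait()
curl.wait()
unpigz.wait()
if any(p.returncode != 0 for p in (curl, unpigz, tar)):
    raise RuntimeError("streaming download/extract failed")
```

The obvious downside is that a mid-stream failure means restarting the whole download.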
Thanks, and awesome project btw! I have a server with >128GB of RAM and 4TB of flash storage that I'm hoping to retire as a result of this. I would definitely be happy to merge in the Nix changes if you're interested, or keep it separate if you'd prefer.
Hi! Thanks for the nice insights, really appreciated!
- Yes, unpigz could work. The reason for saving to disk is mostly so that I can use aria and parallel downloads, as Cloudflare buckets are really slow and unreliable. Usually 8 parts are started and then only 1-2 survive by the end of the download. Something funky is going on in Cloudflare bucket hosting, but it's outside of this project. Even with aria it'll fail from time to time, but the cron-job nature of this project doesn't really care about some failures; eventually it'll get it. But you are right, this is one part where disk space requirements can be saved.
- Two copies are required so that the HTTP server is always functioning. Downloading + uncompressing can take hours, and we cannot have downtime during that time. Actually, how this works is: the tiles are generated on Wednesday, they are synced on Thursday and they are activated on Saturday. So there are 2-3 days when 2 versions are needed on the server. The other, practical reason for having the previous version is rollback: if there is a bug in the map data, we can manually change the version back and the servers roll back in a few seconds.
- Assets and sprites come from this repo: https://github.com/hyperknot/openfreemap-styles. They rarely change.
- I just wanted to make this a standalone script. We could change it to a hybrid one, I mean an `if __name__ == "__main__"` block could probably handle this case.
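Something like this is what I mean by a hybrid (just a sketch, the function name is made up, not what's in the repo):

```python
import sys


def run_sync(area: str) -> None:
    """Placeholder for the actual download/sync logic."""
    ...


if __name__ == "__main__":
    # still usable as a standalone script: python sync.py <area>
    run_sync(sys.argv[1])
```

Other Python code could then import and call `run_sync` directly instead of launching a second interpreter.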
Please put the Nix-specific things in a separate repo for a start; I'm happy to have a look but prefer not to integrate it at this point. If you have improvements to this repo, feel free to add a PR of course!
Hey, thanks for your quick response! Just have a few followups before I go to bed.
- Makes sense. You're using unpigz in the repo as far as I can tell - I assume the aria thing is a local change?
- Seems like you could do the following flow (rough sketch after this list):
  - Say we're currently serving version A and version B, and version C becomes available
  - First, delete version A (version B is still accessible)
  - Second, download/set up version C
  - Third, switch over the metadata or whatever so that we're now pointing at version C

  As far as I can tell there shouldn't be any downtime here, no matter how long it takes to set up version C.
- Just to confirm, do these assets get used during tile generation at all? Just wondering if there's a possibility that a new tile set could require the assets to be updated or cause weird rendering.
- Makes sense. It looks like you're already doing the `if __name__` thing, but maybe click causes issues.
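For the second bullet, something along these lines is what I'm picturing. It's only a sketch, and the directory layout plus the `current` symlink are my assumptions, not how your repo actually tracks versions:

```python
import shutil
from pathlib import Path

RUNS = Path("/srv/tiles")  # assumption: one sub-directory per tile version


def rotate(new_version: str, download_and_extract) -> None:
    # Version directories, oldest first (assumes the names sort chronologically).
    versions = sorted(
        p.name for p in RUNS.iterdir() if p.is_dir() and not p.is_symlink()
    )

    # 1. Delete the oldest version; the newer one keeps being served.
    if len(versions) > 1:
        shutil.rmtree(RUNS / versions[0])

    # 2. Download/extract the new version next to the one still in use.
    download_and_extract(RUNS / new_version)

    # 3. Flip the pointer only once the new version is fully on disk.
    tmp = RUNS / "current.new"
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    tmp.symlink_to(RUNS / new_version)
    tmp.replace(RUNS / "current")  # atomic rename, so no downtime window
```

With that ordering, peak usage should be one full copy plus whatever the download temporarily needs, and version B keeps being served the whole time.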
Thanks again!
- I mean I'm using it, but with a file on disk, not with a pipe. For this repo, I really prefer to keep the temporary file + uncompress step; I believe it's more reliable like this. But if you make a Nix version of this, feel free to use a pipe.
- You are correct, right now it's 2 permanent copies + 1 temporary. You could lower it to 1 permanent + 1 temporary. I just like the fact that we can roll back to 2-3 weeks ago at any time. Again, if you make a Nix version of this, you can lower it to 1 permanent copy. For the production setup I prefer to keep it like this.
- No, those assets are only used on the client / MapLibre side; they could even be hosted in a public bucket if we really wanted.
- Yes, it should work.
Thanks, makes sense!
One more question, sorry. Why do we keep the old sprites around when we download new ones?
Oh, yes, that one is special, because they are versioned in the filename. If someone creates a custom style using these links, they'll be stuck on the old version of a sprite forever. So we have to keep downloading all sprite versions and keep them forever. They are tiny, so it's not a bother.
If you are only targeting a personal use case and are sure that the sprites can be fixed, you don't need to worry about old versions.
I see, makes sense. In that case, should we be keeping all previous sprites in the Git repo? Otherwise it seems a bit weird to me to expect the server to hold onto them. Ideally I think the server should mirror the Git repo 1:1.
You are right, but it is actually kept in the repo:
https://github.com/hyperknot/openfreemap-styles/tree/main/sprites/sprites
f384 is the only version we have currently.
Awesome, thanks!
Alright, I have something decently usable now: https://gitlab.scd31.com/stephen/openfreemap-nix
I noticed that the download failed a lot due to the connection getting reset. I am wondering if there's some kind of timeout I'm hitting on your end - I'm running this in a colo and my connection is normally extremely solid.
The nginx config lives in `nix/nixos.nix` and the packaging (including the download script) lives in `packages`. I did change some of the logic around. I think the most significant thing is that the two copies of the tiles are mounted at the same time and are served by nginx at paths that include the version. This should prevent issues around cache invalidation (imagine if a client had cached some tiles from an older version while also requesting new tiles: they could have discontinuities between them).