stactools
New command: `stac archive <href> <outfile>`
As brought up in Gitter by @schwehr, it would be a neat feature to crawl an entire catalog/collection, make all hrefs relative, and save it in an archive format (e.g. zip, tarball, etc).
Discussion is here: https://gitter.im/SpatioTemporal-Asset-Catalog/Lobby?at=62eaf7427ccf6b6d45c17803
Quoting myself:
I was playing with pystac_client earlier this week with Earth Engine's catalog. It's hard not to notice how long it takes to load the entire catalog (more than 90 seconds), which is only 28MB total. Aside from having a STAC API setup (which I would like to do), I was wondering what people would think of having a zip of the catalog next to the full tree (and/or .tar.gz, .tar.bz2, .tar.xz, since they are even smaller... 640K for the xz)? This would not replace the regular STAC tree. Pulling a single file that small from GCS is pretty fast.
Thanks Pete for the feedback! async would definitely help too. I think the one thing I need to do is make sure I rewrite the JSON inside the zip so all hrefs are relative. Otherwise clients are likely to go right back to the separate files once they see the top-level catalog with links to the children as separate, independent JSON files via HTTP.
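To make the idea concrete, here is a minimal stdlib-only sketch of what the proposed `stac archive` command could do: walk a local STAC tree, rewrite any absolute link hrefs to be relative to the file that contains them, and write everything into a single zip. The function name `archive_catalog` and the stdlib-only approach are assumptions for illustration; a real implementation would likely lean on pystac (e.g. `normalize_hrefs` and `CatalogType.SELF_CONTAINED`) instead of rewriting links by hand.

```python
import json
import os
import zipfile
from pathlib import Path

def archive_catalog(root_dir: str, outfile: str) -> None:
    """Hypothetical sketch of `stac archive <href> <outfile>` for a
    local catalog: rewrite absolute hrefs to relative ones and pack
    the whole tree into a single zip archive."""
    root = Path(root_dir).resolve()
    with zipfile.ZipFile(outfile, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*.json")):
            doc = json.loads(path.read_text())
            for link in doc.get("links", []):
                href = link.get("href", "")
                if os.path.isabs(href):
                    # Make the href relative to the file that holds it,
                    # so the archived tree is self-contained.
                    link["href"] = os.path.relpath(href, start=path.parent)
            # Store each file at its path relative to the catalog root.
            zf.writestr(str(path.relative_to(root)), json.dumps(doc, indent=2))
```

A real command would also need to rewrite asset hrefs and handle remote (HTTP) catalogs, but the core idea is the same: no absolute links should survive into the archive.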
Does the existing stac-fastapi server have HTTP compression (gzip, etc.) enabled?
I don't know, I'd recommend asking over there: https://github.com/stac-utils/stac-fastapi/issues