diving-into-pygeoapi
add next generation formats to exercises
Add content to exercises for data publishing.
GeoParquet, Zarr, FlatGeobuf, PyArrow
For GeoParquet, additional dependencies on GDAL are needed. What would be the best approach?
I think pygeoapi itself only requires pyarrow and dependencies like GeoPandas, but the latter depends on GDAL. It looks like pyarrow is not in the pygeoapi Docker image (as requirements-provider.txt is not included).
I hope this issue is not in the wrong repo: in the context of the "Doing Geospatial in Python" workshop meeting yesterday, we talked about including "new/upcoming" formats, if feasible, after investigation. OK, I see there is:
https://github.com/geopython/geopython-workshop/issues/193 ...Open for suggestions...
Did some research on accessing (Geo)Parquet with (geo)arrow. See https://github.com/justb4/parquet-research (mostly in the README.md).
Findings, among others:
- GDAL not required
- pygeoapi Dockerfile only needs to include python3-arrow and GeoPandas to support the Parquet Provider. Pandas is already in the Docker Image. Not sure what GeoPandas adds, 3.5 MB Python, but also depends on Fiona (?).
- We held off including pyarrow as it would add around 110MB. See PR Review using wheels. Maybe python3-arrow is more compact...
- so we could make a simple exercise where we add a Parquet Provider
- Overturemaps Python CLI allows quickly downloading data by BBOX, e.g. from Mostar centre.
- the Python programming examples are more for https://github.com/geopython/geopython-workshop/issues/193
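For the exercise, the BBOX download step could be sketched roughly as below. This is a minimal sketch, not a tested workflow: the Mostar centre coordinates are approximate, the 1 km half-width, the `building` type and the output file name are arbitrary choices, and the helper names `bbox_around` and `overturemaps_command` are hypothetical. The flags follow the `overturemaps download` usage shown in the Overture Maps Python CLI README.

```python
import math
import shlex


def bbox_around(lon, lat, half_width_m):
    """Return (west, south, east, north) in degrees for a box centred on lon/lat."""
    # Rough conversion: one degree of latitude is about 111,320 m;
    # a degree of longitude shrinks with cos(latitude).
    dlat = half_width_m / 111_320.0
    dlon = half_width_m / (111_320.0 * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


def overturemaps_command(bbox, feature_type, output_path):
    """Build the argument list for an `overturemaps download` call."""
    bbox_arg = ",".join(f"{value:.5f}" for value in bbox)
    return [
        "overturemaps", "download",
        f"--bbox={bbox_arg}",
        "-f", "geoparquet",
        f"--type={feature_type}",
        "-o", output_path,
    ]


# Approximate coordinates of Mostar centre (assumption, adjust as needed).
MOSTAR_LON, MOSTAR_LAT = 17.808, 43.343

bbox = bbox_around(MOSTAR_LON, MOSTAR_LAT, half_width_m=1000)
cmd = overturemaps_command(bbox, "building", "mostar-buildings.parquet")

# Print the command instead of running it, so the sketch works without the CLI installed.
print(shlex.join(cmd))
```

With the `overturemaps` package installed, the printed command can be pasted into a shell, or `cmd` can be passed to `subprocess.run(cmd, check=True)`.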
More research: geoarrow-rs (Rust) with Python bindings, by Kyle Barron (@kylebarron) et al., looks like a lighter-weight, modular alternative. We already discussed this with him while working on the pygeoapi Parquet Provider PR. The project is now more mature. The footprint is about 28M, plus 6.8M for arro3, a minimal Apache Arrow dependency package. The other libs are standard and/or already included in the pygeoapi deps, like pyproj.
du -sh /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/*
6.8M /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/arro3
16K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/arro3_core-0.5.1.dist-info
308K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/certifi
24K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/certifi-2025.4.26.dist-info
928K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/charset_normalizer
60K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/charset_normalizer-3.4.2.dist-info
28M /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/geoarrow <=========
16K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/geoarrow_rust_io-0.3.0.dist-info
648K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/idna
28K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/idna-3.10.dist-info
12M /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/pip
104K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/pip-25.1.1.dist-info
18M /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/pyproj
68K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/pyproj-3.7.1.dist-info
472K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/requests
36K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/requests-2.32.3.dist-info
984K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/urllib3
28K /Users/just/.pyenv/versions/geoarrow-rs/lib/python3.12/site-packages/urllib3-2.4.0.dist-info
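To compare footprints like these across candidate packages, the `du -sh` output can be totalled with a few lines of Python. This is only a sketch: the helper names `parse_du_line` and `footprint` are hypothetical, the sample uses shortened paths from the listing above, and since `du -sh` rounds its sizes the total is approximate.

```python
import re

# Multipliers for the unit suffixes printed by `du -sh`.
_UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_du_line(line):
    """Split one `du -sh` line into (size_in_bytes, path)."""
    size, path = line.split(None, 1)
    match = re.fullmatch(r"([\d.]+)([BKMG])?", size)
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit or "B"], path.strip()


def footprint(du_output, package_names):
    """Sum the sizes of the directories whose last path component is in package_names."""
    total = 0.0
    for line in du_output.splitlines():
        if not line.strip():
            continue
        size, path = parse_du_line(line)
        if path.rsplit("/", 1)[-1] in package_names:
            total += size
    return total


# Shortened excerpt of the listing above.
sample = """\
6.8M site-packages/arro3
28M site-packages/geoarrow
18M site-packages/pyproj
"""

total_mib = footprint(sample, {"arro3", "geoarrow"}) / 1024**2
print(f"geoarrow + arro3: about {total_mib:.1f} MiB")
```

pyproj is left out of the total here because it is already among the pygeoapi dependencies.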
2025-07-01
- Zarr: @tomkralidis
- GeoParquet: @justb4
By the way, I just released a new version of the Python bindings to geoarrow-rs, v0.4: https://geoarrow.org/geoarrow-rs/python/latest/