parquet in cache

Open knaaptime opened this issue 2 years ago • 5 comments

Once the files are read into memory, what do you think about caching them as parquet files instead of shapefiles? That would make IO much faster and the footprint on disk a lot smaller. If you're into it, it wouldn't take much to add a conditional, and I could send a PR. No worries if you'd rather keep things simple though :)

(been using that approach over in geosnap since well before the geoparquet spec existed and it works really well!)

knaaptime avatar Jan 25 '23 17:01 knaaptime

I like this idea! Honestly, I did it the way I did because that's how I did it in tigris. Would there be a way to make parquet a global option that users could set? I have seen some problems with people installing arrow on some platforms. Have you observed this in geosnap?

walkerke avatar Jan 26 '23 20:01 walkerke

sweet.

Yeah, I was thinking of something like a try/except on the arrow import, falling back to shapefile if the import fails. For setting a global option, it would probably be straightforward to add an optional config file to the cache (or something similar) that stores global user preferences?
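A rough sketch of what that could look like, not pygris's actual code: the names `HAS_ARROW`, `preferred_format`, `cache_format`, and `write_to_cache`, the `config.json` file name, and the `cache_format` key are all made up for illustration. It only assumes that the cached object is a geopandas `GeoDataFrame`, which has `to_parquet` and `to_file` methods.

```python
import json
from pathlib import Path
from typing import Optional

# try/except on the arrow import: parquet is only usable if pyarrow imports.
try:
    import pyarrow  # noqa: F401
    HAS_ARROW = True
except ImportError:
    HAS_ARROW = False


def preferred_format(cache_dir: Path) -> str:
    """Read the preferred cache format from an optional config file.

    The file name and key are hypothetical; parquet is the default here.
    """
    config_path = Path(cache_dir) / "config.json"
    if config_path.exists():
        settings = json.loads(config_path.read_text())
        return settings.get("cache_format", "parquet")
    return "parquet"


def cache_format(cache_dir: Path, has_arrow: Optional[bool] = None) -> str:
    """Return 'parquet' if it is both preferred and usable, else 'shapefile'."""
    if has_arrow is None:
        has_arrow = HAS_ARROW
    wanted = preferred_format(cache_dir)
    return "parquet" if wanted == "parquet" and has_arrow else "shapefile"


def write_to_cache(gdf, cache_dir: Path, name: str) -> Path:
    """Write a GeoDataFrame to the cache in whichever format was selected."""
    cache_dir = Path(cache_dir)
    if cache_format(cache_dir) == "parquet":
        path = cache_dir / f"{name}.parquet"
        gdf.to_parquet(path)
    else:
        path = cache_dir / f"{name}.shp"
        gdf.to_file(path)
    return path
```

With something like this, a user who hits arrow install problems could either leave pyarrow uninstalled (the fallback kicks in automatically) or put `{"cache_format": "shapefile"}` in the config file to opt out explicitly.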

knaaptime avatar Jan 30 '23 21:01 knaaptime

tbh, I don't think I've ever personally seen an install issue before, but I'm sure they're out there. Given the GDAL deps and such, the CI installs everything from conda (and I do the same when I teach workshops), so it usually has no problem solving the deps for any kind of environment.

knaaptime avatar Jan 30 '23 21:01 knaaptime