Joris Van den Bossche

844 comments by Joris Van den Bossche

@jnh5y I am not aware of a good description / docs about GEOS' memory model, and I am also not an expert on GEOS' inner details. So probably the best...

Even when there are no exercises, you can remove the content so people have to follow along by running the code. Of course, if you want them to focus mainly...

I suppose we could actually even start with a `dask_geopandas.read_file` that only supports this use case, as it seems simpler than chunking one file (#11). In dask the logic behind...

Yeah, so the problem is that the *dataframe* has "geometry" dtype, while what we actually write is "object" dtype (binary type in arrow/parquet). I ran into essentially the same problem...
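
To make the dtype mismatch concrete, here is a minimal sketch using only the Python standard library. It assumes the binary values are WKB (the comment above does not say which encoding is meant), handles points only, and the `column_metadata` dictionary is a hypothetical stand-in for whatever side channel records that a column holds geometries. It is not how any particular library implements this.

```python
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (byte order 1, geometry type 1)."""
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(data: bytes) -> tuple:
    """Decode a little-endian WKB point back into an (x, y) tuple."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", data)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian WKB points")
    return (x, y)


# What ends up in storage is a column of opaque bytes ...
column = [point_to_wkb(1.0, 2.0), point_to_wkb(3.5, -4.25)]
stored_types = {type(value).__name__ for value in column}

# ... so the fact that these bytes are geometries has to travel separately.
# Hypothetical metadata layout, for illustration only.
column_metadata = {"geometry_columns": ["geometry"], "encoding": "WKB"}

# A reader that knows about the metadata can restore the geometries.
decoded = [wkb_to_point(value) for value in column]
```

Without the metadata, a reader sees only a binary column, which is the gap between the in-memory "geometry" dtype and the "object"/binary type on disk.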

From the point of view of the storage (Arrow memory layout, Parquet file format), missingness in general is certainly supported, I think, and several missingness levels can also be possible...

> Theoretically the innermost child array (a big flat buffer of doubles) containing coordinate values can also be nullable and have null elements, but I think here NaN would be...
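
The difference between a null element and a NaN value in a flat coordinate buffer can be sketched in plain Python. This imitates the Arrow idea of a values buffer plus a separate validity mask; it is not pyarrow, the mask here is a list of booleans rather than a bit-packed bitmap, and the sample values are made up.

```python
import math
from array import array

# Flat buffer of doubles, as the innermost child array would hold coordinates.
values = array("d", [0.0, 1.0, math.nan, 3.0, 0.0, 0.0])

# Separate validity mask: False marks a null slot. The value stored in a
# null slot (0.0 here) is a placeholder and carries no meaning.
validity = [True, True, True, True, False, False]


def is_null(i: int) -> bool:
    """A slot is null when the validity mask says so, whatever the buffer holds."""
    return not validity[i]


def is_nan(i: int) -> bool:
    """A slot is NaN when it is valid and the stored double is NaN."""
    return validity[i] and math.isnan(values[i])


null_count = sum(1 for i in range(len(values)) if is_null(i))
nan_count = sum(1 for i in range(len(values)) if is_nan(i))
```

Here the third slot is a valid element whose value is NaN, while the last two slots are null, so the two notions of missingness are tracked independently.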

Some time ago I diagnosed an issue with the fiona + arrow combination (https://github.com/conda-forge/gdal-feedstock/issues/592) where importing fiona messed up some symbols for pyarrow. But that was the other way around,...

One question that comes up here (and it's the same question for spatially repartitioning a dask.dataframe to conform to given regions): how to deal with possible duplicates? If you specify...
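
The duplicate problem can be illustrated with a small, self-contained sketch. It uses plain bounding boxes instead of real geometries, and the region and feature names are invented for the example. Assigning by representative point (the box center, with half-open region bounds) is just one possible way to avoid duplicates, not necessarily the approach discussed in the thread.

```python
from typing import NamedTuple


class Box(NamedTuple):
    minx: float
    miny: float
    maxx: float
    maxy: float


def intersects(a: Box, b: Box) -> bool:
    """True when the two boxes overlap or touch."""
    return (a.minx <= b.maxx and b.minx <= a.maxx
            and a.miny <= b.maxy and b.miny <= a.maxy)


def center(box: Box) -> tuple:
    """Representative point of a box: its midpoint."""
    return ((box.minx + box.maxx) / 2, (box.miny + box.maxy) / 2)


def contains_point(region: Box, x: float, y: float) -> bool:
    """Half-open containment, so a point on a shared edge lands in one region only."""
    return region.minx <= x < region.maxx and region.miny <= y < region.maxy


# Two adjacent regions and three features; "b" straddles the shared border.
regions = {"west": Box(0, 0, 10, 10), "east": Box(10, 0, 20, 10)}
features = {"a": Box(1, 1, 2, 2), "b": Box(9, 4, 11, 6), "c": Box(15, 5, 16, 6)}

# Assigning by intersection puts "b" into both partitions (a duplicate).
by_intersection = {
    name: sorted(f for f, fb in features.items() if intersects(fb, rb))
    for name, rb in regions.items()
}

# Assigning by representative point puts every feature into exactly one partition.
by_center = {
    name: sorted(f for f, fb in features.items() if contains_point(rb, *center(fb)))
    for name, rb in regions.items()
}
```

With intersection-based assignment, feature "b" appears in both "west" and "east"; with center-based assignment it appears only in "east", at the cost that a partition may hold geometries extending beyond its region.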

> I am just afraid that it can be expensive. When starting from an existing dask.dataframe (not necessarily in memory, can also be backed by reading in from eg parquet),...

> Are we actually able to put this logic into the SQL query? If we can*not* (and thus end up doing the "dumb reading from PostGIS and filtering/repartitioning on the dask side"),...