Excessive memory usage with prequantization enabled

Open Zaczero opened this issue 7 months ago • 5 comments

I am primarily posting this issue for future people facing a similar problem.

In my case, with the prequantize option enabled (the default setting), the toposimplify method consumes 25 GB of memory. With prequantize disabled, memory usage peaks at just 5 GB. I use shapely for simplification.


Reproduction steps

  1. Download both parts of the archive: countries1.zip countries2.zip

  2. Combine the archives:

cat countries1.zip countries2.zip > countries.zip

  3. Unzip it.

  4. Execute the following Python code snippet:

import json
from shapely.geometry import shape
import topojson as tp

# Load the GeoJSON features and convert each geometry to a shapely object.
with open('countries.geojson', 'rb') as f:
    features = json.load(f)['features']
countries_geoms = [shape(f['geometry']) for f in features]
# Build the topology (prequantize is enabled by default), then simplify in place.
topo = tp.Topology(countries_geoms)
topo.toposimplify(0.00001, inplace=True)

  5. Monitor memory usage.

  6. To resolve the issue, replace the topo assignment with:

topo = tp.Topology(countries_geoms, prequantize=False)
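
For the "monitor memory usage" step, one option is to read the process's peak resident set size from Python's standard library instead of watching a system monitor. This is only a rough sketch: the helper names `peak_rss_mib` and `measure_peak` are made up for illustration, it relies on the Unix-only `resource` module, and the workload in the `__main__` block is a stand-in allocation, not the topojson call.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_peak(fn, *args, **kwargs):
    """Call fn and return (result, peak RSS in MiB observed so far)."""
    result = fn(*args, **kwargs)
    return result, peak_rss_mib()


if __name__ == "__main__":
    # Stand-in workload (assumption): allocate roughly 50 MiB.
    # Swap in the Topology(...) and toposimplify(...) calls from the steps above.
    _, peak = measure_peak(lambda: bytearray(50 * 1024 * 1024))
    print(f"peak RSS: {peak:.0f} MiB")
```

The peak value is a high-water mark for the whole process and never goes down, so the prequantize=True and prequantize=False variants should each be run in a fresh interpreter for the two numbers to be comparable.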

By the way, should prequantization be enabled by default? I personally find it odd that the library performs certain calculations by default, even if they don't apply to my use case and don't provide any benefit. I can only understand such default behavior if it benefits everyone. Otherwise, this should be an opt-in operation (the same as simplification is opt-in).

Zaczero, Nov 30 '23 23:11