topojson
Excessive memory usage with prequantization enabled
I am primarily posting this issue for future people facing a similar problem.
In my case, with the prequantize option enabled (the default), the toposimplify method consumes 25 GB of memory. With prequantize disabled, memory usage peaks at just 5 GB. I use shapely for simplification.
Reproduction steps
- Download both parts of the archive: countries1.zip countries2.zip
- Combine the archives:
    cat countries1.zip countries2.zip > countries.zip
- Unzip it.
- Execute the following Python code snippet:
    import json
    import topojson as tp
    from shapely.geometry import shape

    with open('countries.geojson', 'rb') as f:
        features = json.load(f)['features']
    countries_geoms = [shape(f['geometry']) for f in features]
    topo = tp.Topology(countries_geoms)  # prequantize is enabled by default
    topo.toposimplify(0.00001, inplace=True)
- Monitor memory usage.
- To resolve the issue, replace the line topo = tp.Topology(countries_geoms) with:
    topo = tp.Topology(countries_geoms, prequantize=False)
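For the "monitor memory usage" step, here is a minimal sketch of one way to record the peak from inside the script itself instead of watching a system monitor. It assumes a Unix-like system (the standard-library resource module is not available on Windows). The helper names peak_rss_mib and simplify_countries are made up for this example and are not part of topojson.

```python
import json
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def simplify_countries(path, prequantize):
    """Build and simplify the topology as in the reproduction steps."""
    # Imported lazily so the memory helper works without these packages.
    import topojson as tp
    from shapely.geometry import shape

    with open(path, "rb") as f:
        features = json.load(f)["features"]
    geoms = [shape(feat["geometry"]) for feat in features]
    topo = tp.Topology(geoms, prequantize=prequantize)
    topo.toposimplify(0.00001, inplace=True)
    return topo


# Example (run once per setting, each in a fresh interpreter, because
# ru_maxrss is a high-water mark for the whole process):
#   simplify_countries("countries.geojson", prequantize=False)
#   print(f"peak RSS: {peak_rss_mib():.0f} MiB")
```

Because the value is a process-wide high-water mark, the two settings should be measured in separate runs. Measuring both in one process would only report the larger of the two.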
By the way, should prequantization be enabled by default? I personally find it odd that the library performs certain calculations by default, even if they don't apply to my use case and don't provide any benefit. I can only understand such default behavior if it benefits everyone. Otherwise, this should be an opt-in operation (the same as simplification is opt-in).