hale Add TopoJSON Writer

TopoJSON is an extension of GeoJSON that encodes topology for LineString and Polygon data. Rather than representing each geometry discretely, TopoJSON stitches geometries together from shared line segments called arcs. It is most effective for coverage-style datasets consisting of many polygons that touch each other, such as administrative units or cadastral parcels.

In hale studio, a minimal implementation would write every geometry with its own, non-shared arcs. A more advanced implementation would detect shared arcs, encode each one once, and reference it from every feature whose geometry includes it.
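To illustrate the difference between non-shared and shared arcs, here is a minimal sketch in Python. It is not how hale studio or the libraries linked in this issue work: it naively treats every two-point segment as its own arc, stores each segment once, and references a segment traversed in the opposite direction with the one's complement of its index, which is how TopoJSON marks a reversed arc. The function name `build_topology` and the input layout are assumptions made for this example.

```python
def build_topology(polygons):
    """Build a TopoJSON-like Topology from polygons.

    polygons: dict mapping a name to a list of rings; each ring is a
    closed list of (x, y) tuples. Hypothetical input layout.
    """
    arcs = []    # every arc is a two-point segment [[x0, y0], [x1, y1]]
    index = {}   # (start, end) -> position of the segment in `arcs`
    objects = {}
    for name, rings in polygons.items():
        ring_refs = []
        for ring in rings:
            refs = []
            for a, b in zip(ring, ring[1:]):
                if (a, b) in index:
                    # Same segment, same direction: reuse the arc.
                    refs.append(index[(a, b)])
                elif (b, a) in index:
                    # Same segment, opposite direction: ~i means
                    # "arc i, reversed" in TopoJSON.
                    refs.append(~index[(b, a)])
                else:
                    # First occurrence: store the segment once.
                    index[(a, b)] = len(arcs)
                    arcs.append([list(a), list(b)])
                    refs.append(index[(a, b)])
            ring_refs.append(refs)
        objects[name] = {"type": "Polygon", "arcs": ring_refs}
    return {"type": "Topology", "objects": objects, "arcs": arcs}


# Two unit squares that share the edge between (1, 0) and (1, 1).
squares = {
    "left": [[(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]],
    "right": [[(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]],
}
topology = build_topology(squares)
```

With non-shared arcs the two squares would need eight segments. Here the shared edge is stored once, so `topology["arcs"]` holds seven, and the right square refers to the shared edge as `-2` (the reverse of arc 1). A real writer would merge consecutive segments into longer arcs that are cut only at junctions.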

There are several implementations of such tools available in Java or JS:

https://github.com/bouviervj/topojson-j
https://github.com/topojson/topojson

This issue was created for a customer project. Additional requirements and test data will be made available in an internal ticket.

Nov 16 '21 16:11 thorsten-reitz

@stempler would it also be possible/easier to implement this support for the new/existing JSON IO component of hale studio?

Aug 29 '22 09:08 thorsten-reitz

@thorsten-reitz An initial version of this was shipped with 5.0, I believe. Would you say that this can be closed?

Jan 11 '23 11:01 florianesser

@florianesser I haven't really tested that. Will do so now.

Jan 11 '23 11:01 thorsten-reitz

It doesn't seem to be working well. Here is my test case:

1. Load the Basic Hydrography Example
2. Export the transformed data
3. Inspect the data in an editor

What I see is that all the non-spatial string-type attributes are incorrectly encoded. Here is an example:

"identifier":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
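To find out which attributes of an exported file are affected, a small script can walk the parsed JSON and report every string value that contains NUL characters. This is only a sketch for inspecting the output, not part of hale studio. The function name `find_nul_strings` and the sample document are made up for this example.

```python
import json


def find_nul_strings(node, path=""):
    """Return (path, length) for every string value containing NUL."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_nul_strings(value, f"{path}/{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            hits += find_nul_strings(value, f"{path}/{i}")
    elif isinstance(node, str) and "\x00" in node:
        hits.append((path, len(node)))
    return hits


# Hypothetical sample: one property padded with NULs, one normal value.
sample = json.loads(
    '{"properties": {"identifier": "\\u0000\\u0000", "name": "ok"}}'
)
hits = find_nul_strings(sample)
```

For an actual export, the input would come from `json.load` on the file written by hale studio, and the reported paths show which properties carry the padding.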

Jan 11 '23 14:01 thorsten-reitz

@thorsten-reitz Yes, that's one of the limitations of the current implementation (see related commit message).

Jan 11 '23 14:01 florianesser

@florianesser Ah OK, that led to the dataset being hugely inflated (about 80% of the whole file consisted of such markers) and even unreadable in Notepad++. TBH I would consider that a blocker for closing this ticket. Empty fields are quite common.

Jan 11 '23 15:01 thorsten-reitz

@thorsten-reitz We are grooming this issue. Do you have an example of non-spatial string attributes that are causing this issue?

Aug 31 '23 08:08 Kate-Lyndegaard
