
Add TopoJSON Writer

Open thorsten-reitz opened this issue 3 years ago • 1 comments

TopoJSON is an extension of GeoJSON that encodes topology for LineString and Polygon data. Rather than representing geometries discretely, geometries in TopoJSON files are stitched together from shared line segments called arcs. TopoJSON is most effective for coverage-style datasets consisting of many polygons that touch each other, like administrative units or cadastral parcels.

In hale studio, a minimal implementation would write only non-shared arcs, so that every geometry gets its own arcs. A more advanced implementation would detect shared arcs, encode each of them once, and then reference them from every feature where they form part of the geometry.
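To make the "non-shared arcs" variant concrete, here is a minimal sketch in Python. It is not the hale studio writer; the function name `to_topology` and the object name `features` are made up for illustration. It assumes GeoJSON-like input features and writes an unquantized topology (no `transform`), so arc positions stay absolute coordinates.

```python
def to_topology(features, object_name="features"):
    """Encode GeoJSON-like features as an unquantized TopoJSON topology.

    Minimal variant: every LineString and every polygon ring becomes its
    own arc. No shared boundaries are detected, so touching polygons
    still store their common edge twice.
    """
    arcs = []        # top-level arc list of the topology
    geometries = []  # geometry objects that reference arcs by index

    def add_arc(coords):
        # Store the coordinate sequence as a new arc and return its index.
        arcs.append([list(position) for position in coords])
        return len(arcs) - 1

    for feature in features:
        geom = feature["geometry"]
        if geom["type"] == "LineString":
            # A LineString references a flat list of arc indexes.
            arc_refs = [add_arc(geom["coordinates"])]
        elif geom["type"] == "Polygon":
            # A Polygon references one list of arc indexes per ring.
            arc_refs = [[add_arc(ring)] for ring in geom["coordinates"]]
        else:
            raise ValueError("unsupported geometry type: " + geom["type"])
        geometries.append({
            "type": geom["type"],
            "arcs": arc_refs,
            "properties": feature.get("properties") or {},
        })

    return {
        "type": "Topology",
        "objects": {
            object_name: {"type": "GeometryCollection", "geometries": geometries}
        },
        "arcs": arcs,
    }
```

The advanced variant would replace `add_arc` with a step that cuts rings at junctions and reuses an existing arc index (or its one's complement, `~i`, for the reversed direction) when two features share a boundary.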

There are several implementations of such tools available in Java or JS:

https://github.com/bouviervj/topojson-j
https://github.com/topojson/topojson

This issue was created for a customer project. Additional requirements and test data will be made available in an internal ticket.

thorsten-reitz avatar Nov 16 '21 16:11 thorsten-reitz

@stempler would it also be possible/easier to implement this support for the new/existing JSON IO component of hale studio?

thorsten-reitz avatar Aug 29 '22 09:08 thorsten-reitz

@thorsten-reitz An initial version of this was shipped with 5.0, I believe. Would you say that this can be closed?

florianesser avatar Jan 11 '23 11:01 florianesser

@florianesser I haven't really tested that. Will do so now.

thorsten-reitz avatar Jan 11 '23 11:01 thorsten-reitz

It doesn't seem to be working well. Here is my test case:

  • Load the Basic Hydrography Example
  • Export the transformed data
  • Inspect the data in an editor

What I see is that all the non-spatial string-type attributes are incorrectly encoded. Here is an example:

"identifier":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"

thorsten-reitz avatar Jan 11 '23 14:01 thorsten-reitz

@thorsten-reitz Yes, that's one of the limitations of the current implementation (see related commit message).

florianesser avatar Jan 11 '23 14:01 florianesser

@florianesser Ah, OK. That led to the dataset being hugely inflated (about 80% of the whole file consisted of such markers) and even unreadable in Notepad++. TBH, I would consider that a blocker for closing this ticket. Empty fields are quite common.
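
For anyone reproducing this, a rough way to quantify the inflation is to measure how much of the exported text consists of `\u0000` escapes. The sketch below is a diagnostic only and says nothing about the cause in the writer; the helper names `nul_escape_share` and `strip_nul_padding` are hypothetical.

```python
import json

# The six-character escape sequence exactly as it appears in the file text.
NUL_ESCAPE = "\\u0000"


def nul_escape_share(text):
    """Return the fraction of characters in `text` taken up by \\u0000 escapes."""
    if not text:
        return 0.0
    return text.count(NUL_ESCAPE) * len(NUL_ESCAPE) / len(text)


def strip_nul_padding(value):
    """Recursively remove NUL characters from all strings in decoded JSON data."""
    if isinstance(value, str):
        return value.replace("\x00", "")
    if isinstance(value, list):
        return [strip_nul_padding(item) for item in value]
    if isinstance(value, dict):
        return {key: strip_nul_padding(item) for key, item in value.items()}
    return value


# Usage with a tiny stand-in for the exported properties object.
sample = '{"identifier":"' + NUL_ESCAPE * 4 + '"}'
share = nul_escape_share(sample)
cleaned = strip_nul_padding(json.loads(sample))
```

Stripping the padding after the fact only works around the symptom; empty attributes would still need to be written as empty strings or omitted by the writer itself.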

thorsten-reitz avatar Jan 11 '23 15:01 thorsten-reitz

@thorsten-reitz We are grooming this issue. Do you have an example of non-spatial string attributes that are causing this issue?

Kate-Lyndegaard avatar Aug 31 '23 08:08 Kate-Lyndegaard