jts icon indicating copy to clipboard operation
jts copied to clipboard

TWKB read and write implementation

Open groldan opened this issue 5 years ago • 9 comments

I have picked up where James and Jody left for TWKBReader and TWKBWriter and finished it with full compatibility with the implementation available in PostGIS.

I've also added a couple optimizations that are present in the spec but that the PostGIS version seems not to follow, for which I would still like to write some more unit tests, and hence I'm submitting this PR as a draft for the time being, for consideration or any discussion needed.

Most test cases use pre-computed data from PostGIS. Check the generate_test_data.sql file, which generates .csv files with expected inputs and output for all combinations of geometry type, xy/z/m precision, and other encoding parameters.

I'll be working on adding more documentation and test cases for the optimizations and then move the PR out of the draft state.

Being my first contribution to JTS I'm looking forward to comments on any aspect of it.

Cheers, Gabriel.

groldan avatar Jul 19 '19 02:07 groldan

Fixes #246 BTW

groldan avatar Jul 19 '19 02:07 groldan

@aaime I just found there's a TWKB reader in gt-postgis, didn't know that. Made some tests, seems it has some issues though. You might be interested in this implementation that seems more robust.

The test case bellow parses a geometry with the JTS and the GeoTools TWKB readers, with this output:

input: POINT ZM(2147483647.1234567 2147483648.123457 1 2)
JTS TWKB reader: POINT ZM(2147483647.1234567 2147483648.123457 1 2)
GT  TWKB reader: POINT Z(213.8718216 0.1234568 1)
public @Test void checkJtsVsGtReader() throws Exception {
        long largeOrdinate = Integer.MAX_VALUE;
        String wkt = String.format("POINT ZM(%s.12345678 %s.12345678 1 2)", largeOrdinate,
                largeOrdinate + 1);
        WKTReader wktReader = new WKTReader();
        wktReader.setIsOldJtsCoordinateSyntaxAllowed(false);
        Geometry geom = wktReader.read(wkt);
        System.err.println("input: " + new WKTWriter(4).write(geom));

        twkbWriter.setXYPrecision(7).setZPrecision(7).setMPrecision(7);

        byte[] twkb = twkbWriter.write(geom);

        Geometry twkbRead = twkbReader.read(twkb);
        System.err.println("JTS TWKB reader: " + new WKTWriter(4).write(twkbRead));

        Geometry gtTwkbRead = geoToolsTWKBReader.read(twkb);
        System.err.println("GT  TWKB reader: " + new WKTWriter(4).write(gtTwkbRead));

        assertEquals(geom, twkbRead);
        assertEquals(geom, gtTwkbRead);
    }

The GT parser will also choke on TWKB byte strings that contain an "idlist", like one generated with the following SQL:

    SELECT ST_AsText(ST_GeomFromTWKB( ST_AsTWKB(ARRAY['POINT(0 0)'::Geometry, 'POINT(1 1)'::Geometry], '{1,2}'::bigint[], 0, 0, 0, false, false) )),
    ST_AsTWKB(ARRAY['POINT(0 0)'::Geometry, 'POINT(1 1)'::Geometry], '{1,2}'::bigint[], 0, 0, 0, false, false);
      st_astext      |      st_astwkb       
    ---------------------+----------------------
    MULTIPOINT(0 0,1 1) | \x040402020400000202

Ok, it won't com like that from the queries made by gt-postgis, but still. The following test case:

public @Test void testParseWithIdList() throws Exception {
        byte[] twkb = WKBReader.hexToBytes("040402020400000202");
        Geometry expected = new WKTReader().read("MULTIPOINT(0 0, 1 1)");
        Geometry parsedJTS = twkbReader.read(twkb);
        assertEquals(expected, parsedJTS);
        Geometry parsedGT = geoToolsTWKBReader.read(twkb);
        assertEquals(expected, parsedGT);
    }

Fails at the last line with expected:<MULTIPOINT ((0 0), (1 1))> but was:<MULTIPOINT ((1 2), (1 2))>

groldan avatar Jul 19 '19 05:07 groldan

Indeed not interested in either cases for the moment. We could switch regardless to the JTS version if it has the same or better performance characteristics (calculations, allocations), one less bit to maintain! :-D

aaime avatar Jul 19 '19 06:07 aaime

By the way, both @jodygarnett and @jnh5y were made aware of the GeoTools implementation, you did not talk to the JTS folks before starting this work?

aaime avatar Jul 19 '19 06:07 aaime

Indeed not interested in either cases for the moment.

Oh, the first one is a case of correctness, though it should be an easy fix. Basically where the spec says "varint" it means "varlong", and the gt reader is discarding the longer values, chopping ordinates at 32-bit ints. See how it parsed to 213.876..:

input: POINT ZM(2147483647.1234567 2147483648.123457 1 2)
JTS TWKB reader: POINT ZM(2147483647.1234567 2147483648.123457 1 2)
GT  TWKB reader: POINT Z(213.8718216 0.1234568 1)

We could switch regardless to the JTS version if it has the same or better performance characteristics (calculations, allocations), one less bit to maintain! :-D

Being the first iteration, been more focused on correctness, so we'll see. Didn't have time to compare parsing performance nor optimize yet.

groldan avatar Jul 19 '19 06:07 groldan

By the way, both @jodygarnett and @jnh5y were made aware of the GeoTools implementation, you did not talk to the JTS folks before starting this work?

I did not! started as a weekend/bored time playground and as I've seen the branch was incomplete I just kept going. In any case the GT one is only a parser, no encoder, and it'd be good to have both in JTS just like there exist for WKT and WKB.

groldan avatar Jul 19 '19 06:07 groldan

Nice to see a PR from you @groldan !

Will be a while until I can give this a thorough review. But one thing I think is that this should be moved down into a jts.io.twkb package. The base jts.io package is getting a bit crowded...

dr-jts avatar Jul 19 '19 20:07 dr-jts

Hi @dr-jts, thanks, long time no see.

one thing I think is that this should be moved down into a jts.io.twkb package. The base jts.io package is getting a bit crowded...

Agreed and done.

groldan avatar Jul 20 '19 00:07 groldan

Hi! Are there chance this PR being reviewed and merged? Having this feature in JTS would be nice :smiley:

murdos avatar Jan 03 '22 10:01 murdos

I think this can be closed? considering https://github.com/locationtech/jts/pull/854

mprins avatar Sep 26 '22 16:09 mprins