MapServer icon indicating copy to clipboard operation
MapServer copied to clipboard

Duplicated rings in WFS

Open pedrogamez opened this issue 4 years ago • 19 comments

Expected behavior and actual behavior.

I expect this GeoJSON file to be handled correctly on the WFS. This contains many complex MultiPolygons with holes. However, after processing on the server, the WFS returns a modified GeoJSON file with several errors.

The original file has been validated successfully with no errors or warnings. I've done a validation of the output file from mapserver with QGIS and it reports: "duplicated rings"

You can see several triangles (artifacts) by rendering the output file on geojson.tools:

Captura de pantalla de 2022-04-14 15-24-18

Steps to reproduce the problem.

Unzip the attachment:

error.zip

You will get 2 files (main.map and test.json). Then run:

mapserv -nh QUERY_STRING='map=main.map&service=wfs&request=GetFeature&version=1.1.0&TYPENAME=test&outputFormat=geojson&srsName=EPSG:4326'

Operating system

Tested on:

  • Debian 10.2 64 bit
  • Arch Linux x86_64

MapServer version and installation method

  • Instalation: compiled
/opt/mapserver/bin/mapserv -v
MapServer version 7.6.4 OUTPUT=PNG OUTPUT=JPEG SUPPORTS=PROJ SUPPORTS=AGG SUPPORTS=FREETYPE SUPPORTS=CAIRO SUPPORTS=ICONV SUPPORTS=WMS_SERVER SUPPORTS=WFS_SERVER SUPPORTS=WCS_SERVER SUPPORTS=FASTCGI SUPPORTS=GEOS SUPPORTS=POINT_Z_M INPUT=JPEG INPUT=POSTGIS INPUT=OGR INPUT=GDAL INPUT=SHAPEFILE
  • Installation: conda (channel conda-forge)
mapserv -v
MapServer version 7.6.4 OUTPUT=PNG OUTPUT=JPEG SUPPORTS=PROJ SUPPORTS=AGG SUPPORTS=FREETYPE SUPPORTS=CAIRO SUPPORTS=ICONV SUPPORTS=FRIBIDI SUPPORTS=WMS_SERVER SUPPORTS=WMS_CLIENT SUPPORTS=WFS_SERVER SUPPORTS=WFS_CLIENT SUPPORTS=WCS_SERVER SUPPORTS=SOS_SERVER SUPPORTS=GEOS SUPPORTS=POINT_Z_M INPUT=JPEG INPUT=POSTGIS INPUT=OGR INPUT=GDAL INPUT=SHAPEFILE

pedrogamez avatar Apr 14 '22 13:04 pedrogamez

@pedrogamez, do you see the same issue if you don't set an output format? That should return GML.

sdlime avatar Apr 20 '22 13:04 sdlime

Hi @sdlime, thanks,

Indeed, the GML output file file shows the same issue.

mapserv.zip

Moreover, even changing the input format does not seem to prevent the problem (I've tested with geopackage)

Greetings, Pedro

pedrogamez avatar Apr 20 '22 13:04 pedrogamez

How about an input format that takes OGR out of the equation - a shapefile?

sdlime avatar Apr 20 '22 14:04 sdlime

I've checked and same result unfortunately.

Input format: shapefile, output format: GML (default) The input files are here: shapefile.zip

pedrogamez avatar Apr 20 '22 15:04 pedrogamez

@pedrogamez, if I load your input JSON data (test.json) into https://geojson.tools/ I see the following error message:

15 feature(s) added successfully and 1 feature(s) failed to load.

If I load the output generated from your sample command-line call then I get:

14 feature(s) added successfully.

I do see the rendering artifacts from your screenshot. That said if I use a different tool (e.g. https://geojsonlint.com/) I don't see the rendering artifact and there are no complaints when parsing the output (or input) JSON files.

This does make me wonder if there is an issue with the input...

--Steve

sdlime avatar Apr 20 '22 15:04 sdlime

The last feature in the file is Geometry: null. That causes the message

15 feature(s) added successfully and 1 feature(s) failed to load.

If you remove the last feature you will get the artifacts and no error message in geojson.tools:

14 feature(s) added successfully.

Let me check geojsonlint.com ...

pedrogamez avatar Apr 20 '22 15:04 pedrogamez

If I load the input JSON and the output JSON files in QGIS directly as vector layers with no errors.

sdlime avatar Apr 20 '22 15:04 sdlime

If I load the input JSON and the output JSON files in QGIS directly as vector layers with no errors.

You have to execute "Geometry tools -> Check validity" to show the errors. This is the output of that command:

{'ERROR_COUNT': 3,
'ERROR_OUTPUT': 'Salida_err_nea_22d359ec_7e35_4d23_8f7d_a4adc27c3bea',
'INVALID_COUNT': 3,
'INVALID_OUTPUT': 'Salida_no_v_lida_bc1da633_3b85_4fb3_83e5_6ee19b80451f',
'VALID_COUNT': 11,
'VALID_OUTPUT': 'Salida_v_lida_336a4902_066a_430d_8cfe_919a4915cb6a'}

That said if I use a different tool (e.g. https://geojsonlint.com/) I don't see the rendering artifact and there are no complaints when parsing the output (or input) JSON files.

I think the problem is that you cannot see the colors in geojsonlint. In QGIS you will see the difference: two adjacent contours appears with the same color -- orange. This is an error because they must have different colors, and it's due to the presence of duplicated rings (according with QGIS).

Original file (correct)

orig

Processed file (wrong)

processed

pedrogamez avatar Apr 20 '22 15:04 pedrogamez

Ok, knowing how you were detecting the errors is helpful because visual inspection wasn't obvious.

sdlime avatar Apr 20 '22 16:04 sdlime

Drawing the output.json w/MapServer looks the same as QGIS so at least that's consistent. I'm trying to think about what single process would be at play with such diverse processing paths:

  • reading from shapefile, OGR/JSON and OGR/GeoPackage
  • writing to GML and OGR/JSON

I'll try against the 8.0 (main branch) and see what happens.

--Steve

sdlime avatar Apr 20 '22 16:04 sdlime

OpenJUMP does not report about duplicated rings but ring self-intersections at three locations:

POINT ( 5.448455 -75.233739 ) POINT ( 4.704127 -75.158865 ) POINT ( 4.884098 -75.098321 )

However, I do not totally trust that the self-intersection is the real or only error. For example in this image one part of the polygon test.10 overlaps feature test.11. I suppose that test.11 should have a hole at the same place so that there would not be an overlap.

image

jratike80 avatar Apr 20 '22 16:04 jratike80

With a query operation there shouldn't be much that acts on geometries in terms of transformations - makes me wonder if projection could be involved.

sdlime avatar Apr 20 '22 18:04 sdlime

Same output in 8.0...

sdlime avatar Apr 21 '22 14:04 sdlime

So I'm trying to come up with a more succinct test case. If you look at the test.json and extract OBJECTID=8 using ogr2ogr and just work off that then it's a little simpler. That feature has an island with a hole that seems to be causing problems (see screenshot). The OBJECTID=5 feature does have islands but no holes in the islands and works ok via WFS output.

Note that both OBJECTID=8 and OBJECTID=5 features are invalid IF you use the QGIS method in the Check Validity tool rather than GEOS.

image

sdlime avatar Apr 21 '22 16:04 sdlime

Here's a simpler test case that I think mirrors what's going on with OBJECTID=8 in the original JSON data. The second "outer" ring is seems to be the problem. When I constructed this test case (using https://geojsonlint.com/) any of the the rings after the first one needed to constructed in reverse order - even the second outer ring - or I received a parsing error. You can see where the duplicate is created - the innermost hole shows up as a hole for for the first outer ring and the second outer ring.

Input:

{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": { "OBJECTID": 1, "fill": "#0022ff" },
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [
                    [
                        [[50,50],[60,50],[60,60],[50,60],[50,50]],
                        [[51,51],[51,59],[59,59],[59,51],[51,51]],
                        [[52,52],[52,58],[58,58],[58,52],[52,52]],
                        [[53,53],[53,57],[57,57],[57,53],[53,53]]
                    ]
                ]
            }
        }
    ]
}

Output via WFS:

{
    "type": "FeatureCollection",
    "name": "simple",
    "features": [
        {
            "type": "Feature",
            "properties": { "OBJECTID": "1", "fill": "#0022ff" },
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [
                    [
                        [ [ 50.0, 50.0 ], [ 60.0, 50.0 ], [ 60.0, 60.0 ], [ 50.0, 60.0 ], [ 50.0, 50.0 ] ],
                        [ [ 51.0, 51.0 ], [ 51.0, 59.0 ], [ 59.0, 59.0 ], [ 59.0, 51.0 ], [ 51.0, 51.0 ] ],
                        [ [ 53.0, 53.0 ], [ 53.0, 57.0 ], [ 57.0, 57.0 ], [ 57.0, 53.0 ], [ 53.0, 53.0 ] ]
                    ],
                    [
                        [ [ 52.0, 52.0 ], [ 52.0, 58.0 ], [ 58.0, 58.0 ], [ 58.0, 52.0 ], [ 52.0, 52.0 ] ],
                        [ [ 53.0, 53.0 ], [ 53.0, 57.0 ], [ 57.0, 57.0 ], [ 57.0, 53.0 ], [ 53.0, 53.0 ] ]
                    ]
                ]
            }
        }
    ]
}

sdlime avatar Apr 21 '22 20:04 sdlime

I believe that the simplified test data is not good for debugging this issue.

The issue with "input" is that while OpenJUMP (internally JTS) validates the object 8 and the whole original GeoJSON, for the geometry in "input" it gives an error about nested holes. PostGIS gives similar error. OpenJUMP would like to model the "input" as a two-part multipolygon this way:

MULTIPOLYGON ((( 49 70, 49 62, 60 62, 60 70, 49 70 ), ( 50 69, 59 69, 59 63, 50 63, 50 69 )), (( 51 68, 51 64, 58 64, 58 68, 51 68 ), ( 52 67, 57 67, 57 65, 52 65, 52 67 )))

and the same as GeoJSON

{
"type": "FeatureCollection",
"features": [
{ "type": "Feature", "properties": null, "geometry": {"type":"MultiPolygon","coordinates":[[[[49,70],[49,62],[60,62],[60,70],[49,70]],[[50,69],[59,69],[59,63],[50,63],[50,69]]],[[[51,68],[51,64],[58,64],[58,68],[51,68]],[[52,67],[57,67],[57,65],[52,65],[52,67]]]]} }
]

}

There seems to be a bug in geojsonlint because is really accepts the "input" with nested holes. It does correctly error your original plan to write two outer rings into a single polygon part of a multipolygon.

jratike80 avatar Apr 21 '22 20:04 jratike80

@jratike80, you're right. Here's a fixed test case. The same output though...

Input:

{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {
                "OBJECTID": 1,
                "fill": "#0022ff"
            },
            "geometry":{
                "type": "MultiPolygon",
                "coordinates": [
                    [
                        [[50,50],[60,50],[60,60],[50,60],[50,50]],
                        [[51,51],[51,59],[59,59],[59,51],[51,51]]
                    ],[
                        [[52,52],[52,58],[58,58],[58,52],[52,52]],
                        [[53,53],[53,57],[57,57],[57,53],[53,53]]
                    ]
                ]
            }
        }
    ]
}

Output:

{
    "type": "FeatureCollection",
    "name": "simple",
    "features": [
        {
            "type": "Feature",
            "properties": { "OBJECTID": "1", "fill": "#0022ff" },
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [
                    [
                        [ [ 50.0, 50.0 ], [ 60.0, 50.0 ], [ 60.0, 60.0 ], [ 50.0, 60.0 ], [ 50.0, 50.0 ] ],
                        [ [ 51.0, 51.0 ], [ 51.0, 59.0 ], [ 59.0, 59.0 ], [ 59.0, 51.0 ], [ 51.0, 51.0 ] ],
                        [ [ 53.0, 53.0 ], [ 53.0, 57.0 ], [ 57.0, 57.0 ], [ 57.0, 53.0 ], [ 53.0, 53.0 ] ]
                    ],
                    [
                        [ [ 52.0, 52.0 ], [ 52.0, 58.0 ], [ 58.0, 58.0 ], [ 58.0, 52.0 ], [ 52.0, 52.0 ] ],
                        [ [ 53.0, 53.0 ], [ 53.0, 57.0 ], [ 57.0, 57.0 ], [ 57.0, 53.0 ], [ 53.0, 53.0 ] ]
                    ]
                ]
            }
        }
    ]
}

sdlime avatar Apr 22 '22 13:04 sdlime

I can verify the behavior outside of WFS, so just a mode=nquery produces the same output:

mapserv -nh -conf test.conf QUERY_STRING='map=test.map&mode=nquery&qformat=geojson' > output_nquery.json

That helps narrow down things down a bit, probably to something called by this function... https://github.com/MapServer/MapServer/blob/8af82513f8be8a380acbb5d92f9c5dec360ee5f1/mapquery.c#L1015

sdlime avatar Apr 22 '22 16:04 sdlime

I've identified the source of the issue - there's definitely a problem with msGetInnerList() where an island in an island can be duplicated. Basically the island gets associated with multiple outer rings. It may not actually be a problem with output from that function (it's relative to one ring) but rather subsequent processing of the list. However, it would be best to address in that function so no other downstream changes would be necessary.

I think we'd need to look at each outer ring that contains the inner ring and count the number of other outer rings that contain it and only associate the inner ring with the one nested the deepest. That vast majority of the time there will only be outer ring but in rare circumstances that is not true.

sdlime avatar Apr 01 '24 16:04 sdlime