geojson-rbush icon indicating copy to clipboard operation
geojson-rbush copied to clipboard

Explode MultiPolygons when loading

Open mojodna opened this issue 6 years ago • 4 comments

I'm using martinez to intersect features w/ candidates from rbush in order to do strict intersections:

COUNTRY_INDEX.search(feature)
    .features.filter(x => {
      try {
        const i = martinez.intersection(
          x.geometry.coordinates,
          feature.geometry.coordinates
        );

        return i != null && i.length > 0;
      } catch (err) {
        // buffer to convert to a Polygon
        const i = martinez.intersection(
          x.geometry.coordinates,
          buffer(feature, 0.000001, { units: "degrees" }).geometry.coordinates
        );

        return i != null && i.length > 0;
      }
    })
    .map(x => x.properties.ADM0_A3);

COUNTRY_INDEX is Natural Earth admin 0 boundaries, many of which are MultiPolygons that include territories. As a result, the bboxes produced for the USA, Canada, and Russia cover much of the world. By splitting the MultiPolygons into multiple Polygon features, I get a substantial speed-up because fewer indexed features match:

const { featureCollection, polygon } = require("@turf/helpers");
const geojsonRbush = require("geojson-rbush");

const countries = require("./countries.json");

// explode multipolygons for smaller bounding boxes within the rtree
const polys = countries.features.filter(f => f.geometry.type === "Polygon");
const mps = countries.features.filter(f => f.geometry.type === "MultiPolygon");

const features = mps.reduce(
  (acc, mp) =>
    acc.concat(
      mp.geometry.coordinates.map(rings => polygon(rings, mp.properties))
    ),
  polys
);

const COUNTRY_INDEX = geojsonRbush();

COUNTRY_INDEX.load(featureCollection(features));

Doing this automatically within geojson-rbush seems like it would provide a transparent speed-up for those using MultiPolygons.

mojodna avatar Feb 08 '18 21:02 mojodna

👍 Agreed, we can include flattenEach from @turf/meta when using the search method.

By splitting the MultiPolygon

This shouldn't be applied to the initial loading of the FeatureCollection since MultiPolygon is treated as a single feature and not multiple features.

@mojodna If you feel like doing a PR, I can definitely help review it 👍

DenisCarriere avatar Feb 08 '18 22:02 DenisCarriere

we can include flattenEach from @turf/meta when using the search method.

Can you elaborate a bit on this? It sounds like I can simplify the second block of code ☝️ by using that in the absence of a patch to search.

This shouldn't be applied to the initial loading of the FeatureCollection since MultiPolygon is treated as a single feature and not multiple features.

Yup, that makes sense. I was realizing after I opened this issue that a side-effect is that the features that come out of the index aren't necessarily the ones expected.

I will try to find time to do a PR (🙏 for the review offer), but I'm not sure I'll have time before disappearing again.

mojodna avatar Feb 08 '18 23:02 mojodna

Sounds like a plan! 👍

DenisCarriere avatar Feb 09 '18 15:02 DenisCarriere

I was thinking about this a bit more; I think the bulk of the performance improvement I'm seeing is actually related to intersecting against Polygons rather than MultiPolygons w/ martinez. In other words, reducing the number of candidates is less helpful than considerations outside of this module's scope.

mojodna avatar Feb 09 '18 17:02 mojodna