mapbox-java icon indicating copy to clipboard operation
mapbox-java copied to clipboard

Low level GeoJson implementation

Open tobrun opened this issue 7 years ago • 13 comments

This is a request for either a rewrite or alternative implementation of the current GeoJson module found here. We need a high performance implementation that has a minimal memory footprint. The reasoning behind that is that geojson models should be scalable (eg. downstream we query the map for 1000 of features, the path currently taken to create those and the intermediate objects created for doing that isn't optimal atm).

Couple of thoughts to improve this:

  • use primitive instead of objects where possible
  • no need for AutoValue, implementation is based on a specification that doesn't change (often)
  • if geojson models are immutable, use arrays instead of List
  • limit method invocations & don't allocate memory if you can avoid it
  • don't over abstract concepts if they can be expressed more simple

More information on performance tips here RFC of GeoJSON can be found here.

cc @mapbox/android

tobrun avatar Feb 02 '18 13:02 tobrun

Thank you @tobrun for framing the issue and proposing some solutions. By looking at your list, looks like at list the following items would change the public API of the library:

use primitive instead of objects where possible no need for AutoValue, implementation is based on a specification that doesn't change (often) if geojson models are immutable, use arrays instead of List

Am I reading it right? If so, we need to 1) either wait until the next MAS SEMVER release (4.x) or 2) develop this new library as a new product with a different namespace.

zugaldia avatar Feb 07 '18 16:02 zugaldia

Notes in OP were mostly about the internal management so probably we can keep the same public api though this is something that needs to be determined when scoping out. As first steps I would recommend writing up a test suite to tracks memory allocation / cpu usage for 1 feature up to a million features and compare that with the 2.2.9 version. After, we can scoop out what the impact would be to bring the notes from OP into play and use that same test suite as reference during implementation.

tobrun avatar Feb 08 '18 11:02 tobrun

I did a bit of testing on how much memory Point takes in 2.x.x based code vs 3.0 (with AutoValue).

Number of points size in 2.x.x mapbox-java size in 3.0 mapbox.java
1 48 76
100 4800 7600
10000 48000 76000

osana avatar May 15 '18 15:05 osana

Also tried comparing loading of sample json with FeatureCollection:

{ "type": "FeatureCollection", "features": [ { "type": "Feature", "geometry": { "type": "Point", "coordinates": [ 102, 0.5 ] }, "properties": { "prop0": "value0" } }, { "type": "Feature", "geometry": { "type": "LineString", "coordinates": [ [ 102, 0 ], [ 103, 1 ], [ 104, 0 ], [ 105, 1 ] ] }, "properties": { "prop0": "value0", "prop1": 0 } }, { "type": "Feature", "geometry": { "type": "Polygon", "coordinates": [ [ [ 100, 0 ], [ 101, 0 ], [ 101, 1 ], [ 100, 1 ], [ 100, 0 ] ] ] }, "properties": { "prop0": "value0", "prop1": { "this": "that" } } } ] }

Here is Heap dump for 2.0 and 3.0 based mapbox-java

samplejson - 2 x mas samplejson - 3 0 mas

osana avatar May 15 '18 16:05 osana

@osana Thanks for kicking off this work - what's your take on the results in https://github.com/mapbox/mapbox-java/issues/710#issuecomment-389204544 and https://github.com/mapbox/mapbox-java/issues/710#issuecomment-389222070 and what are the next steps?

zugaldia avatar May 23 '18 18:05 zugaldia

Is this still in progress?

lilykaiser avatar Jul 11 '18 19:07 lilykaiser

@zugaldia @tobrun I think the next steps could be as @tobrun points out

  • remove AutoValue from geoJson as there is no real need
  • If we want to go further, we can modify api for coordinates - that would be a semver-major change though. At the moment we have List<Double> for Point and List<Point>, List<List<Point>> for other Geometries. We can go back to double[] or something else. Ideally the solution should be immutable, which is not the case with either double[] or List<Double>

osana avatar Dec 14 '18 14:12 osana

Thank you @osana. Let's start by having an implementation that doesn't rely on AutoValue to avoid the overhead. This could be based of the MAS v2.x code to speed things up. If SEMVER is a concern, let's build the prototype under a new namespace (e.g. com.mapbox.geojson2) in the same module and package. This comes with the advantage of being able to measure performance of both implementations side by side.

zugaldia avatar Dec 17 '18 20:12 zugaldia

@zugaldia good idea to look at v2.x code and I agree that having two packages next to each other is a good way to compare.

osana avatar Dec 17 '18 20:12 osana

Capturing from @mordag in https://github.com/mapbox/mapbox-gl-native/issues/14555 to look into replacing GSON with Jackson as parser which makes a lot of sense as our data model doesn't change (the geojson RFC makes sure of that).

tobrun avatar May 02 '19 07:05 tobrun

Android JSON Parsers Comparison: https://medium.com/@IlyaEremin/android-json-parsers-comparison-2017-8b5221721e31

Looks like Jackson would perform best for our needs.

osana avatar May 03 '19 18:05 osana

This issue has been automatically detected as stale because it has not had recent activity and will be archived. Thank you for your contributions.

stale[bot] avatar Oct 30 '19 19:10 stale[bot]

This issue has been automatically detected as stale because it has not had recent activity and will be archived. Thank you for your contributions.

stale[bot] avatar Oct 30 '19 22:10 stale[bot]