StreetComplete icon indicating copy to clipboard operation
StreetComplete copied to clipboard

Spatial cache for OsmQuestController, OsmNoteQuestController and MapDataController

Open Helium314 opened this issue 2 years ago • 64 comments

A first version for the OsmQuestController cache discussed in #4079. I left the comments to clarify my motivations.

Main benefits

  • much faster get for quests in view (called on opening and solving quests)
  • much faster getAllVisibleInBBox when tiles are cached, otherwise marginally slower This translates to a ~30% improvement of QuestPinsManager.onNewTilesRect, as creating and updating pins is rather slow.

What is missing:

  • unit tests
  • decide cache size
  • clear / trim cache on low memory
  • maybe bugs?

Helium314 avatar Jun 16 '22 10:06 Helium314

Reading vertically through the code:

  • given that there should also be a cache for the MapData, the Cache class should be generic (just like Android's LruCache is). At least in regard what is in the tiles that are grouped by TilePos
  • it is a very bad idea to let the cache class extend LinkedHashMap. This exposes all the interface of the LinkedHashMap to the outside and that class is not threadsafe. If you are not well versed in writing threadsafe code, better not stride far from the reference implementation (LruCache)
  • there is no place for deferring updates to the cache to the background, there is no place for coroutines in the cache class (i.e. it does not belong there). It does not make sense at all, actually. The cache must always be up-to-date, only the updates to the database may be deferred (because the cache is always up to date). It is however the job of the *Controller class to coordinate this, not the cache
  • the cache class is doing far too much. It is not the job of the cache to translate between bbox and tilepos forth and back. It should really be just a get/put store with e.g. a create-lambda passed in the constructor

westnordost avatar Jun 16 '22 13:06 westnordost

So:

  • move the map to an private val
  • move most code to the controller
  • with those changes I don't see much use for a create-lambda, as this would have to query the db for each tile separately (no bbox - tilepos translation!), and that was the first thing I tried and found too slow (not surprising)

Helium314 avatar Jun 16 '22 20:06 Helium314

with those changes I don't see much use for a create-lambda, as this would have to query the db for each tile separately (no bbox - tilepos translation!

Hmm right. I guess there are two options:

  • cache has TilesRect (not TilePos) on interface
  • leave BoundingBox on interface

For the former case, the code in the controller would look like this:

fun get(bbox) {
  val tilesRect = bbox.asTileRect(zoom=16)
  val data = cache.get(tilesRect)
  return data.filter { it in bbox }
}

This means that the cache first returns all the data from all the tiles that intersect the bbox, the controller filters it out afterwards. I reckon this may be a little unperformant, especially if the bboxes queried do not align with the tiles (thinking about map data cache here).

For the latter, the code above would basically be in the cache. Though, it can already not add the element within each cached tile to the result set and have some additional logic that would not even check for each element if it is inside the bbox or not for tiles that are completely contained within the bbox. Hence, even faster.

So overall, it looks like your original solution to which to use on the interface (bbox or tiles) is more performant. Plus, thinking about that there is the same cache class for all the controllers that manage geospatial data (quests, map data, notes), it is certainly less code because the above logic would not have to be repeated in each controller.

westnordost avatar Jun 16 '22 21:06 westnordost

fun get(bbox) {
  val tilesRect = bbox.asTileRect(zoom=16)
  val data = cache.get(tilesRect)
  return data.filter { it in bbox }
}

This means that the cache first returns all the data from all the tiles that intersect the bbox, the controller filters it out afterwards. I reckon this may be a little unperformant, especially if the bboxes queried do not align with the tiles (thinking about map data cache here).

For the latter, the code above would basically be in the cache. Though, it can already not add the element within each cached tile to the result set and have some additional logic that would not even check for each element if it is inside the bbox or not for tiles that are completely contained within the bbox. Hence, even faster.

The cache could return a Map<TilePos, Data> instead of a Set<Data>, which would allow the get(bbox) to avoid filter { it in bbox } for fully contained tiles. Or maybe a lambda for combining data and another one for filtering by bbox could be given to the cache... but then again the same functions could also be called in get(bbox), which would give a bit more flexibility (e.g. combining the Map<OsmQuestKey, Quest> actually doesn't need to create another map, a flattened list of values is sufficient).

A problem I see with a generic cache with lambdas is that the cache for OsmQuestController needs a lambda (bbox, questTypeNames) -> T for decent performance, but a generic cache would only have (bbox) -> T.

So overall, it looks like your original solution to which to use on the interface (bbox or tiles) is more performant. Plus, thinking about that there is the same cache class for all the controllers that manage geospatial data (quests, map data, notes), it is certainly less code because the above logic would not have to be repeated in each controller.

Actually I'm not sure how much really can be done here in a generic way, the logic for combining and filtering data will still need to be in the controller, either in get(bbox), or in the lambda.

Helium314 avatar Jun 17 '22 06:06 Helium314

A problem I see with a generic cache with lambdas is that the cache for OsmQuestController needs a lambda (bbox, questTypeNames) -> T for decent performance, but a generic cache would only have (bbox) -> T.

IMO that's preliminary optimization. The controller can just cache.get(...).filter { it.name in questTypeNames }. Later, one can think about subclassing (or rather: wrapping) the generic cache if this really makes a big difference. (Or have the spatial cache in VisibleQuestsSource instead and let the OsmQuestController just have a cache for the get(id) query.

Actually I'm not sure how much really can be done here in a generic way, the logic for combining and filtering data will still need to be in the controller, either in get(bbox), or in the lambda.

I don't understand, why would the combining (of the data in possibly several tiles) and filtering (the elements whether they are contained in the bbox) need to be done in the controller for the latter case, i.e. the SpatialCache.get(bbox)?

westnordost avatar Jun 17 '22 10:06 westnordost

A problem I see with a generic cache with lambdas is that the cache for OsmQuestController needs a lambda (bbox, questTypeNames) -> T for decent performance, but a generic cache would only have (bbox) -> T.

IMO that's preliminary optimization. The controller can just cache.get(...).filter { it.name in questTypeNames }. Later, one can think about subclassing (or rather: wrapping) the generic cache if this really makes a big difference. (Or have the spatial cache in VisibleQuestsSource instead and let the OsmQuestController just have a cache for the get(id) query.

I actually went from caching all quests to caching only visible quests because it makes a considerable difference when fetching from DB. So I guess the cache needs to move to VisibleQuestsSource then.

A get(id) cache could then be a simple LRU cache (containing a fixed maximum number of quests) that could also use the quests returned by getAllVisibleInBBox.

Helium314 avatar Jun 17 '22 21:06 Helium314

I actually went from caching all quests to caching only visible quests because it makes a considerable difference when fetching from DB.

I don't understand why though. I guess it must be about the extra quests returned that are not visible. Are you sure that you create a HashSet for the collection questTypes passed into getAllVisibleInBBox? If it is just a list, it is no wonder why it is slow (100+ iterations per retrieved quest).

Also, if this is still slow, there there is another solution than moving the cache to VisibleQuestsSource:

  1. memorize with what questTypes getAllVisibleInBBox has been called last
  2. if the questTypes on the next call differ, clear the cache

westnordost avatar Jun 17 '22 22:06 westnordost

I guess it must be about the extra quests returned that are not visible.

That's what I concluded. The slow part is getAllVisibleInBBoxFromDB, which is exactly the same as getAllVisibleInBBox without this PR. When I tried the cache with getting all quests, I simply passed null instead of a quest type list/set.

Also, if this is still slow, there there is another solution than moving the cache to VisibleQuestsSource:

1. memorize with what `questTypes` `getAllVisibleInBBox` has been called last
2. if the `questTypes` on the next call differ, clear the cache

Right, I didn't think of simply moving this from cache to the controller.

Helium314 avatar Jun 18 '22 13:06 Helium314

Some semi-related comments: I can only use the position in bbox you suggested if I change https://github.com/streetcomplete/StreetComplete/blob/c20190bc8df62d299f7122501be89e14e5db0e38/app/src/main/java/de/westnordost/streetcomplete/util/math/SphericalEarthMath.kt#L468 to

operator fun BoundingBox.contains(pos: LatLon): Boolean {

OsmQuest.key is used frequently. Does this create a new OsmQuestKey every time? https://github.com/streetcomplete/StreetComplete/blob/c20190bc8df62d299f7122501be89e14e5db0e38/app/src/main/java/de/westnordost/streetcomplete/data/osm/osmquests/OsmQuest.kt#L20 And if yes, would it make sense to reduce calls e.g. by making OsmQuest.key by lazy?

I general I find measuring performance easy enough, but for things that may or may not waste memory I have

  • little background knowledge
  • no estimation / feeling for how much memory something uses
  • no way of measuring e.g. memory used by a map and by its contents

Is there some way Android Studio can help on the last point?

Helium314 avatar Jun 21 '22 05:06 Helium314

I did some comparison between the current cache(s) for MapDataController and my old attempt of the one cache (LinkedHashMap<TilePos, Map<ElementKey, Pair<Element, ElementGeometry>>>), focusing on getMutableMapDataWithGeometry. There is not much difference in terms of performance, except that the one cache is roughly twice as fast when retrieving entire tiles (maybe because it already contains "everything" for that tile).

There is some difference in memory usage, judging by java memory part of the Android Studio memory profiler. Loading 8 tiles into cache increases java from ~26 to 32 for one cache, while for the cache in this PR it goes up to 44. That seems like a pretty large difference, did I do something completely wrong? Either in "measuring" or with the caches.

Helium314 avatar Jun 21 '22 10:06 Helium314

OsmQuest.key is used frequently. Does this create a new OsmQuestKey every time?

yes

And if yes, would it make sense to reduce calls e.g. by making OsmQuest.key by lazy?

only if it is the same OsmQuest on which key is called frequently. Currently, quests are being fetched anew from the database all the time. Also, if key is kind of guaranteed to be called at least once, it could also be just instead of an indirection added via lazy.

override val key = OsmQuestKey(elementType, elementId, questTypeName)

Is there some way Android Studio can help on the last point?

Yes, it's quite powerful, too: https://developer.android.com/studio/profile/memory-profiler To measure the size of the cache at one point, you could implement a button or something that will trigger all the caches being cleared (i.e. emulate onLowMemory system call) and see how the memory use changes after that.

westnordost avatar Jun 21 '22 12:06 westnordost

only if it is the same OsmQuest on which key is called frequently. Currently, quests are being fetched anew from the database all the time. Also, if key is kind of guaranteed to be called at least once, it could also be just instead of an indirection added via lazy.

Currently the questCache holds all quests by key, and the spatialCache holds all quest positions by key. But if the quest is fetched directly from DB (get(questKey) when not cached), or created outside the cached area or for hidden quest types (on upload), the key is not needed. So I think lazy would be most suitable.

Yes, it's quite powerful, too: https://developer.android.com/studio/profile/memory-profiler To measure the size of the cache at one point, you could implement a button or something that will trigger all the caches being cleared (i.e. emulate onLowMemory system call) and see how the memory use changes after that.

Ok, thanks! So I can conclude the size of the cache is just the difference before and after clearing caches?

Helium314 avatar Jun 21 '22 12:06 Helium314

So I can conclude the size of the cache is just the difference before and after clearing caches?

Probably. I am not sure how quickly the garbage collector will collect it. IIRC there is a call you can make to force garbage collecting now (IMO useful only for debugging).

westnordost avatar Jun 21 '22 14:06 westnordost

I've now reviewed the SpatialClass class only so far. Looks quite good already, I only found one conceptual issue. I will review the rest later.

westnordost avatar Jun 21 '22 14:06 westnordost

So I can conclude the size of the cache is just the difference before and after clearing caches?

Probably. I am not sure how quickly the garbage collector will collect it. IIRC there is a call you can make to force garbage collecting now (IMO useful only for debugging).

Looks like GC triggers ~10 sec after putting the app to background. At least that's the point where java usage may drop considerably (and does not increase once the app is in foreground again). Memory comparison, usage of java after putting app to background, with overlay on and 4 tiles loaded No cache: 16.7 SpatialCache + others for MapDataController, before clearing: 33.7 SpatialCache + others for MapDataController, after clearing: 18.1 -> 15.6 MB for cache one cache for MapDataController, before clearing: 26.3 one cache for MapDataController, after clearing: 16.8 -> 9.5 MB for cache

I actually hope there is some problem with the caches in MapDataController, looks like my current implementation wastes a lot of memory...

[Edit: I'm blind and didn't see the GC button in profiler. Anyway, results are the same]

Helium314 avatar Jun 22 '22 11:06 Helium314

TBH I find 33MB (and a 7MB difference) of in-memory cache not really problematic. But I'll look at the rest of the code at some later time. It's a matter of course that using more hashmaps will result in higher memory consumption due to all the additional references. Though, 1000 additional references should just be roughly around 8 byte * 1000 = 8KB

westnordost avatar Jun 22 '22 14:06 westnordost

It was still in a todo, but done now

Helium314 avatar Jun 25 '22 13:06 Helium314

please do not change anything right now, I am experimenting something and will post this as a PR on this PR

westnordost avatar Jun 27 '22 13:06 westnordost

So this is now merged, I'll have a look how to make the changed spatial cache work with MapDataController, but I think it will not be possible without further changing or extending SpatialCache.

Helium314 avatar Jun 29 '22 12:06 Helium314

So I think I can offer:

  • a comeback of onKeysRemoved
  • a spatial cache containing all data, not only nodes
  • completely unrelated caches, so the non-spatial caches would simply be some access-ordered LinkedList<ElementKey, Element(Geometry)> (or similar for waysByNodeId) which get trimmed to a maximum size

Helium314 avatar Jun 29 '22 21:06 Helium314

I really don't like onKeysRemoved, and having unrelated caches with a maximum size might cause trouble, e.g. requiring db access while persisting data asynchronously. So my currrent preference would be the cache containing everything (or maybe something else I didn't think about).

Helium314 avatar Jun 30 '22 04:06 Helium314

Hmm it will be difficult to properly extend that class though, and copying everything would be quite dirty. I'm all for encapsulating all the cache logic into one class if possible, but composition beats inheritance.

i thought i posted this already but a solution could be to on each trim iterate through waysbynodeid and remove all ways where at least one node is missing from the spatial cache. Since this is all lookups in HashMaps, this should run in O(n). (n=Number of nodes in cache. n lookups in hashmaps)

Not sure how that would affect the runtime if on ~every bbox access, the cache is trimmed. But maybe we are overdesigning the trim cache thing. In the real application use, trimming doesn't need to happen that often as long as the cache doesn't always operate at full capacity. E.g. whenever it is trimmed, it could cut itself down to 1/2 or 2/3 or whatever so its some time until it fills up again. OR the bbox cache will trim itself immediately because it's cheap but the rest will do it only every so often.

westnordost avatar Jun 30 '22 08:06 westnordost

This is now done, except for deciding when to trim non-spatial caches. trimNonSpatialCaches is mostly faster than getting all data in a tile, but clearly slower than getting data in a "normal" size, i.e. for creating quests or showing highlighted elements.

  • get data for element highlighting (default 30m radius): 5-20 ms
  • get data in whole tile: 120-180 ms for one tile, 220-300 ms for 4 tiles
  • trimNonSpatialCaches: 70-90 ms for one tile in cache, 500-700 ms for 12 tiles

[edit: the times are on actual device, where all these calls are done asynchonously. Thus they may appear slower than they are, and fluctuate quite a bit]

trimNonSpatialCaches can definitely be optimized, but probably not to the point where it can be called on every bbox access.

Btw with the changes the size of the cache is now only 10% more than the old one cache, probably because we don't have the nodes in 3 HashMaps (elements, geometries, byKey), but only in byKey. Speed is about the same, though some 30% slower than one cache when getting data by bbox.

Helium314 avatar Jul 01 '22 15:07 Helium314

Since I kept coming back to the one cache for comparing memory use and performance, I cleaned it up a bit and put it to https://github.com/Helium314/StreetComplete/commit/8b122cbb112c5644ee10cb2cd571cdfcf9c17cef

Helium314 avatar Jul 01 '22 21:07 Helium314

On where to put the quest cache: I did some more tests comparing different places of caching quests. Results are approximate, as usual, and sometimes a bit unexpected, probably caused by some random fluctuations. For getting BBoxes the measured times are for getting quests and updating pins, i.e. this is the delay the user sees.

what current cache same with lazy key and pin positions caching all quests same with hidden quests cache VisibleQuestsSource cache
getting small bbox (not cached) 350 ms 340 ms 450 ms 390 ms 290 ms (why?)
getting large bbox (not cached) 1100 ms 1080 ms 1880 ms 1790 ms 1080 ms
getting small bbox (cached) 140 ms 140 ms 200 ms 190 ms 140 ms
getting large bbox (cached) 570 ms 530 ms 700 ms 630 ms 450 ms
getting single quest (cached) 0 ms 0 ms 8-25 ms 5-15 ms 0 ms

The hidden quests cache is essentially a HashMap<OsmQuestKey, Long> in OsmQuestController containing the entire hidden quests table, to allow faster quest filtering.

The cache in VisibleQuestsSource is the smallest, the most simple to implement and the fastest. Plus there is no need for caches in NoteController, OsmQuestController, and ideally a hidden notes cache in OsmNotesQuestController

The time for persisting 400 new quests is just 0.2s, so caching all quests in OsmQuestController just to be able to persist asynchronously is imo not worth it.

Helium314 avatar Jul 02 '22 09:07 Helium314

Should I review again or is there still something work in progress? There is no rush from my side because I do not plan to include this into v45. The potential for difficult-to-locate bugs is just too high for things like that (in-memory caching + keeping it in sync with database), so I want to make sure this is as thoroughly unit tested and reviewed as possible before this goes live.

westnordost avatar Jul 03 '22 21:07 westnordost

What I plan to definitely change (if you agree) is moving the quest cache to VisibleQuestsSource.

With MapDataController I'll go through my comments/todos once more, and maybe do some small changes. What I consider doing:

  • cache Pair<Element, ElementGeometry?> instead of 2 separate HashMaps for elements and geometry entries, mostly for reducing memory use
  • add data to non-spatial caches whenever there is a miss on the non-spatial caches. This could be quite useful when accessing data that is not in view, e.g. in MapDataWithEditsSource.rebuildLocalChanges or when loading geometries of recent edits on opening edit history (not really compatible with the change above, and usefulness heavily depends on how often the cache is trimmed)
  • think about when to trim the non-spatial caches, and try to improve trim performance
  • some renaming

Btw I noticed MapDataWithEditsSource.applyEdit always creates a new geometry, even if the edit action is a UpdateElementTagsAction. Is this really necessary, or could it be replaced by getGeometry in that case? Asking because this and getting the element are the main parts making rebuildLocalChanges rather slow.

Helium314 avatar Jul 03 '22 22:07 Helium314

What I plan to definitely change (if you agree) is moving the quest cache to VisibleQuestsSource.

sure

cache Pair<Element, ElementGeometry?> instead of 2 separate HashMaps for elements and geometry entries, mostly for reducing memory use

preliminary/unnecessary optimization in my opinion. IMO it is better to keep close to how the data is organized in the DB.

Btw I noticed MapDataWithEditsSource.applyEdit always creates a new geometry, even if the edit action is a UpdateElementTagsAction. Is this really necessary, or could it be replaced by getGeometry in that case?

It could be replaced. But that is an optimization and for every optimization it must be asked whether it is worth the additional complexity. TBH I would think that createGeometry is not slow per se, but because it needs to acquire the necessary nodes from the database (mapDataController.getNodes) to be able to construct the geometry. Would that not be solved by the cache?

westnordost avatar Jul 03 '22 22:07 westnordost

One main think I forgot: Asynchronous database updates! Is there any need to defer database accesses until writing is done?

Other things are done now: switching to cache ElementGeometry, adding data on non-spatial cache miss, updating names and comments.

Would it be reasonable to use ConcurrentHashMaps instead of "normal" HashMaps and synchronized?

It could be replaced. But that is an optimization and for every optimization it must be asked whether it is worth the additional complexity. TBH I would think that createGeometry is not slow per se, but because it needs to acquire the necessary nodes from the database (mapDataController.getNodes) to be able to construct the geometry. Would that not be solved by the cache?

Right, though nodes outside the spatial cache tiles will not be cached, which may again make it slower. I'll re-check after the PR is done.

Helium314 avatar Jul 04 '22 08:07 Helium314

So what is left:

  • decide when to trim non-spatial caches
  • update db asynchronously in MapDataController
  • fix failing tests
  • add tests for SpatialCache

But those changes should be small, so I think they can be done after review.

Helium314 avatar Jul 04 '22 09:07 Helium314