Improve caching

Open ymartin59 opened this issue 6 years ago • 5 comments

According to @Diaoul, there is room for cache improvements. The aim is to cache the same resource based on the relevant request parameters only. Where some parameters have to be ignored, we should explicitly set our own cache entry key. References:

  • https://damyanon.net/post/flask-series-optimizations/
  • https://pythonhosted.org/Flask-Cache/

ymartin59 avatar May 08 '18 20:05 ymartin59

@hgy59 Thanks again for your work updating spkrepo. You look comfortable with Flask (Python is not my cup of tea...). It would be really helpful if you could at least prototype, on a single example, what should be done here to customize the default caching strategy.

ymartin59 avatar May 08 '18 20:05 ymartin59

@ymartin59 I am away on holiday for another week, so this will need some time. I am not familiar with Python or Flask either (I only took a Pluralsight course as an introduction to Flask).

hgy59 avatar May 08 '18 21:05 hgy59

Here are examples of requests for which caching is inefficient at the moment:

GET /?package_update_channel=stable&unique=synology_evansport_214play&build=24922&language=enu&major=6&micro=2&arch=evansport&minor=2&timezone=Amsterdam&nano=0 => generated 56269 bytes in 19 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)
GET /?package_update_channel=stable&unique=synology_cedarview_1512%2B&build=23824&language=enu&major=6&micro=1&arch=cedarview&minor=2&timezone=Amsterdam&nano=4 => generated 59672 bytes in 22 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)
GET /?package_update_channel=stable&unique=synology_braswell_416play&build=24922&language=enu&major=6&micro=2&arch=braswell&minor=2&timezone=Amsterdam&nano=0 => generated 58537 bytes in 22 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)
GET /?package_update_channel=stable&unique=synology_armadaxp_ds414&build=23739&language=enu&major=6&micro=6.2&arch=armadaxp&minor=2&timezone=Brussels&nano=0 => generated 56279 bytes in 19 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)
GET /?package_update_channel=stable&unique=synology_armada370_213j&build=23824&language=enu&major=6&micro=1&arch=armada370&minor=2&timezone=Amsterdam&nano=6 => generated 56269 bytes in 25 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)
GET /?package_update_channel=stable&unique=synology_88f6282_211%2B&build=24922&language=enu&major=6&micro=2&arch=88f6282&minor=2&timezone=Brussels&nano=0 => generated 56006 bytes in 18 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)
GET /?package_update_channel=beta&unique=synology_bromolow_3615xs&build=5644&language=chs&major=5&arch=bromolow&minor=2&timezone=Beijing => generated 95643 bytes in 3407 msecs (HTTP/1.1 200) 2 headers in 74 bytes (1 switches on core 0)

A specific key needs to be created for each cache entry, so that parameters which vary between requests without affecting the response are not treated as discriminating.
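As a minimal sketch of that idea, the helper below builds a cache key from a whitelist of query parameters and ignores everything else. The function name `catalog_cache_key`, the `catalog` prefix, and the choice of relevant parameters (`arch`, `build`, `language`, `package_update_channel`) are assumptions for illustration, not taken from the spkrepo code.

```python
from urllib.parse import parse_qsl, urlencode

# Assumption: only these parameters influence the generated catalog.
RELEVANT_PARAMS = ("arch", "build", "language", "package_update_channel")


def catalog_cache_key(query_string, prefix="catalog"):
    """Build a cache key from the relevant query parameters only.

    Parameters outside RELEVANT_PARAMS (unique, timezone, ...) are dropped,
    and the kept ones are emitted in a fixed order, so requests that differ
    only in ignored parameters or in parameter order share one key.
    """
    params = dict(parse_qsl(query_string, keep_blank_values=True))
    kept = [(name, params.get(name, "")) for name in RELEVANT_PARAMS]
    return prefix + "?" + urlencode(kept)
```

With this, two requests for the same `arch`, `build`, `language` and channel map to the same entry even if `unique` and `timezone` differ. How such a key gets plugged into the cache extension depends on the extension and version in use, so that part is left out here.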

ymartin59 avatar Jun 12 '19 19:06 ymartin59

Quick confirmations:

  1. "Here are examples of requests for which caching is inefficient at the moment:" Are these explicitly cache misses, for which the cache has no value, or is this the list of queries that take longer than some threshold to produce? The answer would indicate whether this is a case of optimizing the cache or optimizing the backend datastore (auditing indices in PostgreSQL).

  2. Are these GET traces the original path components of the original queries, or are they taken from somewhere in the caching layer (i.e. if you're using Redis, at an API call)? Assuming nas.get_catalog() is the wrapped function where the caching is either slow or missing, what values does get_catalog() receive? If the GET is actually from the cache query, then the values for language are indeed sneaking in somewhere, but I'm only guessing at which function needs help. Regarding "It is required to create specific key": if get_catalog() (or whichever function is rate-determining) isn't getting any information outside of { arch, build, language, beta }, then you've already narrowed it as far as you can go. You could try to send in an object with a __repr__(), but if the actual cache query is already down to just those four key items, then you can't gain anything from a __repr__().

  3. I assume simply extending the cache timeout beyond 10 minutes has not helped? Are you limited in the RAM you can use on the backend service?

  4. I assume we don't want to address this by asking Nginx to narrow the query to the critical things (i.e. a rewrite ... redirect; rule resulting in a more concise URL)? That is a more macroscopic approach to optimizing the cache, as per question 2, but at a different place in the stack. My primary day job involves Katran and Nginx, so I fall back on this.

  5. I assume ext.py sets cache = Cache(), which would normally default to NullCache, and that CACHE_TYPE is actually being set to redis via the salt orchestration? I'm assuming so, because you talk of Redis-server on Postgres.
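To make the __repr__() idea from question 2 concrete, here is a rough sketch. It assumes the memoizing layer derives its key from the repr() of the arguments; the `CatalogQuery` class, the `memoize_by_repr` decorator and `build_catalog` are hypothetical stand-ins, with a plain dict in place of the real cache backend.

```python
class CatalogQuery:
    """Hypothetical value object carrying a catalog request."""

    def __init__(self, arch, build, language, beta, **ignored):
        self.arch = arch
        self.build = build
        self.language = language
        self.beta = beta
        self.ignored = ignored  # e.g. unique, timezone: kept but not in the key

    def __repr__(self):
        # Only the four fields that matter appear in the representation.
        return "CatalogQuery(arch=%r, build=%r, language=%r, beta=%r)" % (
            self.arch, self.build, self.language, self.beta)


_cache = {}  # stand-in for the real cache backend


def memoize_by_repr(func):
    """Cache results under a key built from the function name and repr(arg)."""
    def wrapper(query):
        key = func.__name__ + ":" + repr(query)
        if key not in _cache:
            _cache[key] = func(query)
        return _cache[key]
    return wrapper


calls = []  # records how often the expensive function really runs


@memoize_by_repr
def build_catalog(query):
    calls.append(query)
    return {"arch": query.arch, "packages": []}
```

Two `CatalogQuery` objects that differ only in `timezone` or `unique` then hit the same cache entry. As noted above, this only helps if those extra values currently leak into the key.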

chickenandpork avatar Jun 12 '19 21:06 chickenandpork

For a single request, the main problem is that we have to pull the whole SQL database to be able to generate the catalog entry for a specific NAS. The main reason is the version numbers (DSM and packages), as SQL cannot compare versions the way we need it to. I've put some caching in place (in the app and on the CDN). This has gotten us this far, but it's just hiding the real issue.
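To illustrate the version problem in general terms (the exact comparison rules spkrepo needs are not spelled out here, so the version format below is an assumption): comparing version strings as plain text orders them lexicographically, which misplaces multi-digit components, whereas application code can compare the numeric parts.

```python
import re


def version_key(version):
    """Turn a version string into a tuple of ints for numeric comparison.

    Assumption: versions consist of numeric parts separated by '.' or '-'.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


versions = ["6.2", "6.10", "6.9"]

# Plain string ordering puts "6.10" before "6.2" and "6.9".
by_text = sorted(versions)

# Numeric ordering gives the expected sequence.
by_number = sorted(versions, key=version_key)
```

Here `by_text` is `["6.10", "6.2", "6.9"]` while `by_number` is `["6.2", "6.9", "6.10"]`.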

I think caching is deficient somewhere, but maybe it's just that there are too many different requests and the backend cannot handle the load. This needs investigating; maybe there is a quick win.

I'm not that sure of the added value of a relational database for this kind of use case. We rely on relationships for common stuff (displayname, description, etc.), but we could choose not to do so. We also rely on it for administration: activating or deactivating a whole package, or a whole version. But again, that could be done with some application code instead.

I would suggest a flat approach, schema or schemaless but flat. With a ton of validation to ensure consistency.

Diaoul avatar Jun 13 '19 04:06 Diaoul