hgvs untangle lru_cache and persistent cache

The current lru_cache code is complicated because it mixes several intentions. Let's untangle them.

The two goals are: 1) an in-memory memoization layer to reduce remote data fetches; and 2) a persistent caching layer, primarily so that tests do not require network access.

Consequences of mixing these concerns are:

Can't use the lru_cache that ships with Python (functools.lru_cache in 3.x).
Configuration is confusing. uta connect() requires a cache mode that's different from the lru_cache mode, and neither checks whether the supplied value is valid.
As implemented, the hdp interface is also entangled with caching.
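To illustrate the validation gap, here is a minimal sketch of checking a cache mode up front. The `CacheMode` enum, its member names, and `parse_cache_mode` are hypothetical and chosen for illustration only; they are not the names hgvs or uta currently use.

```python
from enum import Enum


class CacheMode(Enum):
    # Hypothetical mode names, for illustration only.
    OFF = "off"
    LEARN = "learn"
    RUN = "run"


def parse_cache_mode(value):
    """Return the matching CacheMode, or fail loudly on an unknown value."""
    try:
        return CacheMode(value)
    except ValueError:
        allowed = ", ".join(m.value for m in CacheMode)
        raise ValueError(
            "unknown cache mode %r; expected one of: %s" % (value, allowed)
        ) from None
```

With something along these lines, a typo in the mode would fail at connect time rather than silently changing caching behavior.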

Outcomes that I'd like to see:

interface module should be about the interface only. It shouldn't mix caching, etc.
use existing caching and persistence tools where possible
separate caching from the actual data provider (uta, cdot, etc)
portable cache file format (across Python versions and platforms)
enable caches to be used as the sole source of data for testing
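As a rough sketch of "use existing caching tools" and "separate caching from the data provider": the standard library's `functools.lru_cache` can wrap a provider method from the outside, so the provider itself knows nothing about caching. `FakeProvider`, its `fetch` method, and `memoized` are hypothetical stand-ins, not part of hgvs.

```python
from functools import lru_cache


class FakeProvider:
    """Hypothetical stand-in for a remote data provider (e.g. uta)."""

    def __init__(self):
        self.calls = 0  # counts simulated remote fetches

    def fetch(self, ac):
        self.calls += 1
        return "data-for-" + ac


def memoized(provider, maxsize=128):
    """Wrap provider.fetch in the stdlib LRU cache; the provider is unchanged."""
    return lru_cache(maxsize=maxsize)(provider.fetch)


provider = FakeProvider()
fetch = memoized(provider)
fetch("tx1")  # miss: goes to the provider
fetch("tx1")  # hit: served from memory
```

Arguments must be hashable for `lru_cache` to work, which is one thing to check against the real interface signatures.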

Jun 26 '17 09:06 reece

Also: Investigate whether we can pin the pickle protocol version to 2 so that the same cache works for Python 2 and 3.
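For reference, pinning the protocol is a single argument to `pickle.dumps` (or `pickle.dump`). A small sketch, where the cache entry and its key layout are made up for illustration:

```python
import pickle

# Hypothetical cache entry; the key layout is an illustration only.
entry = {("fetch_seq", "example_ac", 0, 4): "ACGT"}

# Protocol 2 is the highest protocol that Python 2 can read.
payload = pickle.dumps(entry, protocol=2)
restored = pickle.loads(payload)
```

Pinning the protocol addresses only the file format. Differences in how Python 2 and 3 treat text and byte strings inside the pickled values would presumably still need checking.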

Jul 21 '18 21:07 reece

Outcomes that I'd like to see:

interface module should be about the interface only. It shouldn't mix caching, etc.
use existing caching and persistence tools where possible
separate caching from the actual data provider (current implementation achieves this)
same cache file for all Python versions
enable caches to be used as the sole source of data
support using NCBI gff3 alignment files directly

Ideas:

Provide distinct LRU (memory) and PersistentCache classes that implement the interface. Use these to provide layers, where nested layers are invoked for cache misses.
Stick to the mix-in model, where sequence fetching, gene data, and transcript data might come from pluggable sources
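A minimal sketch of the layering idea, assuming a single `get(key)` method as the shared interface and JSON as the on-disk format. All class names, the `get` signature, and the JSON choice are assumptions for illustration; the real hdp interface has many methods and richer values.

```python
import json
import os
import tempfile
from collections import OrderedDict


class FakeRemote:
    """Hypothetical stand-in for a remote data provider such as uta."""

    def __init__(self, data):
        self._data = data
        self.calls = 0  # counts simulated network fetches

    def get(self, key):
        self.calls += 1
        return self._data[key]


class LRULayer:
    """In-memory layer: answers hits itself, delegates misses to `inner`."""

    def __init__(self, inner, maxsize=128):
        self._inner = inner
        self._maxsize = maxsize
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        value = self._inner.get(key)
        self._items[key] = value
        if len(self._items) > self._maxsize:
            self._items.popitem(last=False)  # evict least recently used
        return value


class PersistentLayer:
    """On-disk layer; with inner=None it acts as the sole source of data."""

    def __init__(self, path, inner=None):
        self._path = path
        self._inner = inner
        self._items = {}
        if os.path.exists(path):
            with open(path) as fh:
                self._items = json.load(fh)

    def get(self, key):
        if key not in self._items:
            if self._inner is None:
                raise KeyError(key)  # offline mode: no fallback source
            self._items[key] = self._inner.get(key)
            with open(self._path, "w") as fh:
                json.dump(self._items, fh, sort_keys=True)
        return self._items[key]


# Usage: memory layer over disk layer over the (fake) remote source.
cache_path = os.path.join(tempfile.mkdtemp(), "cache.json")
remote = FakeRemote({"tx1": "ACGT"})
layered = LRULayer(PersistentLayer(cache_path, inner=remote))
first = layered.get("tx1")   # miss in both layers, one remote fetch
second = layered.get("tx1")  # served from memory

# A later test run can read the same file with no provider at all.
offline = PersistentLayer(cache_path)
replayed = offline.get("tx1")
```

Rewriting the whole file on every miss is only reasonable for small test caches; a real implementation would likely batch writes or use a proper key-value store.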

Jul 22 '18 18:07 reece

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Dec 28 '23 01:12 github-actions[bot]

This issue was closed because it has been stalled for 7 days with no activity.

Jan 04 '24 01:01 github-actions[bot]

This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.

Feb 19 '24 17:02 reece