
✨ Add `needimport` caching and `needs_import_cache_size` configuration

Open chrisjsewell opened this issue 1 year ago • 3 comments

This PR introduces "lazy and size-bounded" caching for reading `needs.json` files in the `needimport` directive.

The directive reads from and writes to an in-memory cache, keyed on the file path and mtime. The cache is bounded by `needs_import_cache_size` (configurable by the user), which sets the maximum number of needs allowed in the cache, so that caching does not cause large increases in build memory usage.
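To make the idea concrete, here is a minimal sketch of such a cache. It is not the code from this PR: the class name `NeedsImportCache`, the helper `count_needs`, the least-recently-used eviction order, and the assumed `versions -> needs` layout of `needs.json` are all illustrative assumptions.

```python
import json
import os
from collections import OrderedDict


def count_needs(data):
    # Assumption: needs are stored under data["versions"][<version>]["needs"].
    return sum(len(v.get("needs", {})) for v in data.get("versions", {}).values())


class NeedsImportCache:
    """Hypothetical size-bounded cache for parsed needs.json files."""

    def __init__(self, max_needs):
        self.max_needs = max_needs
        # Maps (absolute path, mtime) -> (parsed data, number of needs).
        self._entries = OrderedDict()
        self._total = 0

    def load(self, path):
        # Lazy: a file is only read when a directive actually asks for it.
        key = (os.path.abspath(path), os.path.getmtime(path))
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key][0]

        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)

        count = count_needs(data)
        if count <= self.max_needs:
            self._entries[key] = (data, count)
            self._total += count
            # Evict the oldest entries until the bound holds again (assumed policy).
            while self._total > self.max_needs:
                _, (_, evicted) = self._entries.popitem(last=False)
                self._total -= evicted
        return data
```

Because the mtime is part of the key, a modified file simply misses the cache and is re-read. A file holding more needs than the bound is returned without being stored, so a single large import cannot push memory usage past the limit.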


Note, in #1148 there was discussion of "centralised, pre-caching", however, that is problematic because:

  1. It means all import sources have to be read in for every build/re-build, irrespective of whether they are actually used
  2. This can introduce a noticeable increase in memory usage and time for re-builds
  3. For parallel builds, all of this data is copied to every process, irrespective of whether that process actually uses it, again meaning potentially large multipliers of memory usage

chrisjsewell avatar Sep 12 '24 21:09 chrisjsewell

Codecov Report

Attention: Patch coverage is 90.90909% with 4 lines in your changes missing coverage. Please review.

Project coverage is 86.99%. Comparing base (4e10030) to head (3997aee). Report is 53 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sphinx_needs/directives/needimport.py | 90.47% | 4 Missing :warning: |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1297      +/-   ##
==========================================
+ Coverage   86.87%   86.99%   +0.11%     
==========================================
  Files          56       60       +4     
  Lines        6532     6998     +466     
==========================================
+ Hits         5675     6088     +413     
- Misses        857      910      +53     

| Flag | Coverage Δ |
|---|---|
| pytests | 86.99% <90.90%> (+0.11%) :arrow_up: |


codecov[bot] avatar Sep 12 '24 21:09 codecov[bot]

Hi @chrisjsewell, first of all, thanks for tackling this topic, even before I had conducted the performance measurement :-)

> Note, in #1148 there was discussion of "centralised, pre-caching", however, that is problematic because:
>
> 1. It means all import sources have to be read in for every build/re-build, irrespective of whether they actually may be used

How do we handle the case where the needs.json import source has changed? Sphinx will not consider the importing document changed, so do we need to trigger a full re-build in this case? Or does sphinx-needs have a way to handle this?

Is this maybe a general problem of needimport?

arwedus avatar Sep 13 '24 08:09 arwedus

> How do we handle that the needs.json import source has changed? Sphinx will not consider the importing document changed

@arwedus It informs Sphinx of the document's dependency on the file (similar to e.g. literalinclude): https://github.com/useblocks/sphinx-needs/blob/fcf40f8efdd87faab9e1ba00c4d8b6e3d8652c3d/sphinx_needs/directives/needimport.py#L144

So yes, Sphinx will check whether the file's mtime has changed and, if so, re-build all dependent documents.
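For anyone unfamiliar with the mechanism, here is a simplified model of that check in plain Python. It is not Sphinx's actual code: the function `outdated_documents` and both dictionaries are hypothetical stand-ins for the dependency records a directive would typically register through `env.note_dependency`.

```python
import os


def outdated_documents(dependencies, last_read):
    """Return the documents that need a re-build.

    dependencies: docname -> set of file paths the document depends on
    last_read:    docname -> timestamp of the last time the document was read
    (Both structures are assumptions made for this sketch.)
    """
    outdated = set()
    for docname, paths in dependencies.items():
        for path in paths:
            # A missing dependency, or one modified after the last read,
            # marks the document as outdated.
            if not os.path.isfile(path) or os.path.getmtime(path) > last_read[docname]:
                outdated.add(docname)
                break
    return outdated
```

With a dependency on `needs.json` recorded for each importing document, touching that file is enough to mark those documents for re-reading, without a full re-build.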

chrisjsewell avatar Sep 13 '24 09:09 chrisjsewell