sphinx
sphinx copied to clipboard
[8.x] Remove Intersphinx Legacy Format and Fix intersphinx cache loading in incremental builds
Fix #11466.
- Add test for intersphinx incremental builds.
So, the issue was that were ordering lists of (str, dict)
. However, this only occurred if we changed the URL of an intersphinx inventory and if there exists a cached inventory for the same project, e.g., building a project with intersphinx_mapping = {"foo": (url1, None)}
and then building it again with intersphinx_mapping = {"foo": (url2, None)}
was the issue. This bug only appeared in incremental build and by using -E
once, it would disappear so it could have been hard to track for users in general.
:white_check_mark: checked that the unit test does replicate the dict
comparison problem reported (first, second) in the bugreport, and that the code change resolves the problem.
So I think this is good - I'd like to understand a bit more of the functionality of this component while we have the opportunity if possible (hence the other comments/questions).
So I think this is good - I'd like to understand a bit more of the functionality of this component while we have the opportunity if possible (hence the other comments/questions).
Fine by me. I acknowledge that there are many obscure components and I only know about them by actually fighting against them. But feel free to ask questions.
I still have one unanswered question about whether to remove or not old cached entries that will not be used at runtime. As I said, building with URI_A, then URI_B, then URI_A again would actually lead to inconsistencies where we would never use the data from the latest URI_A and instead always use the data from URI_B.
I need to think about it this week-end, possibly layout out the corner cases. For now, I'll convert it into a draft PR and I'll make it a real PR when ready.
I've reviewed my comments and I don't understand them anymore. I think I either forgot some things or did not update them correctly. So I'll review from the beginning
Ok, I've read what I did and there are some unclear parts. I'll do something tomorrow because it's still something I need to check with the rest of the codebase. I'm no more sure about the invariants and uniqueness.
By working on this PR, I observe that having the legacy format for intersphinx mapping is making everything more complicated. As such, I'll simply assume that this PR is for 8.x and remove the old intersphinx format at the same time (making it work for both the old and the new formats is extremely hard and I prefer having breaking changes since it eases my life).
I think I'm done with this one ! my previous analysis was too complex and a bit faulty so here is the updated solution:
- Whenever we change a configuration value, the old entries are still kept if they are not overwritten. For instance, if I have
intersphinx_mapping = {'foo': ('foo.com', None), 'bar': ('bar.com', None)}
my cache will have the (project, uri)
entries ('foo', 'foo.com')
and ('bar', 'bar.com')
. If I change my config and rebuild but uses
intersphinx_mapping = {'foo': ('bar.com', None)}
my cache will have the (project, uri)
entries ('foo', 'foo.com')
and ('foo', 'bar.com')
. It does not make sense to keep the old link for the project, so we can remove it from the cache, even before querying the inventories (so no more race condition or whatever).
In addition, it does not make sense to have two projects with the same URI:
intersphinx_mapping = {'foo': ('foo.com', None), 'bar': ('foo.com', None)}
Currently we don't even ensure verify so when we do cache[uri] = ...
we have no guarantee that the operation is correct depending on which thread is doing the operation. So this is a new precondition on the intersphinx_mapping
that is now verified during the normalization (a warning will be emitted as well).
The legacy format was scheduled for deprecation in 6.2 for 8.x and since I removed what I could find related to that, this PR should only be either merged as the first one for 8.x (since it's deprecating an important format) or a separate PR focused on the validating and normalizing entries assuming the new format only.
This is next on my review list (I'd like to try minifying/reducing the changes to explore the differences).
I can refactor this PR and first remove the intersphinx legacy format in a separate one?
I can refactor this PR and first remove the intersphinx legacy format in a separate one?
That would help me, although it's more work for you - I'm OK with either approach!
I think that 'either' was unclear. I mean that it's either OK to separate the legacy-format-removal, or OK not to and I'll review this code as-is. Just let me know what you prefer and have time for.
Simce this PR still requires to get to 8.x either way, I think it's preferable to split it into two. That way we can always deprecate the things but still reconsider how to fix the other issue.
Ok, closing this one for now. I'll open a separate one based on #12083.