sphinx icon indicating copy to clipboard operation
sphinx copied to clipboard

[WIP] Re-implementation of the test suite.

Open picnixz opened this issue 11 months ago • 9 comments

Closes #11285.

Documentation will be in a separate PR. Note that this PR must be merged after #12017.

picnixz avatar Feb 26 '24 16:02 picnixz

(I'll convert it into a draft so that I don't accidentely merge it)

picnixz avatar Feb 26 '24 17:02 picnixz

I'm not entirely done with everything because I don't know whether there are still side-effects or not. For people who want to review:

  • First check the test files and see whether the removal of some flags make sense for you or not. Most of the time, keep in mind that:

    • tests should not modify their sources, unless isolated
    • tests that are parametrized and modify their sources either use test_params() or grouped isolation
  • After that, have a look at the public API of the fixtures.

  • Then, have a look at the private API. When we want to determine the test environment, we will do multiple things:

    • Try to find which testroot we need to use (by default, it's an empty one but Sphinx needs a testroot even if we don't build anything).
    • Find the destination where all the test files will be copied to. This is where the "isolation" processing occurs. I deprecated the usage of "srcdir" and "freshenv" because they can be entirely replaced by "isolate=True" or "isolate='grouped'" in a parametrized test (have a look at #12019 about its usage).
    • Then, you construct the arguments that will be used as is by SphinxTestApp. You can always construct manually your application but you must know what you are doing. If you use the app fixture, then you agree that a processing (possibly with dependencies) is done (and this was the idea of that fixture btw).
  • AFAICT, random order of serial execution is fine. I haven't tried to see if random order + parallel testing is fine because I'm not sure that the pytest stash is correctly shared. I'll need to add more tests for that.

  • If the PR is stalled for too long, I'll keep the new API but revert to a sequential execution. I tried to minimize the runtime using the new API so it should not affect the overall time. Currently, the suite sequentially executes on my machine in 70s. In parallel mode (12 workers), it boils down to ~10s. But since we want to speed-up the CI/CD and not the tests locally, I need to try with another branch perhaps only with sequential tests maybe.

picnixz avatar Feb 26 '24 17:02 picnixz

@jayaddison I think the PR is not ready yet, sorry for the false alarm. There are a bunch of weird errors that are still under fix.

picnixz avatar Feb 27 '24 15:02 picnixz

After a bit of investigation, I am bit confused on which direction to really take.

Until now, the testing fixtures were not documented at all and there were real bugs. We also didn't have any explicit public API so I'm not sure whether to introduce real breaking changes to fix everything is fine or not. Maybe we can delay this PR for 8.x since technically the testing plugin was still marked as experimental...

Personally, I think the usage of 'shared_result', 'srcdir' etc is very confusing. The same for the shared output. And also the same for the fact that the tests are physically not in the same place in the file. Combine all this and you end up with a test suite really hard to maintain.

I think I will do another PR where I completely break from what we are doing and just wait for 8.x. The best way to do it is probably to have another module and our test suite would just use that module. For compatibility reasons, we could keep the old modules but we'll need to refactor the test suite (depending on where new tests are added you can break the test suite without understanding !)

picnixz avatar Feb 27 '24 16:02 picnixz

Until now, the testing fixtures were not documented at all and there were real bugs. We also didn't have any explicit public API so I'm not sure whether to introduce real breaking changes to fix everything is fine or not. Maybe we can delay this PR for 8.x since technically the testing plugin was still marked as experimental...

I think we have a bit more flexibility to adjust the test suite between releases because it's not going to break downstream applications.

It's slightly more likely to break downstream packaging (distros that run the tests before redistributing Sphinx). Whether downstream maintainers prefer to do that at the same time as figure out other breaking API changes, or prefer to do it in isolation from public-facing changes.. I'm not certain, but I expect that keeping it separate might be easier for them (do we know distro maintainers for Sphinx?).

Also: the possible speedups for the test suite - both in continuous integration and locally - seem quite compelling to me. The time and cost reduction feel significant - but only if the results are understandable to maintainers and contributors.

Personally, I think the usage of 'shared_result', 'srcdir' etc is very confusing. The same for the shared output. And also the same for the fact that the tests are physically not in the same place in the file. Combine all this and you end up with a test suite really hard to maintain.

I experimented with removing the srcdir=... argument from all of the @pytest.mark.sphinx entries in the test suite a few moments ago; all except one continued to succeed on their own.. but running the entire test suite resulted in a large number of test failures.

So I guess srcdir is also providing some kind of isolation between unit tests?

I think I will do another PR where I completely break from what we are doing and just wait for 8.x. The best way to do it is probably to have another module and our test suite would just use that module. For compatibility reasons, we could keep the old modules but we'll need to refactor the test suite (depending on where new tests are added you can break the test suite without understanding !)

I don't understand this yet but I'll read more of the code, documentation and changes.

jayaddison avatar Feb 27 '24 19:02 jayaddison

Well sorry for being the harbinger of bad news but the way xdist schedules tests was not compatible with the current implementation. So the isolation will need to be rethought again...

Yes srcdir gives an isolation but the issue is that you need to give a unique ID... which comes at the cost of knowing all the existing IDs... or changing the pytest collection...

Anyway, I need more magic for this PR :( I thought about something else last night so perhaps it could work.

picnixz avatar Feb 29 '24 07:02 picnixz

I don't understand this yet but I'll read more of the code, documentation and changes.

Actually, the plugin we use is sphinx.testing.fixtures. So I'll just create something called sphinx.testing.plugin and use this one instead. But I'll keep the other one in releases so that people depending on it are not bothered by that.

picnixz avatar Feb 29 '24 08:02 picnixz

In addition:

  • some tests cannot be done offline (and honestly it took me a while to spot them...).
  • xdist is poorly documented and not typed at all which makes it very hard to understand.

I'll rewrite (again) the isolation logic to make it compatible with xdist =/

picnixz avatar Feb 29 '24 08:02 picnixz

@jayaddison I won't work on this for a few days and I probably open (another) PR since xdist worked differently to what I expected.

I have faced some issues with it and there are some subtleties that I needed to work around so ... it'll just be another implementation.

Good thing, however, is that I can reduce the number of files since I'll provide in the first time another testing plugin. Then, in a separate PR, I can change the tests bit by bit, by specifying the plugin to use in a local conftest. Finally, I'll remove those local conftests once the migration is done.

picnixz avatar Mar 04 '24 06:03 picnixz

Superseeded for now by #12089 and #12093.

We will gradually change the test suite using smaller PRs, until we eventually switch to xdist when everything is fine.

picnixz avatar Mar 14 '24 16:03 picnixz