pytest
pytest copied to clipboard
Fix collection of short paths on Windows
Passing a short path in the command line was causing the matchparts check to fail, because Path(short_path) != Path(long_path).
Using os.path.samefile as fallback ensures the comparison works on Windows when comparing short/long paths.
Fix #11895
The initial commit was used just to demonstrate the failing test on CI, however we should squash/merge this in the end.
Each call to samefile adds 2 stats, and stats are generally reputed to cause slowness. I wonder then, is it possible in Python to "expand" a short path to a long path, then we can run it on the inputs and proceed as usual, instead of using samefile?
I wonder then, is it possible in Python to "expand" a short path to a long path, then we can run it on the inputs and proceed as usual, instead of using samefile?
Should be possible, I will look into it, however I think it will change the resulting node ids.
Done, it was a bit more code but definitely won't impact performance.
@bluetech just stumbled into this:
https://github.com/pytest-dev/pytest/blob/7690a0ddf15785f0af0e9a4cf8524379c53cd061/src/_pytest/pathlib.py#L595-L605
This suggests we might be not considering other cases if we try to just expand the short names during collection. Perhaps we should bite the bullet and just go with the stat solution? Less code and will be the safest one, while I understand there's the performance concern.
@bluetech what do you think regarding my last point?
@nicoddemus I'll try to make an informed review, but if I take too long (like before 8.0.2) then please don't wait for me, I don't mean to block this.
I used https://github.com/nicoddemus/collection-test-bed to measure the performance so we can have some numbers to decide.
Using that configuration we have 10.5k tests 4 levels deep, with test files containing simple parametrized tests, here are the numbers in my Windows using pytest --co:
| branch | timing |
|---|---|
| main | 8.71s |
| stat solution | 9.23s |
| convert to long solution | 8.70s |
So indeed the stat call has some performance penalty, as expected. Keep in mind however that this is for a test suite which does not import anything else, it only imports pytest itself.
Testing in a more realist test suite from a project I have around with 1.3k tests, I couldn't measure any difference between the stat aproach and main, both clocking around the same 6.6s for the full collection.
So while we can see some difference in a synthetic test suite, with 10k tests but without any other imports, using a realistic test suite, with 1.3k tests and which imports application code, the difference is not perceptible at all.
I'm leaning towards the stat solution as it is simpler and also more correct in my opinion.
Going ahead and merge this then, we can revisit later if needed. :+1: