pytest Fix collection of short paths on Windows

Passing a short path in the command line was causing the matchparts check to fail, because Path(short_path) != Path(long_path).

Using os.path.samefile as fallback ensures the comparison works on Windows when comparing short/long paths.

Fix #11895

Feb 05 '24 21:02 nicoddemus

The initial commit was used just to demonstrate the failing test on CI, however we should squash/merge this in the end.

Feb 05 '24 22:02 nicoddemus

Each call to samefile adds 2 stats, and stats are generally reputed to cause slowness. I wonder then, is it possible in Python to "expand" a short path to a long path, then we can run it on the inputs and proceed as usual, instead of using samefile?

Feb 06 '24 12:02 bluetech

I wonder then, is it possible in Python to "expand" a short path to a long path, then we can run it on the inputs and proceed as usual, instead of using samefile?

Should be possible, I will look into it, however I think it will change the resulting node ids.

Feb 06 '24 12:02 nicoddemus

Done, it was a bit more code but definitely won't impact performance.

Feb 09 '24 21:02 nicoddemus

@bluetech just stumbled into this:

https://github.com/pytest-dev/pytest/blob/7690a0ddf15785f0af0e9a4cf8524379c53cd061/src/_pytest/pathlib.py#L595-L605

This suggests we might be not considering other cases if we try to just expand the short names during collection. Perhaps we should bite the bullet and just go with the stat solution? Less code and will be the safest one, while I understand there's the performance concern.

Feb 10 '24 14:02 nicoddemus

@bluetech what do you think regarding my last point?

Feb 14 '24 18:02 nicoddemus

@nicoddemus I'll try to make an informed review, but if I take too long (like before 8.0.2) then please don't wait for me, I don't mean to block this.

Feb 16 '24 10:02 bluetech

I used https://github.com/nicoddemus/collection-test-bed to measure the performance so we can have some numbers to decide.

Using that configuration we have 10.5k tests 4 levels deep, with test files containing simple parametrized tests, here are the numbers in my Windows using pytest --co:

branch	timing
main	8.71s
stat solution	9.23s
convert to long solution	8.70s

So indeed the stat call has some performance penalty, as expected. Keep in mind however that this is for a test suite which does not import anything else, it only imports pytest itself.

Testing in a more realist test suite from a project I have around with 1.3k tests, I couldn't measure any difference between the stat aproach and main, both clocking around the same 6.6s for the full collection.

So while we can see some difference in a synthetic test suite, with 10k tests but without any other imports, using a realistic test suite, with 1.3k tests and which imports application code, the difference is not perceptible at all.

I'm leaning towards the stat solution as it is simpler and also more correct in my opinion.

Feb 17 '24 14:02 nicoddemus

Going ahead and merge this then, we can revisit later if needed. :+1:

Feb 23 '24 10:02 nicoddemus

pytest pytest copied to clipboard

Fix collection of short paths on Windows

pytest
pytest copied to clipboard