cpython icon indicating copy to clipboard operation
cpython copied to clipboard

`os.path.relpath()` needlessly calls `abspath()` when given two paths with matching anchors

Open barneygale opened this issue 3 years ago • 3 comments

relpath() unconditionally calls abspath() to make both its arguments absolute before comparing them. This is necessary if the supplied arguments have different anchors, but it is not necessary when the paths' anchors match. We can save a call to getcwd() / _getfinalpathname() by skipping the abspath() call in this case. This should improve performance a little.

For example, neither of the following should require a call to abspath() internally:

relpath('foo/bar', 'baz/ding')
relpath('/etc/hosts', '/usr/local/bin')

barneygale avatar Nov 07 '22 12:11 barneygale

On Windows, abspath() also normalizes a path, such as stripping any trailing spaces and dots from the final component. For example:

>>> ntpath.relpath('spam/eggs. . .', 'spam/eggs/foo')
'..'

If we want to retain the ".." result here, then trailing dots and spaces will have to be stripped from the final component of each path. Otherwise the result will be "..\..\eggs. . .". There are also cases with DOS device names that will change if abspath() isn't called. For example, "foo/bar/nul" resolves to "\\.\nul".

eryksun avatar Nov 07 '22 23:11 eryksun

Funnily enough normpath() doesn't apply that normalisation. The support for Windows normalisation rules in Python seems pretty patchy IMO. There's no provision for it in pathlib, for example.

barneygale avatar Nov 09 '22 20:11 barneygale

Funnily enough normpath() doesn't apply that normalisation. The support for Windows normalisation rules in Python seems pretty patchy IMO. There's no provision for it in pathlib, for example.

I'd prefer for ntpath.normpath() to strip dots and spaces from the final component, except for a ".." name. This would also improve ntpath.abspath() on POSIX[^1].

Normalizing DOS device names is problematic because it depends on the Windows version. There's still consistent behavior for bare device names, such as "con" -> "\\.\con", but not "./con"[^2] or "con.txt".

[^1]: The fallback on POSIX also needs better support for drives, and to use "C:" as the working drive. It should get drive, path = splitdrive(path), and then resolve the absolute path as join(drive or 'C:', cwd, path). [^2]: ntpath.abspath() is implemented as _getfullpathname(normpath(path)). normpath() aggressively normalizes "./con" as "con", so abspath('./con') returns "\\.\con" even if the current Windows version doesn't reserve "./con". In general, removing a leading "." component is wrong on Windows. For example, "./C:spam" (a file named "C" with a data stream named "spam") differs from the drive-relative path "C:spam".

eryksun avatar Nov 09 '22 22:11 eryksun

Ack, this doesn't work due to differences in collapsing leading ../ segments:

======================================================================
ERROR: test_relpath (test.test_ntpath.TestNtpath.test_relpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/barney.gale/code/cpython/Lib/test/test_ntpath.py", line 689, in test_relpath
    tester('ntpath.relpath("a", "../b")', '..\\'+currentdir+'\\a')
  File "/Users/barney.gale/code/cpython/Lib/test/test_ntpath.py", line 57, in tester
    raise TestFailed("%s should return: %s but returned: %s" \
test.support.TestFailed: ntpath.relpath("a", "../b") should return: ..\@test_90826_tmpæ\a but returned: ..\..\a

======================================================================
FAIL: test_relpath (test.test_posixpath.PosixPathTest.test_relpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/barney.gale/code/cpython/Lib/test/test_posixpath.py", line 610, in test_relpath
    self.assertEqual(posixpath.relpath("a", "../b"), "../"+curdir+"/a")
AssertionError: '../../a' != '../bar/a'
- ../../a
+ ../bar/a

barneygale avatar Nov 12 '22 05:11 barneygale

Closing this issue as the handling of ../ segments in os.path and pathlib can't be reconciled in this case.

barneygale avatar Nov 12 '22 06:11 barneygale