pathlib .suffix, .suffixes, .stem unexpected behavior for pathname with trailing dot
| BPO | 38624 |
|---|---|
| Nosy | @pitrou |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
assignee = None
closed_at = None
created_at = <Date 2019-10-28.22:54:26.232>
labels = ['3.8', 'type-bug', 'library']
title = 'pathlib .suffix, .suffixes, .stem unexpected behavior for pathname with trailing dot'
updated_at = <Date 2019-10-29.01:59:29.319>
user = 'https://bugs.python.org/inyeollee'
bugs.python.org fields:
activity = <Date 2019-10-29.01:59:29.319>
actor = 'xtreak'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-10-28.22:54:26.232>
creator = 'inyeollee'
dependencies = []
files = []
hgrepos = []
issue_num = 38624
keywords = []
message_count = 1.0
messages = ['355600']
nosy_count = 2.0
nosy_names = ['pitrou', 'inyeollee']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue38624'
versions = ['Python 3.8']
Linked PRs
- gh-118952
In general, pathlib in Python 3.8 treats the dot between the path stem and the suffix as part of the suffix:
>>> a = pathlib.Path('foo.txt')
>>> a.stem, a.suffix
('foo', '.txt')
>>> a.with_suffix('')
PosixPath('foo')
However, if the pathname ends with a dot, pathlib treats the trailing dot as part of the stem, not as part of the suffix:
>>> b = pathlib.Path('bar.')
>>> b.stem, b.suffix
('bar.', '')
This looks like a bug. It should return ('bar', '.'). There are a couple of unexpected behaviors related to this:
>>> pathlib.Path('foo.txt').with_suffix('.')
...
ValueError: Invalid suffix '.' <== Why not PosixPath('foo.') ?
>>> c = pathlib.Path('foo..')
>>> c.stem, c.suffix, c.suffixes
('foo..', '', [])
I think the above should return ('foo.', '.', ['.', '.'])
Tested with macOS 10.15 and Python 3.8. Python 3.7 behaves the same.
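The behavior requested in the report can be sketched with plain string operations. This is only an illustration of the expected results, not pathlib's implementation; `expected_parts` is a hypothetical helper, and the choice to ignore leading dots (so that names like '.bashrc' have no suffix) is an assumption carried over from how pathlib already treats such names.

```python
def expected_parts(name):
    """Hypothetical helper: return (stem, suffix, suffixes) as the report expects."""
    # Assumption: leading dots never start a suffix (e.g. '.bashrc').
    body = name.lstrip('.')
    lead = name[:len(name) - len(body)]
    i = body.rfind('.')
    if i == -1:
        stem, suffix = name, ''
    else:
        # The last dot, even a trailing one, starts the suffix.
        stem, suffix = lead + body[:i], body[i:]
    # Every dot after the first component starts one entry in suffixes.
    suffixes = ['.' + part for part in body.split('.')[1:]]
    return stem, suffix, suffixes


for name in ('foo.txt', 'bar.', 'foo..'):
    print(repr(name), expected_parts(name))
```

Under these assumptions, 'bar.' yields ('bar', '.', ['.']) and 'foo..' yields ('foo.', '.', ['.', '.']), matching the values proposed above.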
While working on this bug, I found that it cannot be fixed without first deciding whether a file name with trailing dots meaningfully contains a suffix. Currently, Python implicitly treats a string with a trailing dot as an invalid suffix, but does not raise an error:
# Yes, obvious case
>>> Path("foo.tar").suffix
'.tar'
# Trailing dot is not a suffix
>>> Path("foo.tar.").suffix
''
# Consistent with before, but confusingly you can create one with `with_suffix()`
>>> Path("foo").with_suffix(".tar.")
PosixPath('foo.tar.')
>>> PosixPath('foo.tar.').with_suffix(".tar.").suffix
''
So, this isn't a bug if the rule is "a trailing dot is not a valid suffix". However, with_suffix() seemingly ignores this rule, since it accepts a string with a trailing dot. At the same time, p.with_suffix(s).suffix == s does not hold in general anyway, since you can pass in compound suffixes such as ".tar.gz".
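The point about compound suffixes can be checked directly. The snippet below is a small illustration using `PurePosixPath` so that it does not depend on the platform; the variable names are only for this example.

```python
from pathlib import PurePosixPath

p = PurePosixPath('foo')
s = '.tar.gz'

# with_suffix() accepts a compound suffix...
q = p.with_suffix(s)

# ...but .suffix only reports the last component, so the round trip fails.
round_trips = (q.suffix == s)

print(q, q.suffix, q.suffixes, round_trips)
```

Here `q` is 'foo.tar.gz' while `q.suffix` is '.gz', so `p.with_suffix(s).suffix == s` is false even without any trailing dots involved.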
My proposed solution (outside the scope of this issue) is to change with_suffix such that it raises an exception when passed a string with trailing dots.
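As a rough sketch of what that proposal could look like from the caller's side, the check can be written as a wrapper around the existing method. `strict_with_suffix` is a hypothetical name used only for illustration; it is not part of pathlib, and the error message is an assumption modeled on the existing "Invalid suffix" wording.

```python
from pathlib import PurePosixPath


def strict_with_suffix(path, suffix):
    """Hypothetical wrapper: reject suffixes that end in a dot, then delegate."""
    if suffix.endswith('.'):
        raise ValueError(f"Invalid suffix {suffix!r}")
    return path.with_suffix(suffix)


print(strict_with_suffix(PurePosixPath('foo'), '.tar'))

try:
    strict_with_suffix(PurePosixPath('foo'), '.tar.')
except ValueError as exc:
    print('rejected:', exc)
```

An empty suffix still passes through, so removing a suffix with `strict_with_suffix(p, '')` keeps working.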
I agree with the original bug report - Path("foo.tar.").suffix should give you '.'. That would match os.path.splitext() behaviour.
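For reference, the os.path.splitext() behaviour mentioned here can be seen with a few sample names. The names are taken from this thread, plus '.bashrc' as an extra case to show how leading dots are handled.

```python
import os.path

# splitext() splits at the last dot, but ignores dots that lead the name.
for name in ('foo.txt', 'foo.tar.', 'bar.', 'foo..', '.bashrc'):
    root, ext = os.path.splitext(name)
    print(f'{name!r}: root={root!r}, ext={ext!r}')
```

In particular, `os.path.splitext('foo.tar.')` returns `('foo.tar', '.')`, which is the result the original report asks .stem and .suffix to match.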
@barneygale Good point, then it must be that
>>> Path("foo.tar.").suffixes
[".tar", "."]
We would also need to remove the restriction on with_suffix("."):
>>> Path("foo.tar").with_suffix(".")
Path("foo.")
More discussion in #100157, which I've closed as a duplicate.
PR available: https://github.com/python/cpython/pull/118952