pathlib .suffix, .suffixes, .stem unexpected behavior for pathname with trailing dot

Open dd5ce2ff-95a1-4e28-ace2-6d841dd59913 opened this issue 6 years ago • 6 comments

BPO 38624
Nosy @pitrou

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-10-28.22:54:26.232>
labels = ['3.8', 'type-bug', 'library']
title = 'pathlib .suffix, .suffixes, .stem unexpected behavior for pathname with trailing dot'
updated_at = <Date 2019-10-29.01:59:29.319>
user = 'https://bugs.python.org/inyeollee'

bugs.python.org fields:

activity = <Date 2019-10-29.01:59:29.319>
actor = 'xtreak'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-10-28.22:54:26.232>
creator = 'inyeollee'
dependencies = []
files = []
hgrepos = []
issue_num = 38624
keywords = []
message_count = 1.0
messages = ['355600']
nosy_count = 2.0
nosy_names = ['pitrou', 'inyeollee']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue38624'
versions = ['Python 3.8']

Linked PRs

  • gh-118952

In Python 3.8, pathlib generally treats the dot between a path's stem and suffix as part of the suffix:

>>> a = pathlib.Path('foo.txt')
>>> a.stem, a.suffix
('foo', '.txt')
>>> a.with_suffix('')
PosixPath('foo')

However, if the pathname ends with a dot, pathlib treats the trailing dot as part of the stem, not the suffix:

>>> b = pathlib.Path('bar.')
>>> b.stem, b.suffix
('bar.', '')

This looks like a bug. It should return ('bar', '.'). There are a couple of unexpected behaviors related to this:

>>> pathlib.Path('foo.txt').with_suffix('.')
...
ValueError: Invalid suffix '.' <== Why not PosixPath('foo.') ?
>>> c = pathlib.Path('foo..')
>>> c.stem, c.suffix, c.suffixes
('foo..', '', [])

I think the above should return ('foo.', '.', ['.', '.']).

Tested with macOS 10.15 and Python 3.8. Python 3.7 behaves the same.

In my attempt to work on this bug, I found that a fix is not possible without first deciding whether file names with trailing dots meaningfully contain a suffix. Currently, Python implicitly treats a trailing dot as an invalid suffix, but does not raise an error:

# Yes, obvious case
>>> Path("foo.tar").suffix
'.tar'

# Trailing dot is not a suffix
>>> Path("foo.tar.").suffix
''

# Consistent with before, but confusingly you can create one with `with_suffix()`
>>> Path("foo").with_suffix(".tar.")
PosixPath('foo.tar.')
>>> PosixPath('foo.tar.').with_suffix(".tar.").suffix
''

So, this isn't a bug if the rule is "a trailing dot is not a valid suffix". However, with_suffix() seemingly ignores this rule, since it accepts a string with a trailing dot. At the same time, p.with_suffix(s).suffix == s does not hold in general, since you can pass in compound suffixes.
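The point about compound suffixes can be checked with a short sketch. The helper name `suffix_round_trips` is hypothetical and only used for illustration; `PurePosixPath` is used so the result does not depend on the platform.

```python
from pathlib import PurePosixPath


def suffix_round_trips(name, suffix):
    """Hypothetical helper: does with_suffix(suffix).suffix give back suffix?"""
    return PurePosixPath(name).with_suffix(suffix).suffix == suffix


# A simple suffix survives the round trip.
print(suffix_round_trips("foo", ".gz"))      # True

# A compound suffix does not: .suffix only reports the last component, '.gz'.
print(suffix_round_trips("foo", ".tar.gz"))  # False
```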

My proposed solution (outside the scope of this issue) is to change with_suffix() so that it raises an exception when passed a string with a trailing dot.
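As a rough sketch of what that proposal would mean for callers, here is a standalone wrapper rather than a change to pathlib itself. The function name `strict_with_suffix` and the error message are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def strict_with_suffix(path, suffix):
    """Hypothetical wrapper: like path.with_suffix(), but reject a trailing dot."""
    if suffix.endswith("."):
        # Assumed error message; pathlib itself does not raise this.
        raise ValueError(f"Invalid suffix {suffix!r}: trailing dot")
    return path.with_suffix(suffix)


# Ordinary suffixes pass straight through to with_suffix().
print(strict_with_suffix(PurePosixPath("foo.txt"), ".md"))  # foo.md

# A suffix ending in a dot is rejected before pathlib sees it.
try:
    strict_with_suffix(PurePosixPath("foo"), ".tar.")
except ValueError as exc:
    print(exc)
```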

ketozhang, Aug 14 '22 23:08

I agree with the original bug report - Path("foo.tar.").suffix should give you '.'. That would match os.path.splitext() behaviour.
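For reference, the os.path.splitext() behaviour mentioned here can be checked directly. The helper `split_like_splitext` below is a hypothetical name; it applies splitext() to the final path component only, to show what stem and suffix would look like if pathlib followed the same rule.

```python
import os.path
from pathlib import PurePosixPath


def split_like_splitext(path):
    """Hypothetical helper: (stem, suffix) of the last component via os.path.splitext()."""
    name = PurePosixPath(path).name
    return os.path.splitext(name)


print(split_like_splitext("foo.tar."))  # ('foo.tar', '.')
print(split_like_splitext("bar."))      # ('bar', '.')
print(split_like_splitext("foo.txt"))   # ('foo', '.txt')
print(split_like_splitext(".bashrc"))   # ('.bashrc', ''), leading dots are not a suffix
```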

barneygale, Aug 17 '22 05:08

@barneygale Good point. In that case, we would also expect:

>>> Path("foo.tar.").suffixes
[".tar", "."]

We would also need to remove the restriction on with_suffix("."):

>>> Path("foo.tar").with_suffix(".")
Path("foo.")
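One way to sketch the `suffixes` behaviour requested in this thread is to apply os.path.splitext() repeatedly to a bare file name. The function `proposed_suffixes` is a hypothetical illustration of the expected results, not the implementation in pathlib or in any linked PR, and it assumes the input contains no directory separators.

```python
import os.path


def proposed_suffixes(name):
    """Hypothetical sketch: peel suffixes off a bare file name, right to left."""
    suffixes = []
    stem, suffix = os.path.splitext(name)
    while suffix:
        suffixes.append(suffix)
        stem, suffix = os.path.splitext(stem)
    suffixes.reverse()  # collected right to left, so restore reading order
    return suffixes


print(proposed_suffixes("foo.tar."))    # ['.tar', '.']
print(proposed_suffixes("foo.."))       # ['.', '.']
print(proposed_suffixes("foo.tar.gz"))  # ['.tar', '.gz']
```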

ketozhang, Aug 18 '22 05:08

More discussion in #100157, which I've closed as a duplicate.

barneygale, Jan 13 '24 20:01

PR available: https://github.com/python/cpython/pull/118952

barneygale, May 11 '24 20:05