rure-python

groupdict() returns only first character of the match if there is a single named capture group

Open simonw opened this issue 2 years ago • 2 comments

This line:

import rure
[m.groupdict() for m in rure.compile(r'(?P<word>\w+)').finditer("hello there")]

Returns this:

[{'word': 'h'}, {'word': 't'}]

I would expect it to return this:

[{'word': 'hello'}, {'word': 'there'}]
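For comparison, the standard library's re module gives the expected result for the same pattern. This is only a reference sketch, assuming rure is meant to mirror re's groupdict() behaviour here; the names pattern and results are just for illustration.

```python
import re

# Same pattern and input as above, but with the stdlib re module.
pattern = re.compile(r'(?P<word>\w+)')
results = [m.groupdict() for m in pattern.finditer("hello there")]

print(results)  # [{'word': 'hello'}, {'word': 'there'}]
```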

Adding a second named group works around the issue, for some reason:

[m.groupdict() for m in rure.compile(r'(?P<name>\w+)(?P<nomatch>)').finditer("hello there")]

Returns:

[{'name': 'hello', 'nomatch': ''}, {'name': 'there', 'nomatch': ''}]

simonw, Jul 20 '22 17:07

I think the bug may be in this code: https://github.com/davidblewett/rure-python/blob/024bf4b8139c159fe2c9156c5acc9b61c9505866/rure/regex.py#L235-L251
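One pattern that would produce exactly this symptom, as a guess that has not been checked against the linked code: group() returns a plain string when given one name and a tuple when given several, so zipping the names against its result pairs a single name with the first character of the string. The sketch below reproduces that with the stdlib re module; groupdict_buggy and groupdict_fixed are hypothetical helpers, not functions from rure.

```python
import re


def groupdict_buggy(match):
    # Hypothetical helper, NOT rure's actual code.
    names = list(match.re.groupindex)
    # group() returns a str for one argument and a tuple for several,
    # so with a single name zip() iterates over the string's characters.
    values = match.group(*names)
    return dict(zip(names, values))


def groupdict_fixed(match):
    # Look each name up individually so the result shape never changes.
    names = list(match.re.groupindex)
    return {name: match.group(name) for name in names}


single = re.match(r'(?P<name>\w+)', "hello")
double = re.match(r'(?P<name>\w+)(?P<nomatch>)', "hello")

print(groupdict_buggy(single))  # {'name': 'h'}
print(groupdict_buggy(double))  # {'name': 'hello', 'nomatch': ''}
print(groupdict_fixed(single))  # {'name': 'hello'}
```

If the real cause is something like this, it would also explain why adding a second named group hides the problem.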

simonw, Jul 20 '22 17:07

This happens without .finditer() too:

>>> import rure
>>> rure.compile(r'(?P<name>\w+)').match("hello").groupdict()
{'name': 'h'}
>>> rure.compile(r'(?P<name>\w+)(?P<nomatch>)').match("hello").groupdict()
{'name': 'hello', 'nomatch': ''}

simonw, Jul 20 '22 17:07