rure-python
groupdict() returns only first character of the match if there is a single named capture group
This line:
[m.groupdict() for m in rure.compile('(?P<word>\w+)').finditer("hello there")]
Returns this:
[{'word': 'h'}, {'word': 't'}]
I would expect it to return this:
[{'word': 'hello'}, {'word': 'there'}]
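For comparison, here is the same pattern run through the standard library's re module. This is only a reference point, and it assumes rure is meant to mirror re's groupdict() behaviour; the variable names are mine.

```python
import re

# Same pattern and input as above, but using the standard library's re
# module instead of rure, to show the output I expected.
pattern = re.compile(r'(?P<word>\w+)')
expected = [m.groupdict() for m in pattern.finditer("hello there")]
print(expected)  # [{'word': 'hello'}, {'word': 'there'}]
```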
Adding a second named group fixes this issue for some reason:
[m.groupdict() for m in rure.compile('(?P<name>\w+)(?P<nomatch>)').finditer("hello there")]
Returns:
[{'name': 'hello', 'nomatch': ''}, {'name': 'there', 'nomatch': ''}]
I think the bug may be in this code: https://github.com/davidblewett/rure-python/blob/024bf4b8139c159fe2c9156c5acc9b61c9505866/rure/regex.py#L235-L251
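One guess at the cause, which I have not verified against the linked code: if groupdict() zips the group names with the result of group(*names), then a single name makes group() return a plain string rather than a tuple, and zip() pairs the name with the string's first character. The sketch below reproduces that pattern with the standard library's re module. `buggy_groupdict` and `fixed_groupdict` are hypothetical helpers written for illustration, not rure functions, and the sketch assumes rure's group() follows re's convention of returning a string for one argument and a tuple for several.

```python
import re

def buggy_groupdict(match, names):
    # Hypothetical helper, not rure code. With one name, group() returns a
    # str, so zip() pairs the name with the first character of that str.
    return dict(zip(names, match.group(*names)))

def fixed_groupdict(match, names):
    # Looking up each name separately avoids the str-vs-tuple difference.
    return {name: match.group(name) for name in names}

one = re.match(r'(?P<name>\w+)', "hello")
two = re.match(r'(?P<name>\w+)(?P<nomatch>)', "hello")

print(buggy_groupdict(one, ['name']))             # {'name': 'h'}
print(buggy_groupdict(two, ['name', 'nomatch']))  # {'name': 'hello', 'nomatch': ''}
print(fixed_groupdict(one, ['name']))             # {'name': 'hello'}
```

The output matches both symptoms in this report: one named group is truncated to its first character, and adding a second named group makes the result correct.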
This happens without .finditer() too:
>>> rure.compile('(?P<name>\w+)').match("hello").groupdict()
{'name': 'h'}
>>> rure.compile('(?P<name>\w+)(?P<nomatch>)').match("hello").groupdict()
{'name': 'hello', 'nomatch': ''}