Documentation for str.count() should mention the empty string case

Open MrHaxtar opened this issue 2 years ago • 4 comments

This is a bug in Python's count() function.

>>> a="I love python"
>>> b=a.count("")
>>> print(b)
14

Normally count() is used to count how many times a specific word occurs in a string. But if I pass "" it returns 14 instead of raising an error. Does anyone know how to handle this bug, or has it simply not been fixed yet by www.python.org?

  • PR: gh-99287
  • PR: gh-99339

MrHaxtar avatar Nov 07 '22 04:11 MrHaxtar

This is intentional, you can see that this case is explicitly handled here: https://github.com/python/cpython/blob/df3a6d9beb8a7a3fe87a6d4126384fd3e0213853/Objects/stringlib/count.h#L22

And this preserves behaviour going back at least 22 years, e.g. see: https://github.com/python/cpython/blob/d57fd91488212f5b891da5caf6bc04a907659cbd/Objects/unicodeobject.c#L1864

In case it helps explain the behaviour: "" is a substring of every string, so if `"" in string` is true, it makes sense that `string.count("") > 0`. While the code today is a little complicated, the old code in my second link is pretty easy to follow, and it shows what count is doing and why it gets 14.
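To make that concrete, here is a rough Python sketch of a naive counting loop with the same kind of empty-string special case. It is an illustration only, not a translation of the linked C source, and `naive_count` is a made-up name:

```python
def naive_count(text: str, sub: str) -> int:
    """Count non-overlapping occurrences of sub in text (illustrative sketch)."""
    if sub == "":
        # Empty needle: it matches at every index from 0 through len(text).
        return len(text) + 1
    count = 0
    i = 0
    last_start = len(text) - len(sub)
    while i <= last_start:
        if text.startswith(sub, i):
            count += 1
            i += len(sub)  # skip past the match so occurrences do not overlap
        else:
            i += 1
    return count


text = "I love python"
print("" in text)              # True
print(naive_count(text, ""))   # 14
print(text.count(""))          # 14
```

The special case is what keeps `in` and `count` consistent with each other: a substring that is "in" the string is never counted zero times.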

hauntsaninja avatar Nov 07 '22 06:11 hauntsaninja

This is a bug in Python's count() function.

It is not a bug. The empty string matches 14 positions of your string:

  1. An empty string matches before the "I"
  2. And after the "I" and before the first space.
  3. And after the space and before "l".
  4. And between the "l" and the "o".

and so on. If you count them, there are 14 positions where an empty string matches.
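The positions can also be listed directly. This is a minimal sketch that relies on `str.startswith` accepting a start offset, and on the empty string being a prefix at every offset up to and including `len(text)`:

```python
text = "I love python"

# Every offset at which the empty string "starts": 0 through len(text).
positions = [i for i in range(len(text) + 1) if text.startswith("", i)]

print(positions)       # [0, 1, 2, ..., 13]
print(len(positions))  # 14

# Show each gap as left-part | right-part.
for i in positions:
    print(repr(text[:i]), "|", repr(text[i:]))
```

There are 13 characters, so there are 12 gaps between characters plus one position at each end.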

See also this Stack Overflow answer.

stevendaprano avatar Nov 07 '22 10:11 stevendaprano

This issue comes up fairly regularly and many people seem to be surprised by it. I think it might help for the docs to explicitly mention that the empty string matches every position in the string, and so str.count("") returns one more than the length of the string.
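A quick check of that rule, plus one possible way to get an error for callers who really want one. The helper `count_strict` is hypothetical, not something in the standard library:

```python
# str.count("") is always len(s) + 1, including for the empty string itself.
for s in ["", "a", "I love python"]:
    assert s.count("") == len(s) + 1


def count_strict(text: str, sub: str) -> int:
    """Hypothetical helper: like str.count, but rejects an empty substring."""
    if not sub:
        raise ValueError("empty substring")
    return text.count(sub)


print(count_strict("I love python", "o"))  # 2
```

Whether to raise or to return 0 for an empty substring is a choice for the calling code; `str.count` itself keeps the `len(s) + 1` behaviour.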

stevendaprano avatar Nov 07 '22 10:11 stevendaprano

Thanks for the information

MrHaxtar avatar Nov 08 '22 02:11 MrHaxtar