Index.__getitem__ breaks with bytes on Python 3

Open jnrbsn opened this issue 5 years ago • 1 comment

Index.__getitem__ fails on Python 3 when you pass it bytes.

>>> import pygit2
>>> index = pygit2.Index()
>>> index['foo']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/pygit2/index.py", line 94, in __getitem__
    raise KeyError(key)
KeyError: 'foo'
>>> index[b'foo']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/pygit2/index.py", line 88, in __getitem__
    elif not key >= 0:
TypeError: '>=' not supported between instances of 'bytes' and 'int'

This is because __getitem__ calls is_string, which checks for basestring on Python 2 and for str on Python 3. So both text and bytes work on Python 2, but on Python 3 a bytes key falls through to elif not key >= 0:, which fails because bytes and int can't be compared like that. Here's an excerpt of the relevant code:

...
if is_string(key):
    centry = C.git_index_get_bypath(self._index, to_bytes(key), 0)
elif not key >= 0:
    raise ValueError(key)
...

The way I would expect this to work is that if you pass it bytes, it would match directly against the raw bytes of the path in the index. After all, the data in the index file is binary, so passing bytes is more explicit than passing text and should be allowed.
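To make that expectation concrete, here is a minimal sketch of the key dispatch I have in mind. It is not pygit2 code: classify_index_key is a hypothetical helper, and it assumes text keys are encoded as UTF-8 (which I believe is what to_bytes does by default). It only decides how a key should be treated; the actual lookup would still go through git_index_get_bypath or a positional lookup.

```python
def classify_index_key(key):
    """Hypothetical helper: decide how an Index lookup key should be handled.

    Returns ('path', raw_bytes) for str/bytes keys and ('position', n)
    for non-negative integers.
    """
    if isinstance(key, bytes):
        # Bytes are passed through untouched so they match the raw path.
        return ("path", key)
    if isinstance(key, str):
        # Assumption: text paths are encoded as UTF-8.
        return ("path", key.encode("utf-8"))
    if isinstance(key, int):
        if key < 0:
            # Same behaviour as the excerpt above for negative positions.
            raise ValueError(key)
        return ("position", key)
    # Anything else gets a clear error instead of a failed comparison.
    raise TypeError(f"unsupported index key type: {type(key).__name__}")


print(classify_index_key("foo"))
print(classify_index_key(b"foo"))
print(classify_index_key(3))
```

With this shape, a bytes key never reaches the integer comparison, so the TypeError from the traceback above can't happen.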

jnrbsn avatar Aug 16 '19 21:08 jnrbsn

We don't support Python 2 any more.

What is the real use case for accepting bytes? Maybe there are index files out there that don't use UTF-8?

jdavid avatar Oct 26 '19 09:10 jdavid
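
On the non-UTF-8 question: Git stores paths as raw bytes, so a path written with another encoding can't always be expressed as a str key. A small sketch using only the standard library, where the Latin-1 file name is a made-up example rather than anything from a real repository:

```python
# Made-up example: "café.txt" as written by a tool using Latin-1.
raw_path = b"caf\xe9.txt"

# The raw bytes are not valid UTF-8, so they cannot be decoded strictly.
try:
    raw_path.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# Decoding as Latin-1 works, but re-encoding the text as UTF-8 produces
# different bytes, which would no longer match the raw path.
text = raw_path.decode("latin-1")
round_tripped = text.encode("utf-8")

print(decodable)
print(round_tripped == raw_path)
```

In that situation a text key encoded as UTF-8 would look up a different byte sequence than the one stored, and only a bytes key could address the entry exactly.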