
utf error when listing branches

Open lb-ronyeh opened this issue 2 years ago • 4 comments

Hi, when listing branches, we get:

"/usr/local/lib/python3.8/site-packages/pygit2/repository.py", line 1526, in __iter__
    for branch_name in self._repository.listall_branches(self._flag):
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 7-8: invalid continuation byte

Is it possible to include the invalid branch name in the error message, or to have an option to skip the invalid ones?

lb-ronyeh • Mar 30 '23 21:03

Same here, so how to fix it?

long2ice • Jun 01 '23 06:06

Don't use Python.

It's not designed to deal with strings in different encodings.

hramrach • Apr 29 '24 09:04

There is raw_listall_branches(flag: BranchType = BranchType.LOCAL) → list[bytes], which should make it possible to get the branch names as raw bytes, whatever garbage they contain, and to hit the decoding error in your application rather than inside pygit2, where it is not handled.
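
A minimal sketch of how an application might use the bytes that raw_listall_branches() returns to skip or report the invalid names asked about in the original post. It decodes each name separately and keeps the failures aside. The helper name split_branch_names is made up for this illustration and is not part of pygit2; the sample list stands in for a real repository so the snippet runs without one.

```python
from typing import Iterable, List, Tuple


def split_branch_names(raw_names: Iterable[bytes]) -> Tuple[List[str], List[bytes]]:
    """Split raw branch names into decodable and undecodable ones.

    Hypothetical helper for illustration, not a pygit2 API.
    """
    valid: List[str] = []
    invalid: List[bytes] = []
    for raw in raw_names:
        try:
            valid.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Keep the raw bytes so the caller can log or skip this name.
            invalid.append(raw)
    return valid, invalid


# Sample input; with pygit2 this would come from repo.raw_listall_branches().
sample = [b"feature\xc0\xc1", b"master"]
valid, invalid = split_branch_names(sample)
print("valid:", valid)
print("invalid:", invalid)
```

The invalid names stay available as bytes, so the caller decides whether to ignore them, log them, or raise an error that names the offending branch.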

hramrach • Apr 29 '24 17:04

import tempfile
import pygit2
import subprocess
import shutil
import sys

print(f"python: {sys.version}")
print(f"libgit2: {pygit2.LIBGIT2_VERSION}")
print(f"pygit2: {pygit2.__version__}")


repodir = tempfile.mkdtemp()
repo = pygit2.init_repository(repodir, bare=True)

sig = pygit2.Signature('Test User', '[email protected]')

data = 'blah blah master'
tree = repo.TreeBuilder()
tree.insert('file', repo.create_blob(data.encode()), pygit2.GIT_FILEMODE_BLOB)

master_commit_oid = repo.create_commit('HEAD', sig, sig, 'master commit', tree.write(), [])

repo.lookup_branch('master').set_target(master_commit_oid)

data = 'blah blah feature'
tree = repo.TreeBuilder()
tree.insert('file', repo.create_blob(data.encode()), pygit2.GIT_FILEMODE_BLOB)

feature_commit_oid = repo.create_commit('HEAD', sig, sig, 'feature commit', tree.write(), [master_commit_oid])

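# Use the git CLI with byte arguments to create a branch whose name
# contains bytes that are not valid UTF-8 (0xC0 and 0xC1).
# str(oid) gives the hex form of the commit id.
subprocess.run([b'git', b'--git-dir', repodir.encode(), b'branch', b'feature\xc0\xc1', str(feature_commit_oid).encode()])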

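# raw_listall_branches() returns the names as bytes, so nothing is decoded here.
for branch_name in repo.raw_listall_branches():
    print(branch_name)

# listall_branches() decodes the names as UTF-8 and raises on the invalid one.
try:
    for branch_name in repo.listall_branches():
        print(branch_name)
except Exception as e:
    print(e)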

shutil.rmtree(repodir)

Output:

python: 3.11.8 (main, Feb 29 2024, 12:19:47) [GCC]
libgit2: 1.8.0
pygit2: 1.14.1
b'feature\xc0\xc1'
b'master'
'utf-8' codec can't decode byte 0xc0 in position 7: invalid start byte
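
If the application needs a str for every branch, including the undecodable one in the output above, Python's standard codec error handlers offer a middle ground. This is only a sketch and not something pygit2 does for you: it assumes the name was obtained as bytes (for example from raw_listall_branches()) and relies on the built-in surrogateescape and backslashreplace handlers.

```python
raw = b"feature\xc0\xc1"  # the invalid branch name from the reproduction

# surrogateescape maps each undecodable byte to a lone surrogate code point,
# so the original bytes can be recovered by encoding with the same handler.
as_text = raw.decode("utf-8", errors="surrogateescape")
round_tripped = as_text.encode("utf-8", errors="surrogateescape")
print(round_tripped == raw)

# backslashreplace yields a printable form suited to logs and error messages.
printable = raw.decode("utf-8", errors="backslashreplace")
print(printable)
```

The surrogateescape form is not safe to print or write as strict UTF-8, so it is best kept internal and converted back to bytes before being handed to git again; the backslashreplace form is for display only and cannot be mapped back reliably.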

hramrach • May 08 '24 07:05