
Encoding issue with tags in packed-refs file

Open benmss opened this issue 4 months ago • 1 comments

Tag names in the packed-refs file, which is created by the Git command git pack-refs, do not have to be UTF-8 encoded. GitPython fails to read such tags because it assumes the file is valid UTF-8.

The issue originates at this line, which opens the file as UTF-8 text: https://github.com/gitpython-developers/GitPython/blob/main/git/refs/symbolic.py#L124

with open(cls._get_packed_refs_path(repo), "rt", encoding="UTF-8") as fp:
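
One possible workaround, sketched here as an assumption and not as GitPython's actual fix, is to keep the text-mode read but pass errors="surrogateescape", so undecodable bytes are preserved instead of raising. The helper name read_packed_refs_lines and the sample tag name are hypothetical.

```python
import os
import tempfile


def read_packed_refs_lines(path):
    """Yield lines of a packed-refs file without failing on non-UTF-8 bytes.

    Hypothetical helper, not part of GitPython.
    """
    # errors="surrogateescape" maps each undecodable byte to a lone surrogate
    # (U+DC80..U+DCFF), so the original bytes can be recovered later.
    with open(path, "rt", encoding="UTF-8", errors="surrogateescape") as fp:
        for line in fp:
            yield line.rstrip("\n")


# Demo with a made-up packed-refs file containing a Latin-1 encoded tag name.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "packed-refs")
    with open(path, "wb") as fp:
        fp.write(b"# pack-refs with: peeled fully-peeled sorted \n")
        fp.write(b"0123456789abcdef0123456789abcdef01234567 refs/tags/caf\xe9\n")
    lines = list(read_packed_refs_lines(path))

# The escaped name round-trips back to the exact bytes stored on disk.
original_bytes = lines[1].encode("utf-8", "surrogateescape")
```

The trade-off is that the resulting strings may contain lone surrogates, which fail if later encoded as strict UTF-8 (for example when printed to some terminals).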

A reproducible example (using a repository with a non-UTF-8 tag):

git clone https://github.com/ACRA/acra
cd acra

Create a Python script with the following content:

import git
repo = git.Repo(".")
print(repo.tags)

Execute the script:

python script.py

Result: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 6216: invalid continuation byte
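
The same error can be reproduced without cloning anything. The minimal sketch below uses a hypothetical ref name in which byte 0xc3 (a valid UTF-8 lead byte) is followed by a byte that is not a UTF-8 continuation byte; the actual offending tag in the ACRA repository is not shown in this thread.

```python
# Hypothetical ref name: 0xc3 starts a two-byte UTF-8 sequence, but the
# next byte ("-") is not a continuation byte in the range 0x80..0xBF.
raw = b"refs/tags/v1-\xc3-release"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    reason = exc.reason          # human-readable cause reported by the codec
    bad_byte = raw[exc.start]    # the byte at which decoding failed

# Latin-1 maps every byte to a code point, so this decode cannot fail.
as_latin1 = raw.decode("latin-1")
```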

benmss avatar Aug 06 '25 03:08 benmss

Thanks for reporting!

GitPython absolutely has a problem with encodings, in that it should use bytes in most places where it now uses strings, and the same is true for filesystem paths.
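
As a rough illustration of the bytes-first approach, and not of GitPython's real code, a packed-refs parser could read the file in binary mode and keep ref names as bytes. The function name parse_packed_refs and the sample data are hypothetical; the sketch assumes the usual layout of "<sha> <refname>" lines, with "#" header lines and "^" peeled-tag lines skipped.

```python
def parse_packed_refs(data: bytes):
    """Return (sha, refname_bytes) pairs from packed-refs content.

    Hypothetical helper: ref names stay as bytes, so no encoding is assumed.
    """
    refs = []
    for line in data.splitlines():
        # Skip blank lines, the "# pack-refs with: ..." header, and
        # "^<sha>" lines that record the peeled object of the previous tag.
        if not line or line.startswith(b"#") or line.startswith(b"^"):
            continue
        sha, _, name = line.partition(b" ")
        # Object ids are hexadecimal, so decoding them as ASCII is safe.
        refs.append((sha.decode("ascii"), name))
    return refs


sample = (
    b"# pack-refs with: peeled fully-peeled sorted \n"
    b"0123456789abcdef0123456789abcdef01234567 refs/tags/caf\xe9\n"
    b"^89abcdef0123456789abcdef0123456789abcdef\n"
)
parsed = parse_packed_refs(sample)
```

Decoding would then happen only at the point of display, where the caller can choose an encoding and error policy.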

Byron avatar Aug 06 '25 03:08 Byron