TagStudio
[Bug]: Crash when adding file to library with Japanese filename
Checklist
- [x] I am using an up-to-date version.
- [x] I have read the documentation.
- [x] I have searched existing issues.
TagStudio Version
Main branch
Operating System & Version
Windows 10 22H2
Description
When adding a file to a library, or creating a new library, that includes a file with a Japanese name, the program freezes and an error appears in the console.
Expected Behavior
The file should be added without errors.
Steps to Reproduce
- Create a new library
- Include a file in the library called "こんにちは.png" (the offending name in my case)
- Refresh directories
Logs
2025-11-10 21:33:32 [info ] [Ignore] No updates to the .ts_ignore detected last_mtime=1762809066.321023 library=WindowsPath('C:/Users/user/Pictures') new_mtime=1762809066.321023
2025-11-10 21:33:32 [info ] [Refresh: Using ripgrep for scanning]
Exception in thread Thread-46 (_readerthread):
Traceback (most recent call last):
  File "c:\python312\Lib\threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "c:\python312\Lib\threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "c:\python312\Lib\subprocess.py", line 1599, in _readerthread
    buffer.append(fh.read())
                  ^^^^^^^^^
  File "c:\python312\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1: character maps to <undefined>
I've had similar issues when adding files whose names contained Unicode apostrophes (’ instead of ') and em dashes (– instead of -), except that instead of crashing, the program displayed broken entries with weird-looking paths.
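The traceback above shows the output being decoded with cp1252, which would be consistent with both symptoms. Here is a minimal sketch, assuming the scanner's output is UTF-8 bytes (not confirmed in this thread); `try_cp1252` is a hypothetical helper for illustration, not TagStudio code:

```python
from typing import Optional


def try_cp1252(text: str) -> Optional[str]:
    """Encode text as UTF-8, then decode those bytes as cp1252.

    Returns the (possibly garbled) result, or None if decoding fails.
    """
    raw = text.encode("utf-8")
    try:
        return raw.decode("cp1252")
    except UnicodeDecodeError:
        return None


# Japanese name: the second UTF-8 byte of "こ" is 0x81, which Python's
# cp1252 codec leaves undefined, so decoding raises.
japanese = try_cp1252("こんにちは.png")

# Right single quotation mark (U+2019): all three UTF-8 bytes are defined
# in cp1252, so decoding succeeds but yields mojibake instead of raising.
apostrophe = try_cp1252("it’s.png")
```

Under that assumption, the Japanese name fails outright while the apostrophe name decodes to a wrong but valid string, which would match the crash in one case and the broken paths in the other.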
The problem disappears when using wcmatch instead of ripgrep. My first guess would be that on Windows there's an issue with the encoding of the subprocess used to call ripgrep.
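If the subprocess encoding is indeed the cause, one possible direction is to pass an explicit encoding when reading the child's output rather than relying on the Windows code page. This is only a sketch: a child Python process stands in for ripgrep, and `read_child_output` is a hypothetical helper, not how TagStudio actually invokes ripgrep.

```python
import subprocess
import sys


def read_child_output(data: bytes, encoding: str) -> str:
    """Spawn a child that writes `data` to stdout and decode it with `encoding`.

    The child is a stand-in for ripgrep (an assumption for illustration).
    """
    child_code = (
        "import sys; "
        f"sys.stdout.buffer.write(bytes.fromhex('{data.hex()}'))"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        encoding=encoding,  # explicit, so the locale code page is not used
        check=True,
    )
    return result.stdout


# UTF-8 bytes of the file name from the report, decoded explicitly as UTF-8.
decoded = read_child_output("こんにちは.png".encode("utf-8"), "utf-8")
```

Passing `encoding="utf-8"` (optionally with `errors="replace"`) keeps the decoding independent of the system locale. Whether ripgrep always emits UTF-8 paths on Windows would still need to be checked.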
I found an issue related to subprocess handling on Windows at https://github.com/python/cpython/issues/105312, which could be related, but I haven't looked into it beyond that since I don't have Python set up on Windows.
Using Linux 6.17.7-arch1-1 and ripgrep, I'm unable to replicate the issue. The Japanese characters, Unicode apostrophe, and em dash display as expected for me (although my font doesn't support Japanese characters, so they just display as missing characters).
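The platform difference would fit with `subprocess` falling back to the locale's preferred encoding in text mode when no `encoding` is given: usually UTF-8 on Linux, and often a legacy code page such as cp1252 on Windows. A quick way to check what a given machine would use is sketched below; `default_text_encoding` is a hypothetical helper name.

```python
import codecs
import locale


def default_text_encoding() -> str:
    """Return the normalised name of the locale's preferred text encoding."""
    name = locale.getpreferredencoding(False)
    # codecs.lookup normalises aliases, e.g. "UTF-8" becomes "utf-8".
    return codecs.lookup(name).name


print(default_text_encoding())
```

Running this on the affected Windows machine and on the Linux machine should show whether the two really decode subprocess output differently by default.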