[Bug]: Crash when adding file to library with Japanese filename

Open SkeleyM opened this issue 1 month ago • 2 comments

Checklist

  • [x] I am using an up-to-date version.
  • [x] I have read the documentation.
  • [x] I have searched existing issues.

TagStudio Version

Main branch

Operating System & Version

Windows 10 22H2

Description

When adding a file to a library, or creating a new library, that includes a file with a Japanese name, the program freezes and an error appears in the console.

Expected Behavior

The file should be added without errors.

Steps to Reproduce

  1. Create a new library
  2. Include a file in the library called "こんにちは.png" (the offending name in my case)
  3. Refresh directories

Logs

2025-11-10 21:33:32 [info ] [Ignore] No updates to the .ts_ignore detected last_mtime=1762809066.321023 library=WindowsPath('C:/Users/user/Pictures') new_mtime=1762809066.321023
2025-11-10 21:33:32 [info ] [Refresh: Using ripgrep for scanning]
Exception in thread Thread-46 (_readerthread):
Traceback (most recent call last):
  File "c:\python312\Lib\threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "c:\python312\Lib\threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "c:\python312\Lib\subprocess.py", line 1599, in _readerthread
    buffer.append(fh.read())
                  ^^^^^^^^^
  File "c:\python312\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1: character maps to <undefined>
Error calling Python override of QRunnable::run():
Traceback (most recent call last):
  File "C:\Users\user\Documents\TagStudio\src\tagstudio\qt\utils\custom_runnable.py", line 18, in run
    self.function()
  File "C:\Users\user\Documents\TagStudio\src\tagstudio\qt\utils\function_iterator.py", line 21, in run
    for i in self.iterable():
             ^^^^^^^^^^^^^^^
  File "C:\Users\user\Documents\TagStudio\src\tagstudio\qt\ts_qt.py", line 1005, in <lambda>
    lambda lib=unwrap(self.lib.library_dir): tracker.refresh_dir(lib)  # noqa: B008
                                             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\Documents\TagStudio\src\tagstudio\core\library\refresh.py", line 71, in refresh_dir
    dir_list: list[str] | None = self.__get_dir_list(library_dir, ignore_patterns)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\Documents\TagStudio\src\tagstudio\core\library\refresh.py", line 116, in __get_dir_list
    return result.stdout.splitlines()  # pyright: ignore [reportReturnType]
           ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'splitlines'

SkeleyM avatar Nov 10 '25 21:11 SkeleyM

I've had similar issues when adding files that contained Unicode apostrophes (’ instead of ') and em dashes (— instead of -), except instead of crashing, it displayed broken entries with weird-looking paths. [Image]

The problem disappears when using wcmatch instead of ripgrep. My first guess would be that on Windows there's an issue with the encoding of the subprocess used to call ripgrep.
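To illustrate that guess, here is a minimal sketch of what would happen if UTF-8 bytes from the child process were decoded as cp1252, the codec named in the traceback. It assumes ripgrep emits the paths as UTF-8, which has not been verified here; `decode_as_cp1252` is a hypothetical helper, not TagStudio code.

```python
def decode_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as cp1252 (the mismatch suspected above)."""
    return text.encode("utf-8").decode("cp1252")


# The Japanese filename from the report: the second UTF-8 byte is 0x81,
# which has no mapping in Python's cp1252 table, so decoding raises.
try:
    decode_as_cp1252("こんにちは.png")
    crash = None
except UnicodeDecodeError as exc:
    crash = exc

# A Unicode apostrophe (U+2019) and an em dash (U+2014) happen to decode
# without error, but to the wrong characters (mojibake), not a crash.
mojibake_apostrophe = decode_as_cp1252("\u2019")
mojibake_dash = decode_as_cp1252("\u2014")

print(crash)
print(repr(mojibake_apostrophe), repr(mojibake_dash))
```

If the assumption holds, this would account for both symptoms: a hard failure for the Japanese name and garbled but non-crashing paths for the apostrophe and dash.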

I found an issue about subprocess handling on Windows at https://github.com/python/cpython/issues/105312 which could be related, but I haven't looked into it beyond that since I don't have Python set up on Windows.
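One possible workaround, sketched below under the same unverified assumption that the child process writes UTF-8, is to pass an explicit `encoding` to `subprocess.run` so the output is not decoded with the locale default. The helper name `run_and_split_utf8` is hypothetical, and a small Python child process stands in for ripgrep so the example runs anywhere; this is not the actual code in `refresh.py`.

```python
import subprocess
import sys


def run_and_split_utf8(cmd: list[str]) -> list[str]:
    """Run cmd and return its stdout lines, decoded as UTF-8 regardless of locale."""
    result = subprocess.run(
        cmd,
        capture_output=True,
        encoding="utf-8",   # explicit codec instead of the locale default (cp1252 in the log)
        errors="replace",   # undecodable bytes become U+FFFD instead of raising
        check=False,
    )
    return result.stdout.splitlines()


# Stand-in for ripgrep: a child that writes the UTF-8 bytes of "こん.png" to stdout.
child = [
    sys.executable,
    "-c",
    r"import sys; sys.stdout.buffer.write(b'\xe3\x81\x93\xe3\x82\x93.png\n')",
]
lines = run_and_split_utf8(child)
print(lines)
```

With `errors="replace"` a stray undecodable byte degrades to a replacement character rather than killing the reader thread, which appears to be what left `result.stdout` as `None` in the log above.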

Sola-ris avatar Nov 10 '25 22:11 Sola-ris

Using Linux 6.17.7-arch1-1 and ripgrep, I'm unable to replicate the issue. The Japanese characters, Unicode apostrophe, and em dash display as expected for me (although my font doesn't support Japanese characters, so they just display as missing characters).

TrigamDev avatar Nov 10 '25 23:11 TrigamDev