On MS Windows, Tartube cannot detect videos with Japanese/Korean/unicode characters
This issue replaces https://github.com/axcore/tartube/issues/153, https://github.com/axcore/tartube/issues/251, https://github.com/axcore/tartube/issues/304, https://github.com/axcore/tartube/issues/290, https://github.com/axcore/tartube/issues/318 and others.
The problem
On MS Windows, Tartube can download a video whose name contains Japanese characters, Korean characters or emoji unicode characters. But Tartube does not recognise the downloaded video, because youtube-dl's output is garbled.
An example of a video containing Korean characters
A screenshot of the garbled output:
On Linux/*BSD, everything works perfectly. It is an MS-Windows only issue.
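One way to picture the garbling described above is as an encoding mismatch: text written as bytes in one encoding and read back in another. The sketch below only illustrates that general effect. The function name `garble` and the choice of UTF-8 and cp1252 are assumptions for the example, not something the thread establishes about youtube-dl's output.

```python
def garble(text, written_as="utf-8", read_as="cp1252"):
    """Encode text with one encoding, then decode the bytes with another.

    Hypothetical helper: it only demonstrates how a mismatch between the
    writer's and the reader's encoding corrupts non-ASCII characters.
    """
    raw = text.encode(written_as)
    # errors="replace" substitutes U+FFFD for bytes the reader cannot map.
    return raw.decode(read_as, errors="replace")


title = "한글"
garbled = garble(title)

# Plain ASCII survives the mismatch, which is why ASCII-only names are safe.
ascii_title = garble("video")
```

With these assumed encodings, `garbled` no longer equals `title`, while `ascii_title` is unchanged.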
Workaround
Click Edit > General download options... > Files > Filesystem, and then select Restrict filenames to ASCII characters. The videos should be downloaded and added to Tartube's database, but with garbled names (which is better than nothing.)
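For readers who run the downloader by hand, youtube-dl has a `--restrict-filenames` option that limits filenames to ASCII characters. The sketch below assumes the Tartube setting corresponds to that option, which the thread does not state. The function name `build_download_cmd` and the example URL are hypothetical.

```python
def build_download_cmd(url, restrict_filenames=True, downloader="youtube-dl"):
    """Build an argument list for the downloader.

    Hypothetical helper, not Tartube's code. It assumes that the GUI option
    "Restrict filenames to ASCII characters" maps to --restrict-filenames.
    """
    cmd_list = [downloader]
    if restrict_filenames:
        cmd_list.append("--restrict-filenames")
    cmd_list.append(url)
    return cmd_list


# Placeholder URL, used only to show the shape of the command.
cmd_list = build_download_cmd("https://example.com/watch?v=placeholder")
```

The resulting list can be passed to `subprocess.Popen` in the same way as the `cmd_list` that appears later in this issue.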
Attempted solution
The problem is described concisely here. Unfortunately, the solutions suggested there don't work with Tartube installations, probably because they are running under MSYS2, rather than on Windows directly.
In tartube/downloads.py, at about line 4851 (Tartube v2.3.306), the solution should be this:
try:
    self.child_process = subprocess.Popen(
        cmd_list,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        preexec_fn=preexec,
        startupinfo=info,
        encoding="mbcs",  # the added line
    )
...with the addition being the encoding="mbcs" line, but that doesn't work, and neither do values like utf-8, utf-16 or cp1252.
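A different way to investigate, offered only as a debugging sketch and not as a fix the thread confirms, is to leave `encoding` unset so that `Popen` returns raw bytes, and then decode them explicitly. That makes it possible to inspect the bytes the child process actually wrote. The function name `run_and_decode` is hypothetical, and a child Python process stands in for the downloader.

```python
import subprocess
import sys


def run_and_decode(cmd_list, encoding="utf-8"):
    """Run a command, capture stdout as bytes, and decode it explicitly.

    Hypothetical helper for debugging, not part of Tartube.
    Returns a (raw_bytes, decoded_text) tuple.
    """
    proc = subprocess.Popen(
        cmd_list,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    raw_stdout, _raw_stderr = proc.communicate()
    # errors="replace" keeps going when bytes do not fit the encoding.
    return raw_stdout, raw_stdout.decode(encoding, errors="replace")


# Stand-in for the downloader: a child process that writes the UTF-8 bytes
# of two Korean characters (given as escapes) straight to stdout.
child_code = "import sys; sys.stdout.buffer.write('\\ud55c\\uae00'.encode('utf-8'))"
raw, text = run_and_decode([sys.executable, "-c", child_code])
```

Comparing `raw` against the expected bytes shows whether the damage happens in the child process or during decoding in the parent.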
I don't have any thoughts on it being done automatically, but I tried to manually "set the file" under video properties (using my example post yesterday). The file picker shows the file with the emoji but when I clicked open it errored with "The replacement video/audio file must be in the same channel, playlist or folder."
Would it be possible to get that code working since that doesn't deal with the console/yt-dl or is it the same issue of MSYS2?
Would it be possible to have the contents of the info.json for the currently downloading file be mirrored to something with a constant filename like "thread1.json" that could allow the proper file info to be read in and stored in the database?
On a side note, when I did a refresh on the channel to fix the issue in #321 all of the files were brought in with their emojis!
I then went to the folder with the test video for my #318 emoji post a few days ago, refreshed it and it brought in a new entry properly linked!
How does the refresh option work?
Could it be possible when doing a refresh, before "adding" a video as a new entry to check and see if there's already one in there with the same URL and just update that video's properties, rather than adding a "new" video?
Otherwise, could there be a view option that shows all videos with no size, so we can manually delete them before or after doing a refresh?
Also it looks like I still can't manually replace the video file in the properties dialogue - it says "the replacement video/audio file must be in the same channel, playlist or folder" when it already is. I tried to replace it with a test.mkv file, so it looks like this happens irrespective of any unicode characters in the filename.
> Would it be possible to get that code working since that doesn't deal with the console/yt-dl or is it the same issue of MSYS2?
It is fixed in v2.3.321. It was a completely unrelated issue.
> Would it be possible to have the contents of the info.json for the currently downloading file be mirrored
I thought about that, but it would not be reliable, and would not resolve the underlying issue. And it would not work at all when checking videos, because Tartube receives the .info.json data via STDOUT, so that too is garbled.
> How does the refresh option work?
Here's the source code, knock yourself out :)
> Could it be possible when doing a refresh, before "adding" a video as a new entry to check and see if there's already one in there with the same URL and just update that video's properties, rather than adding a "new" video?
That's what happens: the code gets a list of video files in the folder, and tries to match them against Tartube's database. But that won't work if the database is garbled; in that case the video files are added to the database as separate videos.
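The matching step described above can be sketched roughly as follows. This is not Tartube's actual refresh code. The function name `match_files_to_database` is hypothetical, and matching on the filename without its extension is an assumption made for the example.

```python
import os


def match_files_to_database(file_names, db_names):
    """Split files into those that match a database entry and those that don't.

    Hypothetical sketch: a file matches when its name, minus the extension,
    equals a name already stored in the database.
    """
    known = set(db_names)
    matched, unmatched = [], []
    for file_name in file_names:
        stem, _ext = os.path.splitext(file_name)
        if stem in known:
            matched.append(file_name)
        else:
            unmatched.append(file_name)
    return matched, unmatched


# The database holds one intact name and one garbled name (U+FFFD stands in
# for characters lost during the download).
db_names = ["plain video", "\ufffd\ufffd"]
file_names = ["plain video.mp4", "한글.mp4"]
matched, unmatched = match_files_to_database(file_names, db_names)
```

Here the file with the Korean name ends up in `unmatched`, because the garbled database entry no longer equals it. In a refresh, that is the situation in which the file would be added as a separate video.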
I know that this issue was reported a year ago, but for now it seems that downloading videos with unicode titles works without any issue in my environment.
My OS: Windows 10 1809 (Korean ver)
My Version: 08 Aug 2021
I spent a few hours on this today but I still don't have a solution.
Workaround: click Edit > General download options... > Files > Filesystem, and then select Restrict filenames to ASCII characters. The videos should be downloaded and added to Tartube's database, but with garbled names (which is better than nothing.)
Fixed in v2.5.0.