subclean
[Feature Request] Support for other character encodings
Right now the tool fails when it tries to parse files in encodings other than plain UTF-8, such as the test files listed below. For a viable solution, the tool should detect the character encoding and convert the file to UTF-8 when required.
The converted data should be written even if no nodes were modified. This removes the need to convert the same file again on every run when subclean is run against an entire library as a scheduled task.
See this comment for a temporary workaround for the current problem: https://github.com/DrKain/subclean/issues/7#issuecomment-948572760
Unfortunately this will require a dependency like utf8.
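For files that carry a byte order mark, like the two test files below, detection can be sketched without any dependency. This is only a rough illustration, not subclean's actual code: the names `detectBom`, `decodeUtf16` and `toUtf8Text` are hypothetical, and it assumes that a file without a BOM is already UTF-8. Files in legacy encodings with no BOM would still need a proper detection library.

```typescript
// Hypothetical helpers for illustration; none of these names exist in subclean.
type DetectedEncoding = "utf-8" | "utf-8-bom" | "utf-16le" | "utf-16be";

// Inspect the first bytes of the file for a byte order mark.
function detectBom(bytes: Uint8Array): DetectedEncoding {
    if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
        return "utf-8-bom";
    }
    if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return "utf-16be";
    if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return "utf-16le";
    // Assumption: no BOM means the file is already UTF-8.
    return "utf-8";
}

// Decode UTF-16 / UCS-2 code units by hand. UCS-2 is the BMP-only subset of UTF-16,
// so the same loop covers both. A trailing odd byte is ignored.
function decodeUtf16(bytes: Uint8Array, bigEndian: boolean): string {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    let out = "";
    for (let i = 0; i + 1 < bytes.length; i += 2) {
        out += String.fromCharCode(view.getUint16(i, !bigEndian));
    }
    return out;
}

// Return the decoded text plus a flag saying whether the file needs rewriting.
function toUtf8Text(bytes: Uint8Array): { text: string; converted: boolean } {
    switch (detectBom(bytes)) {
        case "utf-8-bom":
            return { text: new TextDecoder("utf-8").decode(bytes.subarray(3)), converted: true };
        case "utf-16be":
            return { text: decodeUtf16(bytes.subarray(2), true), converted: true };
        case "utf-16le":
            return { text: decodeUtf16(bytes.subarray(2), false), converted: true };
        default:
            return { text: new TextDecoder("utf-8").decode(bytes), converted: false };
    }
}
```

With something like this, the caller could write the text back as UTF-8 whenever `converted` is true, even if no nodes were removed, so the next scheduled run sees a plain UTF-8 file.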
Test files:
- UCS-2 BE BOM: subtitle.zip
- UTF-8-BOM: subtitle.zip
If you're using Bazarr, you can avoid this issue with the setting:
Settings → Subtitles → Post-Processing → Encode Subtitles To UTF8