
Bazarr embedded sub extractor removing words

Open anonyme123 opened this issue 2 years ago • 6 comments

Describe the bug

Hello, it seems that Bazarr's embedded subs extractor is removing some words. We can see that it extracted the embedded English subs: (screenshot)

But it removed the word "SPAN" from the sub file: (screenshot)

I re-downloaded the original file outside of Bazarr and the word was not removed: (screenshot: 001009-IINA-Curb Your Enthusiasm S11E08 What Have I Done 1080p HMAX WEB-DL DD5 1 x264-NTb mkv)

Here are my Post-Processing settings; I don't know whether they are also used during embedded subs extraction: (screenshot)

To Reproduce

Steps to reproduce the behavior:

  1. Configure Bazarr to extract embedded subs
  2. Download a video that has embedded subs with the word 'SPAN' in them
  3. Let Bazarr extract the embedded subs
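
To confirm the result of these steps without comparing files by eye, a small script can count how often the word appears in the untouched subtitle versus the one Bazarr produced. This is only a sketch: the helper names `count_word` and `word_was_dropped` are made up for illustration, and the sample cue is invented, not taken from the file in this report.

```python
import re


def count_word(text: str, word: str) -> int:
    """Count whole-word, case-sensitive occurrences of `word` in `text`."""
    return len(re.findall(rf"\b{re.escape(word)}\b", text))


def word_was_dropped(original: str, processed: str, word: str) -> bool:
    """Return True if `word` appears fewer times after processing."""
    return count_word(processed, word) < count_word(original, word)


# Invented sample cue (an assumption, not from the reporter's subtitle).
original = "1\n00:00:01,000 --> 00:00:02,000\nI saw it on C-SPAN last night.\n"
processed = "1\n00:00:01,000 --> 00:00:02,000\nI saw it on C- last night.\n"

if word_was_dropped(original, processed, "SPAN"):
    print("'SPAN' is missing from the processed subtitle")
```

In practice the two strings would be read from the original and the Bazarr-extracted `.srt` files.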

Software (please complete the following information):

  • Bazarr: 1.1.0
  • Radarr version: 4.1.0.6175
  • Sonarr version: 3.0.8.1507
  • OS: Linux-3.10.0-1160.11.1.el7.x86_64-x86_64-with

anonyme123 avatar Jul 28 '22 21:07 anonyme123

Embedded Provider won't touch any subtitles. This issue is related to post-processing logic.

vitiko98 avatar Jul 28 '22 22:07 vitiko98

Ok so the embedded provider just extracts the subs as-is, and then the post-processing logic runs:

  • encode to UTF-8
  • remove hearing impaired
  • OCR fixes
  • common fixes
  • fix uppercase

so one of those is removing the 'SPAN' word?

anonyme123 avatar Jul 28 '22 22:07 anonyme123

Yes, I assume the hearing impaired processor is making the change.
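
For illustration only, here is how an overly broad hearing-impaired rule could produce this kind of damage. The pattern below is a toy assumption, not Bazarr's actual post-processing code: it treats any standalone all-caps word of three or more letters as a sound cue and deletes it, which also strips a legitimate word like "SPAN".

```python
import re

# Toy rule (assumption for illustration, not Bazarr's real logic):
# any standalone all-caps word of 3+ letters is treated as a sound cue.
NAIVE_HI_PATTERN = re.compile(r"\b[A-Z]{3,}\b")


def naive_remove_hi(line: str) -> str:
    """Delete all-caps words, then collapse leftover runs of whitespace."""
    cleaned = NAIVE_HI_PATTERN.sub("", line)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


# A real sound cue is removed as intended...
print(repr(naive_remove_hi("DOOR SLAMS")))
# ...but so is part of a legitimate word in dialogue.
print(repr(naive_remove_hi("I saw it on C-SPAN last night.")))
```

A rule this broad cannot tell sound descriptions from acronyms or names in dialogue, so a real fix would need a narrower condition.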

vitiko98 avatar Jul 28 '22 22:07 vitiko98

Will you be able to fix it?

anonyme123 avatar Jul 28 '22 22:07 anonyme123

I'll try.

Please send the untouched subtitle file if you can.

vitiko98 avatar Jul 28 '22 22:07 vitiko98

Here you go: https://demo.lufi.io/r/AS3jQ9Mc6z#V2fXl71NM5mEZTQYK/AJTRsBYIFIVHC/VBy5pcJaIrQ=

anonyme123 avatar Jul 28 '22 23:07 anonyme123

Should be fixed in upcoming beta. Thanks!

vitiko98 avatar Oct 12 '22 23:10 vitiko98