
RegieLive provider is regularly returning BadZipFile

Open jungla0 opened this issue 1 year ago • 8 comments

Describe the bug RegieLive sometimes returns BadZipFile, even though a manual search works and downloads a proper zip file.

To Reproduce Steps to reproduce the behavior:

  1. Add RegieLive as a provider
  2. Add movie/series from Sonarr/Radarr
  3. Manually search for a subtitle.
  4. Click on the provider name link and see the zip being downloaded
  5. Click on the download button in manual search window and notice that the subtitle is not downloaded and the provider returns BadZipFile

Expected behavior BadZipFile should not appear for actual zip files

Screenshots: (screenshot: badzipfile)

Software (please complete the following information):

  • Bazarr: 1.2.1
  • Radarr version: 4.6.0.7439
  • Sonarr version: 4.0.0.535
  • OS: DSM 7.2

jungla0 avatar Jun 06 '23 14:06 jungla0

Is that an issue only with RegieLive?

morpheus65535 avatar Jun 06 '23 14:06 morpheus65535

Based on the providers I've used so far, it seems like it, yes. I don't have deep knowledge, but looking over the log, it seems to me that it cached that error some time ago and now displays it every time. Log:

Unexpected error in provider 'regielive', Traceback:

    Traceback (most recent call last):
      File "/volume2/@appstore/bazarr/share/bazarr/bazarr/../libs/subliminal_patch/core.py", line 398, in download_subtitle
        self[subtitle.provider_name].download_subtitle(subtitle)
      File "/volume2/@appstore/bazarr/share/bazarr/bazarr/../libs/subliminal_patch/providers/regielive.py", line 129, in download_subtitle
        archive = zipfile.ZipFile(io.BytesIO(_zipped.content))
      File "/var/packages/python310/target/lib/python3.10/zipfile.py", line 1269, in __init__
        self._RealGetContents()
      File "/var/packages/python310/target/lib/python3.10/zipfile.py", line 1336, in _RealGetContents
        raise BadZipFile("File is not a zip file")
    zipfile.BadZipFile: File is not a zip file
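The traceback shows the provider passing the raw response body straight into `zipfile.ZipFile`, so any non-zip payload (such as an HTML error page) surfaces as a bare `BadZipFile`. A minimal sketch of failing earlier with more context, assuming `content` is the raw response body (the function and variable names here are illustrative, not the provider's actual API):

```python
import io
import zipfile

def open_subtitle_archive(content: bytes) -> zipfile.ZipFile:
    """Open downloaded bytes as a zip archive, failing early with context.

    `content` is assumed to be the raw HTTP response body (like
    `_zipped.content` in regielive.py); this is a sketch, not the
    provider's real code.
    """
    buf = io.BytesIO(content)
    if not zipfile.is_zipfile(buf):
        # The server likely returned an HTML error/captcha page instead
        # of a zip; surface the first bytes to aid debugging.
        snippet = content[:64]
        raise ValueError(f"Response is not a zip file, starts with: {snippet!r}")
    buf.seek(0)  # is_zipfile() moved the file position; rewind before parsing
    return zipfile.ZipFile(buf)
```

This would turn the opaque "File is not a zip file" into a message that shows what the server actually sent back.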


Also, one thing that I didn't mention is that RegieLive used to work, and was one of the best providers for me, but for the past couple of days I've been constantly receiving that error. I even removed it for 1-2 days in case it had some timeouts or anything, but once I add it back, it returns the above error on the first search.

jungla0 avatar Jun 06 '23 14:06 jungla0

It seems that regielive implemented reCaptcha on their download pages.


@alexandrucatalinene are you in the mood to look into this one? You've done a great job when we had issues with this provider before.

Thanks!

morpheus65535 avatar Jun 08 '23 00:06 morpheus65535

Ok, I'll try and tackle this one.

alexandrucatalinene avatar Jun 08 '23 07:06 alexandrucatalinene

Just an update: I didn't have time to properly look into the issue, but I did try to reproduce it (both in Bazarr and in the browser) and I couldn't.

Everything downloaded and unzipped fine, all zip files were valid, and I never got a reCaptcha challenge (even though I changed UAs, reset all cookies, tried private browsing, etc.).

So, my guess right now is that it's something that happens only for certain devices (IPs maybe) that triggered some sort of rule on RL.

alexandrucatalinene avatar Jun 12 '23 07:06 alexandrucatalinene

@alexandrucatalinene What I can tell is that I first used Bazarr to search for subtitles in batch, then got throttled. When accessing the URL that gets throttled (the zip file download), I get redirected to an HTML page to solve the reCaptcha.
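If the throttled download URL really redirects to an HTML reCaptcha page, the provider could detect that before attempting to unzip. A rough heuristic sketch, assuming the challenge page is served as `text/html` and mentions "recaptcha" somewhere in its body (the exact markers are guesses, not confirmed behavior of RegieLive):

```python
def looks_like_captcha_page(content_type: str, body: bytes) -> bool:
    """Heuristic: was an HTML reCaptcha challenge served instead of a zip?

    Assumption: when throttled, the download URL redirects to an HTML
    page containing a reCaptcha widget. Both the content-type check and
    the body markers are illustrative guesses.
    """
    if "text/html" not in content_type.lower():
        return False
    # Only inspect the start of the body; the widget markup appears early.
    lowered = body[:4096].lower()
    return b"recaptcha" in lowered or b"captcha" in lowered
```

A zip response (content type `application/zip`, body starting with the `PK` magic bytes) would pass through untouched.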

morpheus65535 avatar Jun 12 '23 11:06 morpheus65535

There is a limit of subtitles that can be downloaded in a certain period of time. Problems can occur with series with many episodes or if someone downloads many subtitles. Normally, that limit is not reached.

IonutNeagu avatar Jul 24 '23 11:07 IonutNeagu

> There is a limit of subtitles that can be downloaded in a certain period of time. Problems can occur with series with many episodes or if someone downloads many subtitles. Normally, that limit is not reached.

So, my take on this is that we should intercept this error and raise a Throttled exception. Any clue on a definite way of detecting this limit (HTTP return code, page data, etc.)?

I want to avoid trying to add a bypass just for some edge cases.
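The interception proposed above could look roughly like the sketch below: map an HTML body (or any non-zip payload) to a throttle-style exception instead of letting `BadZipFile` bubble up. `ProviderThrottled` is a stand-in for whatever exception Bazarr's provider framework actually expects for rate limiting; all names here are illustrative:

```python
import io
import zipfile

class ProviderThrottled(Exception):
    """Stand-in for the framework's real throttling exception."""

def parse_download(content: bytes, content_type: str) -> zipfile.ZipFile:
    """Convert a download response into a zip archive, treating the
    captcha/limit page as a throttle signal rather than a hard error.

    Sketch only: the content-type heuristic and exception type are
    assumptions, not RegieLive's confirmed behavior.
    """
    if "text/html" in content_type.lower():
        # An HTML body where a zip was expected almost certainly means
        # the rate limit kicked in and a captcha page was served.
        raise ProviderThrottled("HTML page served instead of zip; likely rate-limited")
    try:
        return zipfile.ZipFile(io.BytesIO(content))
    except zipfile.BadZipFile as exc:
        # Fall back: treat any non-zip payload as throttling, per the
        # discussion above, instead of surfacing a raw BadZipFile.
        raise ProviderThrottled("Non-zip payload returned by provider") from exc
```

With this shape, the provider's anti-captcha behavior never needs to be bypassed; the download is simply retried later once the throttle window passes.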

alexandrucatalinene avatar Jul 24 '23 11:07 alexandrucatalinene