Server Timeout Errors Connecting to Amazon Behind VPN

Open lazerfloyd opened this issue 4 years ago • 11 comments

Hi,

When using sacad from behind a VPN, I get the following errors whenever it tries to connect to Amazon (I included a sample search, but the results are the same no matter what artist/album combination is used):

sacad "King Crimson" "In the Court of the Crimson King" 800 file.png

Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/91RdP5S1NML.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/91RdP5S1NML.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/5128npoNfFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/5128npoNfFL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/5128npoNfFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/5128npoNfFL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/717CGWwNQQL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/717CGWwNQQL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/515chlrscwL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/515chlrscwL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/51rcjhgFc0L.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/51rcjhgFc0L.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/A1HfWhpjHgL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/A1HfWhpjHgL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/61L8ZbVYqxL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/61L8ZbVYqxL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/61jYB9iKrtL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/61jYB9iKrtL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg)
AmazonCdCoverSource: Search with source 'AmazonCdCoverSource' failed: AttributeError type object 'ServerDisconnectedError' has no attribute '_Http__qualname_'
Traceback (most recent call last):
  File "/usr/local/bin/sacad", line 11, in <module>
    load_entry_point('sacad==2.5.0', 'console_scripts', 'sacad')()
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/__init__.py", line 221, in cl_main
  File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/__init__.py", line 65, in search_and_download
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/sources/base.py", line 122, in search
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/cover.py", line 325, in updateImageMetadata
TypeError: not all arguments converted during string formatting

I can open each individual picture URL in a browser window without issue, so it doesn't seem like Amazon is blocking VPN users on its end. The only way I found to avoid this issue is to completely disable my VPN connection (which is not a practical option in my case, unfortunately.)

OS Version(s): Linux Mint 20/Ubuntu 20.04 LTS
Python3 Version: python 3.8.10

Feb 13 '22 19:02 lazerfloyd

There are 2 different issues here:

  • The TypeError, which I have fixed in https://github.com/desbma/sacad/commit/945dcc0060c9916f25a2dc8a0f96da0761842d8c
  • The root cause of the error, ServerTimeoutError, which seems related to your VPN. I don't know why the requests would work in your browser and not in sacad, though.
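
For readers unfamiliar with the message at the end of the traceback: "not all arguments converted during string formatting" is what Python raises when the `%` operator receives more values than the format string has placeholders. The sketch below reproduces that class of bug in isolation. It is not sacad's actual code; `describe_buggy` and `describe_fixed` are hypothetical names used only for illustration.

```python
def describe_buggy(url, exc):
    # Hypothetical example: two values but only one %s placeholder.
    # Python raises TypeError here at runtime.
    return "request to %s failed" % (url, exc)


def describe_fixed(url, exc):
    # One placeholder per value, including the exception class name.
    return "request to %s failed (%s %s)" % (url, type(exc).__qualname__, exc)


if __name__ == "__main__":
    err = TimeoutError("slow")
    try:
        describe_buggy("https://example.invalid/a.jpg", err)
    except TypeError as e:
        print("buggy version raised:", e)
    print(describe_fixed("https://example.invalid/a.jpg", err))
```

A bug like this typically sits in an error-reporting path, so it only shows up once the underlying request has already failed, which is why it appeared alongside the timeouts here.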

Feb 16 '22 21:02 desbma

Are there any diagnostics I could run on my end to help identify the second issue? Like you said, the fact the URLs remain accessible in the browser despite timing out in sacad is kind of strange...

Feb 18 '22 14:02 lazerfloyd

In your browser, open the developer tools, then load the image URL, go to the network tab, right-click the request, and click "Copy as cURL". Paste the curl command into a terminal (add -v / -o /dev/null if needed); it should work as it does in your browser. Then remove the header parameters (-H ...) from the curl command line one by one until the command stops working. The last header you removed is probably the one that sacad and your browser set differently, which would explain the difference in behavior.
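
The same one-by-one header elimination can be scripted. The following is a minimal sketch using only the Python standard library, not anything taken from sacad; `find_required_headers` and `head_ok` are hypothetical helper names, and the header values you pass in would come from your own "Copy as cURL" output.

```python
import urllib.request
from typing import Callable, Dict, List


def head_ok(url: str, headers: Dict[str, str], timeout: float = 10.0) -> bool:
    """Return True if a HEAD request with these headers gets a 200 in time."""
    req = urllib.request.Request(url, headers=headers, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False


def find_required_headers(
    headers: Dict[str, str],
    works: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Drop headers one at a time; report those whose removal breaks the request."""
    if not works(headers):
        raise ValueError("the request must succeed with the full header set")
    current = dict(headers)
    required = []
    for name in list(headers):
        trial = {k: v for k, v in current.items() if k != name}
        if works(trial):
            current = trial        # still works without it: leave it out
        else:
            required.append(name)  # removing it broke the request: keep it
    return required
```

A call would look like `find_required_headers(browser_headers, lambda h: head_ok(url, h))`. An empty result means the request succeeds with no extra headers at all, in which case headers are unlikely to explain the difference.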

Feb 18 '22 16:02 desbma

Thanks! I'll try that tonight and report back.

Feb 18 '22 18:02 lazerfloyd

So, I tried like you said and removed each header from the curl command, until the command became: curl 'https://m.media-amazon.com/images/I/51jEBnTcfQL.jpg' --compressed -v --output test-file.jpg

And I still had no issues accessing the image file from behind the VPN. Weird.

Feb 20 '22 21:02 lazerfloyd

OK, I have another hypothesis then. How slow is your VPN? The part of your log that leads to ServerTimeoutError comes from requests made with a 3s timeout; maybe that's too low.
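
One way to check the timeout hypothesis is to time a single HEAD request yourself and compare the result with the 3s budget. The sketch below does that with the Python standard library only; it does not use sacad's own HTTP code. `timed_head` is a hypothetical helper, and `start_slow_server` is a local stand-in server included purely so the effect of a too-short timeout can be reproduced without a VPN.

```python
import http.client
import http.server
import threading
import time
from urllib.parse import urlsplit


def timed_head(url, timeout):
    """Send one HEAD request; return (succeeded, elapsed_seconds)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    start = time.monotonic()
    try:
        conn.request("HEAD", parts.path or "/")
        ok = conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        ok = False  # includes connect and read timeouts
    finally:
        conn.close()
    return ok, time.monotonic() - start


class SlowHandler(http.server.BaseHTTPRequestHandler):
    delay = 0.5  # seconds to wait before answering

    def do_HEAD(self):
        time.sleep(self.delay)
        try:
            self.send_response(200)
            self.end_headers()
        except OSError:
            pass  # the client already gave up

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_slow_server(delay):
    """Start a local server that answers HEAD requests after `delay` seconds."""
    handler = type("Handler", (SlowHandler,), {"delay": delay})
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d/cover.jpg" % server.server_address[1]
```

Against the real host you would call `timed_head(image_url, timeout=3.0)` with the VPN on and off. If the elapsed time with the VPN on is close to or above 3 seconds, a longer timeout should help; if it is well under, the cause is probably elsewhere.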

Feb 20 '22 21:02 desbma

I mean, I never tested, but it's responsive enough I don't really notice it's on. Is there any way to make the timeout window slightly longer and see if that does it?

Feb 26 '22 12:02 lazerfloyd

I have increased default timeouts: https://github.com/desbma/sacad/commit/2a8db8db05792d0ba3acc2043ab813f1b4ef9478, please test.

Feb 26 '22 13:02 desbma

So, I tested the new commit -- still getting timeouts, still only for Amazon; all other sources seem to work fine. And I can open each timed out image file manually in Chromium/Firefox without issue.

Since I'm getting results from other sources, I'm inclined to just disable Amazon lookups at the command line, but this would be nice to fix. Let me know if I can do anything else on my end.

Mar 06 '22 20:03 lazerfloyd

@lazerfloyd can you post the output of curl -v --http1.1 -I 'https://m.media-amazon.com/images/I/51jEBnTcfQL.jpg' -o /dev/null when your VPN is enabled?

Apr 10 '22 19:04 desbma

Sure thing! Here you go:

curl -v --http1.1 -I 'https://m.media-amazon.com/images/I/51jEBnTcfQL.jpg' -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 13.33.83.177:443...
* TCP_NODELAY set
* Connected to m.media-amazon.com (13.33.83.177) port 443 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [25 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [4196 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=Images-na.ssl-images-amazon.com
*  start date: Feb  1 00:00:00 2022 GMT
*  expire date: Jan  2 23:59:59 2023 GMT
*  subjectAltName: host "m.media-amazon.com" matched cert's "m.media-amazon.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert Global CA G2
*  SSL certificate verify ok.
} [5 bytes data]
> HEAD /images/I/51jEBnTcfQL.jpg HTTP/1.1
> Host: m.media-amazon.com
> User-Agent: curl/7.68.0
> Accept: */*
>
{ [5 bytes data]
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: image/jpeg
< Content-Length: 57555
< Connection: keep-alive
< Server: Server
< Date: Sun, 17 Apr 2022 22:46:55 GMT
< X-Amz-IR-Id: 4b7d559a-22fe-4669-992d-7445e7d0e1ab
< Expires: Sat, 12 Apr 2042 22:46:55 GMT
< Cache-Control: max-age=630720000,public
< Surrogate-key: x-cache-046 /images/I/51jEBnTcfQL
< Timing-Allow-Origin: https://www.amazon.in, https://www.amazon.com
< Edge-Cache-Tag: x-cache-046,/images/I/51jEBnTcfQL
< Access-Control-Allow-Origin: *
< Last-Modified: Wed, 26 Nov 2014 23:42:36 GMT
< X-Nginx-Cache-Status: MISS
< Accept-Ranges: bytes
< X-Cache: Miss from cloudfront
< Via: 1.1 542aa1c3fd7431ac31b596fde254f388.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: EWR52-C1
< X-Amz-Cf-Id: kBKEEEuVI56GKb-SmJiith5lBA01r-FABiinu7n8gCCtofHfwwod6Q==
<
  0 57555    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host m.media-amazon.com left intact

Apr 17 '22 22:04 lazerfloyd

This should be obsolete, as version 2.8.0 no longer includes Amazon sources.

Jul 28 '24 20:07 desbma