Server Timeout Errors Connecting to Amazon Behind VPN
Hi,
When using sacad from behind a VPN, I get the following errors whenever it tries to connect to Amazon (I included a sample search, but the results are the same no matter what artist/album combination is used):
sacad "King Crimson" "In the Court of the Crimson King" 800 file.png
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/91RdP5S1NML.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/91RdP5S1NML.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/5128npoNfFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/5128npoNfFL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/5128npoNfFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/5128npoNfFL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/717CGWwNQQL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/717CGWwNQQL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/515chlrscwL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/515chlrscwL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/51rcjhgFc0L.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/51rcjhgFc0L.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/A1HfWhpjHgL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/A1HfWhpjHgL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/61L8ZbVYqxL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/61L8ZbVYqxL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/61jYB9iKrtL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/61jYB9iKrtL.jpg)
Cover: Failed to get file metadata for URL 'https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg' (ServerTimeoutError Connection timeout to host https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg)
AmazonCdCoverSource: Search with source 'AmazonCdCoverSource' failed: AttributeError type object 'ServerDisconnectedError' has no attribute '_Http__qualname_'
Traceback (most recent call last):
  File "/usr/local/bin/sacad", line 11, in <module>
    load_entry_point('sacad==2.5.0', 'console_scripts', 'sacad')()
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/__init__.py", line 221, in cl_main
  File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/__init__.py", line 65, in search_and_download
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/sources/base.py", line 122, in search
  File "/usr/local/lib/python3.8/dist-packages/sacad-2.5.0-py3.8.egg/sacad/cover.py", line 325, in updateImageMetadata
TypeError: not all arguments converted during string formatting
I can open each individual picture URL in a browser window without issue, so it doesn't seem like Amazon is blocking VPN users on its end. The only way I found to avoid this issue is to completely disable my VPN connection (which is not a practical option in my case, unfortunately.)
OS Version(s): Linux Mint 20/Ubuntu 20.04 LTS
Python3 Version: python 3.8.10
There are 2 different issues here:
- The TypeError, which I have fixed in https://github.com/desbma/sacad/commit/945dcc0060c9916f25a2dc8a0f96da0761842d8c
- The root cause of the error, ServerTimeoutError, which seems related to your VPN. I don't know why that would work in your browser and not in sacad, though.
Are there any diagnostics I could run on my end to help identify the second issue? Like you said, the fact the URLs remain accessible in the browser despite timing out in sacad is kind of strange...
In your browser, open the developer tools, and then open the URL of the image, get to the network tab, right click the request, click "Copy as curl".
Then paste the curl command into a terminal (add -v and/or -o /dev/null if needed). It should work just as it does in your browser.
Then remove the header parameters (-H ...) from the curl command line one by one, until the command stops working. The last header you removed is probably the one that sacad and your browser set differently, which would explain the difference in behavior.
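The one-by-one elimination described above can also be scripted. The following is only a minimal sketch and is not part of sacad: `find_required_headers` and the `works` callback are hypothetical names, and `works` is assumed to take a header dict and return True when the request succeeds with exactly those headers (for example by issuing the request and checking for a 200 response).

```python
from typing import Callable, Dict, List


def find_required_headers(
    headers: Dict[str, str],
    works: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Return the names of headers whose removal makes the request fail.

    `works` is a hypothetical callback supplied by the caller: it receives
    a header dict and returns True if the request succeeds with it.
    """
    if not works(headers):
        raise ValueError("request fails even with the full header set")

    remaining = dict(headers)
    required = []
    for name in list(headers):
        # Try the request without this one header.
        trial = {k: v for k, v in remaining.items() if k != name}
        if works(trial):
            # Still succeeds: the header is not needed, drop it for good.
            remaining = trial
        else:
            # Removing it broke the request: keep it and record it.
            required.append(name)
    return required
```

An empty result means the request succeeds with no extra headers at all, in which case headers are unlikely to explain the difference between the browser and sacad.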
Thanks! I'll try that tonight and report back.
So, I tried like you said and removed each header from the curl command, until the command became:
curl 'https://m.media-amazon.com/images/I/51jEBnTcfQL.jpg' --compressed -v --output test-file.jpg
And I still had no issues accessing the image file from behind the VPN. Weird.
OK, I have another hypothesis then. How slow is your VPN? The part of your log that leads to ServerTimeoutError comes from requests made with a 3s timeout; maybe that's too low.
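One way to check this hypothesis is to time a few HEAD requests through the VPN and see how many would exceed a 3 second budget. The sketch below uses only the Python standard library rather than sacad's own HTTP code, so it approximates what sacad does without reproducing it. The names `head_request`, `time_requests` and `count_over_budget` are made up for illustration.

```python
import time
import urllib.request
from typing import Callable, List


def head_request(url: str, timeout: float) -> int:
    """Send a HEAD request and return the HTTP status code.

    Network errors and socket timeouts propagate as exceptions.
    """
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def time_requests(
    url: str,
    attempts: int = 5,
    timeout: float = 30.0,
    fetch: Callable[[str, float], int] = head_request,
) -> List[float]:
    """Return the elapsed time in seconds of each attempt.

    The timeout is deliberately generous so that slow responses are
    measured instead of being cut off.
    """
    durations = []
    for _ in range(attempts):
        start = time.monotonic()
        fetch(url, timeout)
        durations.append(time.monotonic() - start)
    return durations


def count_over_budget(durations: List[float], budget: float = 3.0) -> int:
    """Count attempts that took longer than the budget (3s by default)."""
    return sum(1 for d in durations if d > budget)
```

For example, calling `time_requests` on one of the failing URLs from the log, such as 'https://m.media-amazon.com/images/I/91BIv6BCjFL.jpg', with the VPN enabled and passing the result to `count_over_budget` would show whether the requests come anywhere near the 3s limit.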
I mean, I never tested, but it's responsive enough I don't really notice it's on. Is there any way to make the timeout window slightly longer and see if that does it?
I have increased default timeouts: https://github.com/desbma/sacad/commit/2a8db8db05792d0ba3acc2043ab813f1b4ef9478, please test.
So, I tested the new commit -- still getting timeouts, still only for Amazon; all other sources seem to work fine. And I can open each timed out image file manually in Chromium/Firefox without issue.
Since I'm getting results from other sources, I'm inclined to just disable Amazon lookups at the command line, but this would be nice to fix. Let me know if I can do anything else on my end.
@lazerfloyd can you post the output of curl -v --http1.1 -I 'https://m.media-amazon.com/images/I/51jEBnTcfQL.jpg' -o /dev/null when your VPN is enabled?
Sure thing! Here you go:
curl -v --http1.1 -I 'https://m.media-amazon.com/images/I/51jEBnTcfQL.jpg' -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 13.33.83.177:443...
* TCP_NODELAY set
* Connected to m.media-amazon.com (13.33.83.177) port 443 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [25 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [4196 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: CN=Images-na.ssl-images-amazon.com
* start date: Feb 1 00:00:00 2022 GMT
* expire date: Jan 2 23:59:59 2023 GMT
* subjectAltName: host "m.media-amazon.com" matched cert's "m.media-amazon.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert Global CA G2
* SSL certificate verify ok.
} [5 bytes data]
> HEAD /images/I/51jEBnTcfQL.jpg HTTP/1.1
> Host: m.media-amazon.com
> User-Agent: curl/7.68.0
> Accept: */*
>
{ [5 bytes data]
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: image/jpeg
< Content-Length: 57555
< Connection: keep-alive
< Server: Server
< Date: Sun, 17 Apr 2022 22:46:55 GMT
< X-Amz-IR-Id: 4b7d559a-22fe-4669-992d-7445e7d0e1ab
< Expires: Sat, 12 Apr 2042 22:46:55 GMT
< Cache-Control: max-age=630720000,public
< Surrogate-key: x-cache-046 /images/I/51jEBnTcfQL
< Timing-Allow-Origin: https://www.amazon.in, https://www.amazon.com
< Edge-Cache-Tag: x-cache-046,/images/I/51jEBnTcfQL
< Access-Control-Allow-Origin: *
< Last-Modified: Wed, 26 Nov 2014 23:42:36 GMT
< X-Nginx-Cache-Status: MISS
< Accept-Ranges: bytes
< X-Cache: Miss from cloudfront
< Via: 1.1 542aa1c3fd7431ac31b596fde254f388.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: EWR52-C1
< X-Amz-Cf-Id: kBKEEEuVI56GKb-SmJiith5lBA01r-FABiinu7n8gCCtofHfwwod6Q==
<
0 57555 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host m.media-amazon.com left intact
This should be obsolete, as version 2.8.0 no longer includes Amazon sources.