Add HTTPS_PROXY to the environment for New Yorker puzzle tests
This PR allows for the definition of an XWORD_DL_HTTPS_PROXY GitHub secret which, if set, routes Python's network requests through the specified proxy.
It turns out The New Yorker API is hosted on an AWS network, and I think it's possible that whatever rules are causing the sporadic blocking of TNY requests from xword-dl will be circumvented when the originating IP address is in fact also in the AWS network.
I discovered that I could set up a very simple EC2 instance on the free tier, install "tinyproxy", and use it as an HTTPS proxy from the xword-dl automated tests. This causes the requests made by xword-dl running on GitHub's servers to appear to the New Yorker API as though they are coming from within AWS.
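To illustrate the "only if set" behavior, here is a minimal sketch of how a test could export HTTPS_PROXY from the secret for just the duration of the New Yorker tests. The helper name `scoped_https_proxy` is hypothetical and this is not necessarily how the PR wires it up; it only assumes the secret has been exposed to the test process as an environment variable named XWORD_DL_HTTPS_PROXY.

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_https_proxy(secret_name="XWORD_DL_HTTPS_PROXY"):
    """Temporarily export HTTPS_PROXY from the named variable, if it is set.

    Hypothetical helper: yields the proxy URL in use, or None when the
    secret is missing or empty (in which case nothing is changed).
    """
    proxy_url = os.environ.get(secret_name)
    previous = os.environ.get("HTTPS_PROXY")
    if proxy_url:
        os.environ["HTTPS_PROXY"] = proxy_url
    try:
        yield proxy_url or None
    finally:
        # Restore the prior state so tests for other outlets are unaffected.
        if previous is None:
            os.environ.pop("HTTPS_PROXY", None)
        else:
            os.environ["HTTPS_PROXY"] = previous
```

A New Yorker test would then wrap its download call in `with scoped_https_proxy():` and everything else would run without the proxy.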
Since the proxy is only set for the New Yorker tests, I don't think usage on the server would ever exceed the amount allowed by the free tier. So, if you choose to adopt this approach:
- Create an AWS account if needed.
- Add an EC2 t3.micro instance running Ubuntu (free tier)
- Configure the EC2 instance with a security group allowing ports 22 (ssh), 80 (http), and 8888 TCP (for the proxy)
- ssh in to the new instance, and
sudo apt-get install tinyproxy
- sudo edit the /etc/tinyproxy/tinyproxy.conf file and add:
Listen 0.0.0.0
Allow 0.0.0.0/0
BasicAuth [username] [password]
- sudo systemctl restart tinyproxy
Now you should be able to set an XWORD_DL_HTTPS_PROXY secret with a URL of the form:
http://[username]:[password]@[instance-ip]:8888
And the requests made by the pertinent tests will pass through the AWS proxy.
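One practical wrinkle with a URL of that form: a password containing characters such as "@" or ":" would make the URL ambiguous unless it is percent-encoded. The sketch below assembles the secret value with the standard library; `build_proxy_url` is a hypothetical helper, and 203.0.113.10 is a documentation placeholder address rather than a real instance.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port=8888):
    """Assemble http://user:pass@host:port with the credentials percent-encoded.

    Hypothetical helper; port 8888 matches the port opened for the proxy
    in the security group.
    """
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


if __name__ == "__main__":
    # Placeholder credentials and address, for illustration only.
    print(build_proxy_url("tester", "p@ss:word", "203.0.113.10"))
```

The printed value is what would be pasted into the XWORD_DL_HTTPS_PROXY secret.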
Actually the free tier on AWS is apparently only good for 12 months from account creation. Maybe not so compelling! But I think having the support for providing a proxy might still be useful.
It occurs to me it might be even better to support simply opening an ssh tunnel to a server, rather than needing to set up a purpose-made tinyproxy server. Maybe I will rethink this and propose a system by which an SSH_PROXY_HOST and SSH_PROXY_CERT could inform the test to open a tunnel and use that as the HTTPS_PROXY.
An AWS Lambda function might actually be able to get the job done, and AWS offers a truly free-forever tier for those.
Thanks for this and for documenting the thought process too. Given my personal experience I have been considering routing this traffic through Tailscale, and possibly through a machine on my residential network, but I am appreciating your approach here!
Thanks @thisisparker - I'm glad you appreciate the thinking through. Seems like something simple might ultimately do the job, and maybe whatever solution ends up getting used should be an option on the tool itself. I could see folks wanting to provide VPN and/or proxy info to the tool, especially if they're trying to download from a geolocation that is blocked. Of course, since the tool is Python, maybe they could work around it themselves by setting the pertinent env variables.
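For anyone wanting to try the env-variable workaround, here is a small sketch of checking which HTTPS proxy Python would pick up. It uses only the standard library's `urllib.request.getproxies()`; the `requests` library also honors HTTPS_PROXY by default, but whether a given tool's HTTP client does is an assumption to verify. The function name `effective_https_proxy` is hypothetical.

```python
import urllib.request


def effective_https_proxy():
    """Return the HTTPS proxy URL the standard library would use, or None.

    When proxy environment variables such as HTTPS_PROXY / https_proxy are
    set they are used; otherwise some platforms fall back to system
    proxy settings.
    """
    return urllib.request.getproxies().get("https")


if __name__ == "__main__":
    print("HTTPS proxy in effect:", effective_https_proxy())
```

Running this with HTTPS_PROXY exported in the shell should report that URL, which is a quick way to confirm the variable is visible to Python before pointing the tool at it.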