Add HTTPS_PROXY to the environment for New Yorker puzzle tests

Open danielpunkass opened this issue 2 months ago • 5 comments

This PR adds support for an XWORD_DL_HTTPS_PROXY GitHub secret which, if set, is used to route Python's network requests through the specified proxy.
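A minimal sketch of how the secret could be mapped onto the standard proxy variable, assuming the test harness copies its environment before launching the New Yorker tests. The helper name `with_proxy_override` is hypothetical and not part of xword-dl; only the XWORD_DL_HTTPS_PROXY and HTTPS_PROXY names come from this PR.

```python
import os


def with_proxy_override(base_env, secret_name="XWORD_DL_HTTPS_PROXY"):
    """Return a copy of base_env with HTTPS_PROXY set from the secret, if any.

    Hypothetical helper for illustration only; not part of xword-dl.
    """
    env = dict(base_env)
    proxy = env.get(secret_name, "").strip()
    if proxy:
        # Only override when the secret is actually set and non-empty,
        # so runs without the secret behave exactly as before.
        env["HTTPS_PROXY"] = proxy
    return env


if __name__ == "__main__":
    env = with_proxy_override(os.environ)
    print("HTTPS_PROXY set:", "HTTPS_PROXY" in env)
```

The returned mapping could be passed as the `env` argument when spawning only the New Yorker tests, which keeps the proxy out of every other test run.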

It turns out The New Yorker API is hosted on AWS, and I think it's possible that whatever rules are causing the sporadic blocking of TNY requests from xword-dl will not apply when the originating IP address is also inside the AWS network.

I discovered that I could set up a very simple EC2 instance on the free tier, install "tinyproxy", and use it as an HTTPS proxy from the xword-dl automated tests. This causes the requests made by xword-dl running on GitHub's servers to appear to the New Yorker API as though they are coming from within AWS.

Since the proxy is only set for the New Yorker tests, I don't think the usage on the server would ever exceed the amount allowed by the free tier. So, if you choose to adopt this approach:

  1. Create an AWS account if needed.
  2. Add an EC2 t3.micro instance running Ubuntu (free tier).
  3. Configure the EC2 instance with a security group allowing ports 22 (ssh), 80 (http), and 8888 TCP (for the proxy; tinyproxy listens on TCP, not UDP).
  4. ssh in to the new instance, and apt-get install tinyproxy.
  5. sudo edit the /etc/tinyproxy/tinyproxy.conf file and add:
       Listen 0.0.0.0
       Allow 0.0.0.0/0
       BasicAuth [username] [password]
  6. sudo systemctl restart tinyproxy

Now you should be able to set an XWORD_DL_HTTPS_PROXY secret with a URL of the form:

http://[username]:[password]@[instance-ip]:8888

Requests made by the pertinent tests will then be routed through the AWS proxy.
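As a rough illustration of the URL form above, here is a sketch that assembles the proxy URL and builds a standard-library opener that would use it. The function names are hypothetical, and 8888 is assumed as the port only because it is the one used in the steps above. Credentials are percent-encoded so that a password containing characters such as `@` or `:` does not break the URL.

```python
import urllib.request
from urllib.parse import quote


def build_proxy_url(username, password, host, port=8888):
    """Assemble http://[username]:[password]@[instance-ip]:port (hypothetical helper)."""
    # Percent-encode credentials so reserved characters stay unambiguous.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


def opener_via_proxy(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address, used here as a placeholder.
    print(build_proxy_url("user", "p@ss:word", "203.0.113.10"))
```

Calling `opener_via_proxy(url).open(...)` against a real instance is one way to check the proxy from a local machine before storing the URL as a secret.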

danielpunkass avatar Oct 20 '25 23:10 danielpunkass

Actually the free tier on AWS is apparently only good for 12 months from account creation. Maybe not so compelling! But I think having the support for providing a proxy might still be useful.

danielpunkass avatar Oct 20 '25 23:10 danielpunkass

It occurs to me it might be even better to support simply opening an ssh tunnel to a server, rather than needing to set up a purpose-made tinyproxy server. Maybe I will rethink this and propose a system by which an SSH_PROXY_HOST and SSH_PROXY_CERT could inform the test to open a tunnel and use that as the HTTPS_PROXY.
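One way the tunnel idea might look, sketched under several assumptions: SSH_PROXY_CERT is treated as a path to a private key file, the tunnel is a dynamic (SOCKS) forward opened with `ssh -N -D`, and the local port 1080 is arbitrary. The function name is hypothetical. A `socks5h://` proxy URL also assumes the HTTP client has SOCKS support available (for requests, that means the optional PySocks dependency).

```python
def ssh_tunnel_plan(env, local_port=1080):
    """Return (ssh argv, proxy URL) for a SOCKS tunnel, or None if not configured.

    Hypothetical sketch: SSH_PROXY_HOST and SSH_PROXY_CERT are the names
    floated in this thread; how they are interpreted here is an assumption.
    """
    host = env.get("SSH_PROXY_HOST")
    key_path = env.get("SSH_PROXY_CERT")
    if not host or not key_path:
        return None
    argv = [
        "ssh",
        "-N",                             # no remote command, tunnel only
        "-D", f"127.0.0.1:{local_port}",  # dynamic (SOCKS) forward on localhost
        "-i", key_path,                   # identity file used to authenticate
        host,
    ]
    # socks5h asks the proxy side to resolve hostnames as well.
    proxy_url = f"socks5h://127.0.0.1:{local_port}"
    return argv, proxy_url


if __name__ == "__main__":
    example = {"SSH_PROXY_HOST": "user@example.com", "SSH_PROXY_CERT": "/tmp/key"}
    print(ssh_tunnel_plan(example))
```

A test fixture could start the returned command with `subprocess.Popen`, export the proxy URL as HTTPS_PROXY for the New Yorker tests, and terminate the process afterwards.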

danielpunkass avatar Oct 20 '25 23:10 danielpunkass

An AWS Lambda might actually be able to get the job done, and AWS supports a truly free-forever tier for Lambda.

danielpunkass avatar Oct 21 '25 00:10 danielpunkass

Thanks for this and for documenting the thought process too. Given my personal experience I have been considering routing this traffic through Tailscale, and possibly through a machine on my residential network, but I am appreciating your approach here!

thisisparker avatar Oct 21 '25 17:10 thisisparker

Thanks @thisisparker - I'm glad you appreciate the thinking through. Seems like something simple might ultimately do the job, and maybe whatever solution ends up getting used should be an option on the tool itself. I could see folks wanting to provide VPN and/or proxy info to the tool, especially if they're trying to download from a geolocation that is blocked. Of course, since the tool is Python, maybe they could work around it themselves by setting the pertinent env variables.
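To illustrate the env variable workaround, here is a small sketch showing that Python's standard library picks up HTTPS_PROXY from the environment. Libraries such as requests read the same variables by default, though whether every code path in xword-dl honors them is an assumption here. The function name is hypothetical, and the previous environment is restored before returning.

```python
import os
import urllib.request


def detected_https_proxy(proxy_url):
    """Temporarily set HTTPS_PROXY and return what urllib detects for https."""
    names = ("HTTPS_PROXY", "https_proxy")
    saved = {name: os.environ.get(name) for name in names}
    try:
        # A lowercase https_proxy would take precedence, so clear it first.
        os.environ.pop("https_proxy", None)
        os.environ["HTTPS_PROXY"] = proxy_url
        return urllib.request.getproxies_environment().get("https")
    finally:
        # Put the environment back exactly as it was.
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value


if __name__ == "__main__":
    print(detected_https_proxy("http://127.0.0.1:8888"))
```

In day-to-day use the same effect comes from exporting HTTPS_PROXY in the shell before running the tool, with no code changes at all.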

danielpunkass avatar Oct 22 '25 03:10 danielpunkass