
General | Fix raw.githubusercontent.com for systems in China

Open MichaIng opened this issue 6 months ago • 3 comments

The GFW blocks access to raw.githubusercontent.com, while not blocking access to GitHub itself. I remember reports where this happened with some regions/ISPs in India as well.

As shown in https://github.com/dragonflylee/DietPi/commit/e823ef8, it can be solved with a hosts entry:

185.199.111.133 raw.githubusercontent.com

But if it is DNS-wise only, switching the DNS provider is probably a cleaner solution. However, when I visited China, IIRC Cloudflare DNS was blocked.
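If the hosts entry is used, it could be applied idempotently, so repeated runs do not append duplicates. This is only a minimal sketch: `add_hosts_entry` is a hypothetical helper, not an existing DietPi function, and the IP is the one from the commit above, which may stop working at any time.

```shell
# Hypothetical helper (not part of DietPi): add "IP HOST" to a hosts file once.
# $1 = hosts file (e.g. /etc/hosts), $2 = IP address, $3 = hostname
add_hosts_entry() {
    local file=$1 ip=$2 host=$3
    # Look for the hostname in fields 2+ of non-comment lines,
    # stopping at an inline "#" comment. awk exits 0 if found.
    if awk -v h="$host" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) { if ($i ~ /^#/) break; if ($i == h) found = 1 } }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0 # already mapped, leave the file untouched
    fi
    printf '%s %s\n' "$ip" "$host" >> "$file"
}

# Usage (as root):
# add_hosts_entry /etc/hosts 185.199.111.133 raw.githubusercontent.com
```

An existing mapping to a different IP is left untouched here on purpose, so a manual user choice is not overwritten.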

@dragonflylee picking up the solution of your fork:

  • Is it really only a DNS-wise block for this domain, i.e. does e.g. echo 'nameserver 9.9.9.9' > /etc/resolv.conf (Quad9 DNS) work as well, so that we do not rely on a hardcoded IP address?
  • Or is it blocked IP-wise, with 185.199.111.133 being the only, or one of a few functional IPs for the host, while public DNS can return different non-functional IPs? Just checked what it returns here:
    root@micha:~# getent ahosts raw.githubusercontent.com
    2606:50c0:8000::154 STREAM raw.githubusercontent.com
    2606:50c0:8000::154 DGRAM
    2606:50c0:8000::154 RAW
    2606:50c0:8003::154 STREAM
    2606:50c0:8003::154 DGRAM
    2606:50c0:8003::154 RAW
    2606:50c0:8001::154 STREAM
    2606:50c0:8001::154 DGRAM
    2606:50c0:8001::154 RAW
    2606:50c0:8002::154 STREAM
    2606:50c0:8002::154 DGRAM
    2606:50c0:8002::154 RAW
    185.199.111.133 STREAM
    185.199.111.133 DGRAM
    185.199.111.133 RAW
    185.199.109.133 STREAM
    185.199.109.133 DGRAM
    185.199.109.133 RAW
    185.199.110.133 STREAM
    185.199.110.133 DGRAM
    185.199.110.133 RAW
    185.199.108.133 STREAM
    185.199.108.133 DGRAM
    185.199.108.133 RAW
    
  • What is the exact error output when running curl -sSfL raw.githubusercontent.com without the workaround?
  • Are there any legal concerns for Chinese users if we offer or apply a fix (either hosts entry or DNS provider change) automatically when the error occurs? Our updates and software installs check DNS functionality, so if that fails for raw.githubusercontent.com only, we know that it is very likely a country/ISP block and can offer the workaround in a targeted way. As we might not want to do this fully automatically, at least not permanently (we could override it for the particular curl call in non-interactive executions), we could instead add a single toggle to dietpi.txt.
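The per-call override mentioned in the last point could be done with curl's `--resolve` option, which pins a host:port pair to an address for that one invocation without touching /etc/hosts or /etc/resolv.conf. A rough sketch, assuming a hypothetical `RAW_HOST_IP` setting (e.g. filled from a dietpi.txt toggle); none of the names below exist in DietPi:

```shell
# Hypothetical names for illustration only.
RAW_HOST='raw.githubusercontent.com'
RAW_HOST_IP=${RAW_HOST_IP:-} # e.g. 185.199.111.133, empty = use normal DNS
CURL=${CURL:-curl}           # replaceable command, e.g. for dry runs

raw_curl() {
    if [ -n "$RAW_HOST_IP" ]; then
        # --resolve HOST:PORT:ADDR bypasses DNS for this call only
        "$CURL" -sSfL --resolve "$RAW_HOST:443:$RAW_HOST_IP" "$@"
    else
        "$CURL" -sSfL "$@"
    fi
}

# Usage:
# RAW_HOST_IP=185.199.111.133 raw_curl "https://$RAW_HOST/MichaIng/DietPi/master/.update/version"
```

This only helps if the block is DNS-wise. If the connection is interrupted at a different level, pinning the IP changes nothing.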

I was also thinking about creating a mirror on Gitee, but it is difficult for me to manage, as I cannot switch that website to English 😅. Also, while this solves DietPi updates and downloads from the DietPi repo, it does not solve access to other GitHub repositories. And we do access a lot of different repos, not all under our control, for certain software installs. Enabling access to raw.githubusercontent.com in general is the much better solution, and it also spares us the extra work of deriving the correct URL for each Git platform throughout our code.

MichaIng avatar May 26 '25 16:05 MichaIng

TLS connections are blocked at random based on SNI, but a blocked connection may work again if you wait for a long time. The behaviour is not exactly the same in different regions.

So the most effective way is to use a third-party proxy such as https://gh-proxy.com/

dragonflylee avatar May 27 '25 02:05 dragonflylee

Image

dragonflylee avatar May 27 '25 02:05 dragonflylee

TLS connections are blocked at random based on SNI, but a blocked connection may work again if you wait for a long time. The behaviour is not exactly the same in different regions.

Hmm, if SNI is incorrect, a different host must have answered, hence DNS returned a wrong IP, or the connection was intercepted. Does this also happen with public DNS providers like Cloudflare, Quad9, or Google?

The output of your particular test shows a correct IP (matching one of those offered to me), with raw.githubusercontent.com in SAN, and hence it verifies as "ok".

A proxy or VPN surely solves such things most reliably, or is needed if direct communication with the actual target host is blocked at a different level than just local/ISP DNS returning an invalid IP. But I would prefer to find a solution which does not involve any 3rd party service. If bypassing ISP DNS alone does not reliably solve it, then we probably really need a mirror on a different host than GitHub. Instead of using Gitee, we could simply store and update a copy on dietpi.com, with the same URL structure, so a quick fix for downloads from at least our repo is easy to implement. A mid-term solution would be to add proper mirror support to our scripts, like:

# default
G_GIT_MIRROR="https://raw.githubusercontent.com/$G_GITOWNER/DietPi/$G_GITBRANCH"
# optionally override, holding e.g. our dev, beta and master branches only
G_GIT_MIRROR="https://dietpi.com/git/$G_GITBRANCH"
curl -sSf "$G_GIT_MIRROR/.update/version"

And a way to change this mirror base URL in dietpi.txt, or even multiple ones to loop through.
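Looping through multiple mirrors could look roughly like the following. This is only a sketch: `G_GIT_MIRRORS`, `fetch_url` and `fetch_from_mirrors` are hypothetical names, the dietpi.com/git path is the proposed (not yet existing) mirror from above, and the owner/branch defaults are assumptions.

```shell
# Hypothetical sketch, names are illustrative and not existing DietPi code.
G_GITOWNER=${G_GITOWNER:-MichaIng}
G_GITBRANCH=${G_GITBRANCH:-master}
# Space-separated base URLs, tried in order (could be read from dietpi.txt)
G_GIT_MIRRORS=${G_GIT_MIRRORS:-"https://raw.githubusercontent.com/$G_GITOWNER/DietPi/$G_GITBRANCH https://dietpi.com/git/$G_GITBRANCH"}

# Download one URL to stdout, kept separate so it can be swapped out
fetch_url() { curl -sSfL --connect-timeout 10 "$1"; }

# $1 = path relative to the mirror base, e.g. .update/version
fetch_from_mirrors() {
    local path=$1 base out
    for base in $G_GIT_MIRRORS; do
        # Capture first, so a partial download from a failing mirror is not printed.
        # Suitable for small text files only, trailing newlines are stripped.
        if out=$(fetch_url "$base/$path"); then
            printf '%s\n' "$out"
            return 0
        fi
        echo "Mirror failed: $base" >&2
    done
    return 1
}

# Usage:
# fetch_from_mirrors .update/version
```

Trying the default first keeps behaviour unchanged for users without a block. The fallback adds delay only when the first mirror fails or times out.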

MichaIng avatar May 27 '25 13:05 MichaIng