
[Mobile/Russia] dns_probe_finished_nxdomain when visiting lm.facebook.com

Open nickcarterney opened this issue 1 year ago • 22 comments

Shadowsocks version: 1.17.2, 1.18.0, 1.18.1, 1.18.3
Client device: iPhone (iOS)
Server: Ubuntu 23 (Vultr and OVH)
Config:

{
          "server":"207.148.90.219",
          "mode":"tcp_and_udp",
          "server_port": 55837,
          "local_port": 1080,
          "password":"d0c222d5bd70ab16f4df9dd3caa70a54",
          "timeout": 60,
          "method":"chacha20-ietf-poly1305",
          "fast_open":true,
          "nameserver": "8.8.8.8"
 }

When I click a link on Facebook, it returns an error screen. When I copy the link into the browser, it says dns_probe_finished_nxdomain. This issue only occurs on the 4G network (OctopusNet Ltd), not on WiFi (another ISP: Zelenaya Vladivostok Network).

There are no errors in the log. I tried ping and traceroute to l.facebook.com and lm.facebook.com from my VPS, and both hosts still responded.


https://l.facebook.com/l.php?u=https%3A%2F%2Fbaonga.com%2Fuav-va-ten-lua-nga-pha-huy-trung-tam-hau-can-cua-ukraine-o-odessa.html%3Ffbclid%3DIwZXh0bgNhZW0CMTAAAR1tMfKhQpRQwdEeWKPs6bkHYpaxsUCaL-9gWUmLBkJx_XlMKMjG37Abphg_aem_AUQ06mMNOEoxacNHH9qatPBT50R57-qs79wnQvKI_ufWHY6MrCxo5BXActIIHjclcK_JzP5U8l1Zu8KVBZyW6nK1&h=AT3TRiH86M5iyQke2wCwr_-mBUpipxOI9uTm1P_hdeM7MeI-k3DQT6VoQd1Ku8kHa3sUPQVmVLWBIRwos53RE_5Cf2L9R4k-Z846hAClXIowZKwbTG0EkCEwHxwKsRfu1Zea

nickcarterney avatar May 02 '24 13:05 nickcarterney

I think the DNS query should be handled by your iOS app, which serves as the local shadowsocks client. So I think you should first open this issue in the iOS app's repository.

zonyitoo avatar May 02 '24 13:05 zonyitoo

I think the DNS query should be handled by your iOS app, which serves as the local shadowsocks client. So I think you should first open this issue in the iOS app's repository.

But this issue only happens on the 4G network; other networks are not affected. It also only happens with lm.facebook.com, while Facebook's other domains still work well. And it only happens in Russia.

In Vietnam, Finland, and Thailand it still works.

nickcarterney avatar May 02 '24 16:05 nickcarterney

If you suspect that something is happening on the server side, you could run ssserver with -vvv and see what happens when you see errors on your mobile phone.

zonyitoo avatar May 05 '24 13:05 zonyitoo

If you suspect that something is happening on the server side, you could run ssserver with -vvv and see what happens when you see errors on your mobile phone.

I am also recently experiencing issues with access to most of the sites while connecting to SS server using mobile network in Russia. How do I force SS docker container to run with -vvv options?

frozzway avatar May 06 '24 05:05 frozzway

You can run whatever command you want with docker run, right?

zonyitoo avatar May 06 '24 05:05 zonyitoo

You can run whatever command you want with docker run, right?

Ok, I've figured it out.

I have copied logs with -vvv mode and I would appreciate if you help with determining the issue.

https://gist.github.com/frozzway/0a5c84739e75770b114268c449ff417c

And some more: https://gist.github.com/frozzway/608c0f6edd7e9639df616c21ab6d684c

Chrome says "This site can't be reached, unexpectedly closed the connection"

For now had to deploy wireguard in parallel. Works fine by the way

frozzway avatar May 06 '24 05:05 frozzway

From the first logs, I can see that a tunnel to ya.ru was established. So your browser still said that the site (ya.ru) cannot be reached?

zonyitoo avatar May 06 '24 15:05 zonyitoo

In the second one, I can see access to 149.154.167.41:5222, graph.facebook.com:443, 1.1.1.1:853, chrome.cloudflare-dns.com:443. They were all established and finished successfully.

For example, lines like this:

DEBUG   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:255: established tcp tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 with ConnectOpts { fwmark: None, bind_local_addr: None, bind_interface: None, tcp: TcpSocketOpts { send_buffer_size: None, recv_buffer_size: None, nodelay: false, fastopen: true, keepalive: Some(15s), mptcp: false }, udp: UdpSocketOpts { mtu: None } }    

indicate that the tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 has been established: the TCP socket connect() to the remote target succeeded.

and lines like:

TRACE   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:264: tcp tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 closed, L2R 1563 bytes, R2L 4424 bytes    

tell us that the tunnel has finished. It copied 1563 bytes from local to remote and 4424 bytes in the other direction, so the tunnel worked and actually transferred data in both directions.

I didn't see anything abnormal in these logs.
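
To scan a long -vvv log for the two kinds of "closed" lines discussed here, a small script can help. This is only a rough sketch: the regular expression is derived from the log excerpts quoted in this thread, not from any documented format, and `summarize_closed_tunnels` is a hypothetical helper name.

```python
import re

# Pattern based on the TRACE lines quoted in this thread (an assumption;
# the wording may differ between shadowsocks-rust versions).
CLOSED_RE = re.compile(
    r"tcp tunnel (?P<client>\S+) <-> (?P<target>\S+) closed"
    r"(?:, L2R (?P<l2r>\d+) bytes, R2L (?P<r2l>\d+) bytes"
    r"| with error: (?P<error>.+?))\s*$"
)


def summarize_closed_tunnels(lines):
    """Return one dict per 'tcp tunnel ... closed' line found in lines."""
    results = []
    for line in lines:
        match = CLOSED_RE.search(line)
        if match is None:
            continue  # not a tunnel-closed line
        results.append({
            "client": match["client"],
            "target": match["target"],
            # Byte counts are only present when the tunnel closed cleanly.
            "l2r": int(match["l2r"]) if match["l2r"] else None,
            "r2l": int(match["r2l"]) if match["r2l"] else None,
            # The error text is only present on the failing variant.
            "error": match["error"],
        })
    return results


if __name__ == "__main__":
    import sys
    for entry in summarize_closed_tunnels(sys.stdin):
        print(entry)
```

Piping the server log into the script on stdin lists every closed tunnel, which makes it easier to spot targets that always end with an error.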

zonyitoo avatar May 06 '24 15:05 zonyitoo

BTW, I didn't see any UDP logs. Did you enable UDP mode? Or do you not need it in your environment?

zonyitoo avatar May 06 '24 15:05 zonyitoo

So your browser still said that the site (ya.ru) cannot be reached?

Yep(

I didn't see anything abnormal in these logs.

Yes. That is the issue. Some connections are established fine, especially those that go through applications like Telegram or Instagram. But some break immediately (around 90% of those that come through Chrome or YouTube). And it is always random. I might 'get through' to some site only because I visited it a few minutes earlier without SS enabled (or not because of this, I really do not know). But then I open one, two, or three other random sites and poof: no more connections can be established to any of them.

BTW, I didn't see any UDP logs. Or you don't need that in your environment?

I have used tcp_only configuration just fine for several months. So I guess I don't need it.

Yet about "ya.ru" connection. How can it be that logs show no abnormalities but I still couldn't reach the site?

Btw, I did not change any of server or client configurations myself from the time this issue appeared and weeks before.

And the issue persists only on mobile 4G network. WiFi home/work networks just fine.

frozzway avatar May 06 '24 16:05 frozzway

I have rented another VM from another hosting provider and deployed my configuration to it. It's all the same.

frozzway avatar May 06 '24 16:05 frozzway

And the issue persists only on mobile 4G network. WiFi home/work networks just fine.

That's very interesting. You can see access logs on server when you are using 4G network, right? (for example, "accepted connection xxxx").

Yet about "ya.ru" connection. How can it be that logs show no abnormalities but I still couldn't reach the site?

I can only guess:

  1. The ya.ru's TCP connection works fine, but TLS handshake failed between your browser and remote server (data transfer exists, but closes immediately after handshake).
  2. The data sent from ssserver was hijacked by a middleman, which made your browser show errors. (probably not)
  3. The connection in ssserver's log was not the actual connection that your browser was using.

zonyitoo avatar May 06 '24 16:05 zonyitoo

I saw an error in your log:

TRACE   tokio-runtime-worker ThreadId(02) shadowsocks::net::tcp: crates/shadowsocks/src/net/tcp.rs:76: connected ya.ru:443 77.88.55.242:443    
DEBUG   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:255: established tcp tunnel 85.140.23.181:14792 <-> ya.ru:443 with ConnectOpts { fwmark: None, bind_local_addr: None, bind_interface: None, tcp: TcpSocketOpts { send_buffer_size: None, recv_buffer_size: None, nodelay: false, fastopen: true, keepalive: Some(15s), mptcp: false }, udp: UdpSocketOpts { mtu: None } }    
DEBUG   tokio-runtime-worker ThreadId(02) shadowsocks::relay::tcprelay::utils: crates/shadowsocks/src/relay/tcprelay/utils.rs:262: copy bidirection ends with error: Broken pipe (os error 32), a_to_b: Done(2136), b_to_a: Running(CopyBuffer { read_done: false, pos: 0, cap: 63, amt: 67337, .. })    
TRACE   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:273: tcp tunnel 85.140.23.181:14792 <-> ya.ru:443 closed with error: Broken pipe (os error 32)    

As you can see, a -> b, which is local -> remote, had already finished (probably EOF), but b -> a, which is remote -> local, failed with Broken pipe. The only explanation is that the local client closed the connection before it had received the whole response.

This is the connection that showed an error in Chrome. Chrome or your sslocal client actively closed the connection before it had finished receiving the whole response.
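
For anyone unfamiliar with the error, here is a minimal sketch of how "Broken pipe (os error 32)" comes about when writing to a connection whose other end has already gone away. It uses a local Unix socket pair rather than a real shadowsocks tunnel, so it only illustrates the OS-level behaviour (as observed on Linux) and is not the code path inside ssserver. The function name is made up for this example.

```python
import errno
import socket


def write_after_peer_close():
    """Write to a stream socket whose peer is already closed.

    Returns the errno of the failure, or None if the write succeeded.
    """
    relay_side, client_side = socket.socketpair()
    client_side.close()  # the "local client" goes away early
    try:
        # The relay still has response data to deliver to the client.
        relay_side.sendall(b"response data from the remote server")
    except OSError as exc:
        return exc.errno
    finally:
        relay_side.close()
    return None


if __name__ == "__main__":
    err = write_after_peer_close()
    print("write failed with EPIPE:", err == errno.EPIPE)
```

The same error surfaces in the relay when the client side (or something on the path that resets the connection) disappears while the server is still copying the response back.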

But we can still see some connections to ya.ru finished successfully.

Which client application were you using? Could you see its logs about this connection?

zonyitoo avatar May 06 '24 16:05 zonyitoo

Which client application were you using? Could you see its logs about this connection?

I'm using Shadowsocks by Max Lv on Android (downloaded from the Google Play Store). I've never had this issue on 4G networks before.

nickcarterney avatar May 06 '24 17:05 nickcarterney

@madeye Do you have any idea about this issue?

zonyitoo avatar May 07 '24 01:05 zonyitoo

Which client application were you using? Could you see its logs about this connection?

I tried Shadowsocks by Max Lv and v2rayNG. Same symptoms.

Some logs from v2rayNG: https://gist.github.com/frozzway/28ef2645eefac19964fc14a618246e50 I do not see any abnormalities in them, but I still couldn't connect.

frozzway avatar May 07 '24 01:05 frozzway

https://github.com/shadowsocks/shadowsocks-android/issues/3151#issue-2279101786

Found similar issue report

frozzway avatar May 07 '24 03:05 frozzway

It looks like your ISP blocks all UDP traffic, causing DNS issues. Enabling a SIP003 plugin can solve the problem, as it makes the SS app work in TCP-only mode.

madeye avatar May 09 '24 02:05 madeye

It looks like your ISP blocks all UDP traffic, causing DNS issues. Enabling a SIP003 plugin can solve the problem, as it makes the SS app work in TCP-only mode.

Any thoughts on this one https://github.com/shadowsocks/shadowsocks-rust/issues/1518#issuecomment-2095222263? We sort of discussed it on this topic cause it is Russia mobile network related

frozzway avatar May 09 '24 03:05 frozzway

Hello! I wanted to file a separate bug, but I think this is almost the same issue. I am also testing from Russia. I have sslocal (port 7771) on my laptop, which is always connected to ssserver on a VPS. Using WiFi/LAN/WAN (local ISP) I can connect to e.g. https://bitbucket.com (tested with curl):

curl -x http://127.0.0.1:7771/ https://bitbucket.com -v
*   Trying 127.0.0.1:7771...
* Connected to (nil) (127.0.0.1) port 7771 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to bitbucket.com:443
> CONNECT bitbucket.com:443 HTTP/1.1
> Host: bitbucket.com:443
> User-Agent: curl/7.81.0
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 200 OK
< Date: Thu, 26 Sep 2024 15:56:40 GMT
< 
* Proxy replied 200 to CONNECT request
* CONNECT phase completed!
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Atlassian US, Inc.; CN=*.bitbucket.com
*  start date: Feb 22 00:00:00 2024 GMT
*  expire date: Mar 24 23:59:59 2025 GMT
*  subjectAltName: host "bitbucket.com" matched cert's "bitbucket.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x62921d699eb0)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET / HTTP/2
> Host: bitbucket.com
> user-agent: curl/7.81.0
> accept: */*
> 
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* Connection state changed (MAX_CONCURRENT_STREAMS == 64)!
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 301 
< location: https://bitbucket.org/
< x-content-type-options: nosniff
< x-xss-protection: 1; mode=block
< atl-traceid: e5583a715c5247c18de313f7cc0bfc7f
< report-to: {"endpoints": [{"url": "https://dz8aopenkvv6s.cloudfront.net"}], "group": "endpoint-1", "include_subdomains": true, "max_age": 600}
< nel: {"failure_fraction": 0.001, "include_subdomains": true, "max_age": 600, "report_to": "endpoint-1"}
< strict-transport-security: max-age=63072000; includeSubDomains; preload
< access-control-allow-origin: *
< vary: Accept-Encoding
< server-timing: atl-edge;dur=2,atl-edge-internal;dur=3,atl-edge-upstream;dur=0,atl-edge-pop;desc="aws-eu-central-1"
< date: Thu, 26 Sep 2024 15:56:40 GMT
< server: AtlassianEdge
< 
* Connection #0 to host (nil) left intact

Using 4G from smartphone I can not connect to https://bitbucket.com, it just hangs

curl -x http://127.0.0.1:7771/ https://bitbucket.com -v
*   Trying 127.0.0.1:7771...
* Connected to (nil) (127.0.0.1) port 7771 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to bitbucket.com:443
> CONNECT bitbucket.com:443 HTTP/1.1
> Host: bitbucket.com:443
> User-Agent: curl/7.81.0
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 200 OK
< Date: Thu, 26 Sep 2024 16:00:12 GMT
< 
* Proxy replied 200 to CONNECT request
* CONNECT phase completed!
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* SSL connection timeout
* Closing connection 0
curl: (28) SSL connection timeout
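
The difference between the two runs is where the TLS handshake stops: in both cases the proxy CONNECT succeeds, but on 4G no Server hello ever arrives after the Client hello. A rough sketch for classifying saved `curl -v` output this way is below. It relies only on the marker strings visible in the output above, and `handshake_stage` is a hypothetical helper, not part of curl or shadowsocks.

```python
def handshake_stage(curl_verbose: str) -> str:
    """Classify curl -v output by how far the proxied TLS handshake got."""
    connect_ok = "CONNECT phase completed" in curl_verbose
    client_hello = "Client hello" in curl_verbose
    server_hello = "Server hello" in curl_verbose

    if not connect_ok:
        return "proxy CONNECT did not complete"
    if server_hello:
        return "server hello received"
    if client_hello:
        # The request left through the tunnel, but no reply came back.
        return "stalled after client hello"
    return "no TLS activity after CONNECT"


if __name__ == "__main__":
    import sys
    print(handshake_stage(sys.stdin.read()))
```

To use it, capture the verbose output (curl writes it to stderr) into a file and feed that file to the script on stdin.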

volodalexey avatar Sep 26 '24 16:09 volodalexey

Compiled and enabled https://github.com/shadowsocks/v2ray-plugin for sslocal and ssserver - now it works!
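
For reference, a minimal sketch of what the plugin setup could look like, written as a small Python script that derives client and server configs from a shared base. Everything concrete here is an assumption for illustration: the address, port, and password are placeholders, `"plugin_opts": "server"` follows the usage shown in the v2ray-plugin README for plain WebSocket mode, and switching `mode` to `tcp_only` simply reflects the TCP-only suggestion made earlier in this thread. It is not the exact configuration used above.

```python
import json

# Placeholder values, not taken from any real deployment.
BASE_CONFIG = {
    "server": "203.0.113.10",
    "server_port": 8388,
    "password": "replace-me",
    "method": "chacha20-ietf-poly1305",
    "timeout": 60,
}


def with_plugin(base, plugin="v2ray-plugin", plugin_opts=None):
    """Return a copy of base with SIP003 plugin settings added."""
    cfg = dict(base)
    cfg["mode"] = "tcp_only"  # assumption: traffic goes over the TCP plugin only
    cfg["plugin"] = plugin
    if plugin_opts:
        cfg["plugin_opts"] = plugin_opts
    return cfg


# Server side: the plugin runs in server mode.
server_config = with_plugin(BASE_CONFIG, plugin_opts="server")

# Client side: same plugin, no extra options, plus the local listener.
client_config = with_plugin(BASE_CONFIG)
client_config["local_address"] = "127.0.0.1"
client_config["local_port"] = 1080

if __name__ == "__main__":
    print(json.dumps(server_config, indent=2))
    print(json.dumps(client_config, indent=2))
```

The plugin binary has to be installed and reachable by both sslocal and ssserver for either config to work.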

volodalexey avatar Sep 26 '24 17:09 volodalexey

So the suspicion is that a firewall in the RU cellular networks is trying to detect this kind of data packet.

zonyitoo avatar Sep 27 '24 06:09 zonyitoo