
Intermittent internet connectivity issue

Open karlbaumg opened this issue 8 months ago • 3 comments

Describe the bug

I have both Firefox and the default WebView Shell pre-installed, and I'm deploying the redroid container on Kubernetes running on bare metal with k0s and the Calico CNI.

The problem is that sometimes Firefox, sometimes WebView Shell, and sometimes both fail to get an internet connection. Firefox shows an empty white page, and WebView Shell shows net::ERR_ACCESS_DENIED. The puzzling part is that this happens in about 15-20% of cases.

Make sure the required kernel modules are present

  • [x] grep binder /proc/filesystems
  • [x] grep ashmem /proc/misc (optional)

Some information on the kubernetes cluster:

  • Kubernetes v1.31.5+k0s (managed by k0s)
  • CNI: Calico v3.28.2
  • OS: Ubuntu 24.04.2 LTS (GNU/Linux 6.8.0-53-generic x86_64)

A partly redacted example pod YAML:

apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - args:
    - androidboot.use_memfd=1
    - androidboot.redroid_net_ndns=1
    - androidboot.redroid_net_dns1=1.1.1.1
    image: redroid/redroid:14.0.0-latest
    imagePullPolicy: IfNotPresent
    name: android
    securityContext:
      privileged: true
    startupProbe:
      exec:
        command:
        - /system/bin/sh
        - -c
        - '[[ "$(/system/bin/getprop sys.boot_completed)" == "1" ]] && ip route get 1.1.1.1'
      failureThreshold: 60
      initialDelaySeconds: 30
      periodSeconds: 1
      successThreshold: 1
      timeoutSeconds: 1

Collect debug logs

Logs from a container where the default WebView Shell app can't connect to internet but Firefox can: webview-not-working.tgz

Logs from the case where Firefox cannot but WebView Shell app can: firefox-not-working.tgz

Both working: all-working.tgz

Since it's not a Docker container, I ended up changing debug.sh to work directly with adb, but do let me know if more commands are needed.

Screenshots

Image

karlbaumg avatar Apr 05 '25 13:04 karlbaumg

@zhouziyang I've debugged this further with Chrome and saw this error, which has already been reported: https://github.com/remote-android/redroid-doc/issues/523

It is clearest when you type in the URL bar.

04-07 22:10:44.483  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
04-07 22:10:44.755  3399  3399 W cr_UrlBar: Text change observed, triggering autocomplete.
04-07 22:10:44.787  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
04-07 22:10:45.075  3399  3399 W cr_UrlBar: Text change observed, triggering autocomplete.
04-07 22:10:45.105  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
04-07 22:10:45.203  3399  3399 W cr_UrlBar: Text change observed, triggering autocomplete.
04-07 22:10:45.235  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
04-07 22:10:45.395  3399  3399 W cr_UrlBar: Text change observed, triggering autocomplete.
04-07 22:10:45.427  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
04-07 22:10:45.531  3399  3399 W cr_UrlBar: Text change observed, triggering autocomplete.
04-07 22:10:45.564  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
04-07 22:10:45.619  3399  3399 W cr_UrlBar: Text change observed, triggering autocomplete.
04-07 22:10:45.671  3399  3478 E chromium: [ERROR:socket_posix.cc(93)] CreatePlatformSocket() failed: Operation not permitted (1)
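
To compare a failing container against a working one, it may help to count these errors per process and thread instead of reading the log by eye. The following is a minimal sketch in Python, assuming the log is in the same layout as the lines above (date, time, PID, TID, level, tag, message); the names `LINE_RE`, `SOCKET_FAIL`, and `count_socket_failures` are hypothetical and not part of redroid or Chromium.

```python
import re
from collections import Counter

# Layout assumed from the log lines above:
# date, time, PID, TID, level, tag, message.
LINE_RE = re.compile(
    r"^(?P<date>\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[VDIWEF])\s+"
    r"(?P<tag>[^:]+):\s(?P<msg>.*)$"
)

# Substring taken verbatim from the chromium error in the log above.
SOCKET_FAIL = "CreatePlatformSocket() failed: Operation not permitted"


def count_socket_failures(lines):
    """Return a Counter mapping (pid, tid) to the number of EPERM socket failures."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and SOCKET_FAIL in match.group("msg"):
            key = (int(match.group("pid")), int(match.group("tid")))
            counts[key] += 1
    return counts
```

The function accepts any iterable of lines, so it can be fed an open log file from one of the attached archives or the captured output of a logcat dump. Lines that do not match the assumed layout are skipped silently.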

karlbaumg avatar Apr 07 '25 22:04 karlbaumg

I narrowed it down to the Chrome process at the Linux level. In a case where this happens, I looked up the PID of the com.android.chrome process and tried to open a socket; here is what I get:

> setpriv --reuid=10056 --regid=10056 --clear-groups \
  /bin/bash -c 'exec 3<>/dev/tcp/8.8.8.8/53 2>/dev/null \
               && echo "socket OK" \
               || echo "socket() got EPERM ($?)"'
/bin/bash: socket: Operation not permitted
/bin/bash: line 1: /dev/tcp/8.8.8.8/53: Operation not permitted
socket() got EPERM (1)

The weird thing is that in some cases it works, but after a couple of minutes it fails to connect again. @zhouziyang Do you have any clues on how to debug this further?
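
Because the failure comes and goes, one way to pin down when it flips is to repeat a probe at a fixed interval and record only the state changes. Below is a rough Python sketch of that idea, assuming it runs on a machine that can execute the probe command (for example the socket test above, wrapped however it is invoked in your setup). The helper names `command_probe` and `watch_transitions` are hypothetical, and the probe command itself is left to the caller.

```python
import subprocess
import time


def command_probe(argv, timeout_s=5):
    """Build a probe that reports True when `argv` exits with status 0."""
    def probe():
        try:
            result = subprocess.run(argv, capture_output=True, timeout=timeout_s)
            return result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # A timeout or a missing binary is treated as a failed probe.
            return False
    return probe


def watch_transitions(probe, rounds, interval_s=1.0,
                      clock=time.time, sleep=time.sleep):
    """Call probe() `rounds` times; return [(timestamp, ok)] for each state change."""
    transitions = []
    last = None
    for i in range(rounds):
        state = bool(probe())
        if state != last:
            # Record only the moments where the result flips.
            transitions.append((clock(), state))
            last = state
        if i + 1 < rounds:
            sleep(interval_s)
    return transitions
```

The recorded timestamps can then be lined up against logcat or CNI logs from the same window to see what changed around each flip. `clock` and `sleep` are injectable only so the loop can be exercised without waiting.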

karlbaumg avatar May 05 '25 18:05 karlbaumg

Were you able to figure this out, @karlbaumg? I am having the same issue on Azure Kubernetes.

anshuman852 avatar Jun 13 '25 05:06 anshuman852