Discovery order for homeserver causes "network errors"

Open ThoreKr opened this issue 4 years ago • 5 comments
Describe the bug

When I tried to set up element-android with my homeserver via the "other" option, entering the base URL always yielded "No network. Please check your internet connection." When testing with another homeserver, however, I got to the login prompt. In the webserver logs I could also see a request to GET /_matrix/client/versions HTTP/2.0. For reference (and to check I'm not insane), I also tested my homeserver against app.element.io, where discovery was successful.

After looking at the requests and responses in Android Studio and comparing them to those of the other homeserver, which did connect successfully, I noted this difference:

V/FormattedJsonHttpLogger: --> GET https://other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 404 https://www.other-home.server/_matrix/client/versions (368ms, unknown-length body)
W/RetrofitExtensionsKt: The error returned by the server is not a MatrixError
E/Request: Exception when executing request GET https://other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: --> GET https://other-home.server/config.other-home.server.json
V/FormattedJsonHttpLogger: <-- 404 https://www.other-home.server/config.other-home.server.json (220ms, unknown-length body)
W/RetrofitExtensionsKt: The error returned by the server is not a MatrixError
E/Request: Exception when executing request GET https://other-home.server/config.other-home.server.json
V/FormattedJsonHttpLogger: --> GET https://other-home.server/config.json
V/FormattedJsonHttpLogger: <-- 404 https://www.other-home.server/config.json (214ms, unknown-length body)
W/RetrofitExtensionsKt: The error returned by the server is not a MatrixError
E/Request: Exception when executing request GET https://other-home.server/config.json
V/FormattedJsonHttpLogger: --> GET https://other-home.server/.well-known/matrix/client
V/FormattedJsonHttpLogger: <-- 200 https://other-home.server/.well-known/matrix/client (96ms, 81-byte body)
V/FormattedJsonHttpLogger: --> GET https://matrix.other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 200 https://matrix.other-home.server/_matrix/client/versions (216ms, unknown-length body)
V/FormattedJsonHttpLogger: --> GET https://matrix.other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 200 https://matrix.other-home.server/_matrix/client/versions (159ms, unknown-length body)
V/FormattedJsonHttpLogger: --> GET https://matrix.other-home.server/_matrix/client/r0/login
V/FormattedJsonHttpLogger: <-- 200 https://matrix.other-home.server/_matrix/client/r0/login (27ms, unknown-length body)

In comparison, my failing connection looked like this:

V/FormattedJsonHttpLogger: --> GET https://MYDOMAIN/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 200 https://other.MYDOMAIN/ (374ms, unknown-length body)
E/Request: Exception when executing request GET https://MYDOMAIN/_matrix/client/versions

This was caused by the webserver configuration, which redirected every request except the well-known lookup to other.MYDOMAIN.

Apparently this crashes the entire lookup.

The workaround was to effectively blackhole the other lookup URLs and return 404 for them instead.

To Reproduce

  1. Set up a homeserver on some subdomain and a base_url different to that subdomain
  2. In the reverse proxy, configure the base domain server
  3. Create a location for the well-known endpoint as described in the Installation guide
  4. Forward all other requests somewhere else
  5. Try to configure element-android for the new homeserver

Expected behavior

Element should discover the homeserver base URL from /.well-known/matrix/client, as app.element.io does, when the reverse proxy for the base URL only serves the `well-known` endpoint and forwards all other requests somewhere else.
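For illustration, this is roughly what a client has to do with the well-known response once it gets that far. It is a minimal sketch, not the element-android code: the function name and the example body are made up, and only the `m.homeserver` / `base_url` keys come from the Matrix client discovery format.

```python
import json


def base_url_from_well_known(body: str) -> str:
    """Extract the homeserver base URL from a .well-known/matrix/client body.

    Hypothetical helper for illustration; not part of any Matrix SDK.
    """
    doc = json.loads(body)
    # The discovery document keeps the URL under "m.homeserver" -> "base_url".
    base_url = doc["m.homeserver"]["base_url"]
    # Strip trailing slashes so API paths can be appended uniformly.
    return base_url.rstrip("/")


# Example body (assumed); a real deployment serves this as static JSON.
example_body = '{"m.homeserver": {"base_url": "https://matrix.example.org/"}}'
print(base_url_from_well_known(example_body))
```

The returned URL is the one the client should then query for /_matrix/client/versions, as in the successful log above.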

Smartphone (please complete the following information):

  • Android Studio Emulator (Android 9):
Name: Nexus_S_API_28
CPU/ABI: Google APIs Intel Atom (x86)
Target: google_apis [Google APIs] (API level 28)
  • One Plus 3 (Android 9)
  • Nexus_5_API_30 (Android K)

Additional context

  • Occurs in latest play store app and local checkout of v1.1.3 as well as latest develop ( 2df8eb199b6e2b1f857f632dae48747de89f0a97 )

ThoreKr avatar Apr 05 '21 00:04 ThoreKr

Can confirm same behavior on my installation and devices.

My root webserver virtual host (example.org) is set to serve the .well-known/matrix/server and .well-known/matrix/client endpoints as static JSON returns in the Nginx config.

My Synapse server runs at https://matrix.example.org, and is reverse-proxied to a backend server separately from the root webserver (though they are on the same host, they use different virtual hosts in Nginx).

When setting up the app, entering example.org as the homeserver should look for the well-known value and use that. However, looking at the access logs for example.org, there is never a call to example.org/.well-known/matrix/client and the first request is actually to /_matrix/client/versions. It looks like Element is not using the well-known client config at all.

asimons04 avatar May 03 '21 17:05 asimons04

@asimons04 It is called if a series of conditions hold:

https://github.com/vector-im/element-android/blob/develop/matrix-sdk-android/src/main/java/org/matrix/android/sdk/internal/auth/DefaultAuthenticationService.kt#L244

The discovery order is to look for a Matrix server first, then the Element config in two different places, and finally the well-known. However, the next step is only tried if the previous lookup returned a 404. For convenience, this feature is not documented anywhere except for one small mention that apparently users might type in the URL of the webchat.

Due to the way this is chained, I couldn't figure out how to catch this: the client follows the redirect and fails internally, so it can't continue to process the chain. That also makes it surprisingly difficult to change the lookup order or simply fall back to the well-known lookup.
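To make the failure mode concrete, here is a small model of the chain as I understand it from the logs. It is a sketch under assumptions, not the SDK implementation: `discover`, `DiscoveryAborted`, and the two fake proxies are hypothetical names, and the "200 with a non-JSON body" case stands in for a followed redirect to an unrelated page.

```python
from typing import Callable, Optional, Tuple

# (status code, parsed JSON body, or None when the body is not usable JSON)
Response = Tuple[int, Optional[dict]]


class DiscoveryAborted(Exception):
    """Raised when a probe fails in a way that does not fall through."""


def discover(domain: str, fetch: Callable[[str], Response]) -> str:
    # Probe order as seen in the logs of the working homeserver.
    probes = [
        ("homeserver", f"https://{domain}/_matrix/client/versions"),
        ("element config", f"https://{domain}/config.{domain}.json"),
        ("element config", f"https://{domain}/config.json"),
        ("well-known", f"https://{domain}/.well-known/matrix/client"),
    ]
    for kind, url in probes:
        status, body = fetch(url)
        if status == 404:
            continue  # only a 404 moves on to the next probe
        if status == 200 and body is not None:
            return f"{kind} found at {url}"
        # Anything else (e.g. a followed redirect to an HTML page) ends the chain.
        raise DiscoveryAborted(f"{url} returned {status} with an unusable body")
    raise DiscoveryAborted("no probe succeeded")


WELL_KNOWN = {"m.homeserver": {"base_url": "https://matrix.example.org"}}


def redirecting_proxy(url: str) -> Response:
    """Serves the well-known, redirects everything else to some HTML page."""
    if url.endswith("/.well-known/matrix/client"):
        return 200, WELL_KNOWN
    return 200, None


def blackholed_proxy(url: str) -> Response:
    """Serves the well-known, answers 404 for everything else."""
    if url.endswith("/.well-known/matrix/client"):
        return 200, WELL_KNOWN
    return 404, None


print(discover("example.org", blackholed_proxy))
try:
    discover("example.org", redirecting_proxy)
except DiscoveryAborted as err:
    print("aborted:", err)
```

With the redirecting proxy the very first probe aborts the chain, so the well-known is never requested; with the 404 variant all three earlier probes fall through and the well-known is reached.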

This is my nginx workaround:

  location ~ ((/_matrix.*)|(config.*json)) {
    return 404;
  }
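A quick way to sanity-check which request paths that location catches, assuming Python's `re.search` is a close enough stand-in for nginx's case-sensitive, unanchored `~` match. The helper name and the sample paths are just for illustration.

```python
import re

# Same pattern as in the nginx location above.
BLACKHOLE = re.compile(r"((/_matrix.*)|(config.*json))")


def is_blackholed(path: str) -> bool:
    """Return True if the workaround location would answer this path with 404."""
    return BLACKHOLE.search(path) is not None


for path in [
    "/_matrix/client/versions",
    "/config.example.org.json",
    "/config.json",
    "/.well-known/matrix/client",
]:
    print(path, is_blackholed(path))
```

The three probe URLs that precede the well-known lookup match the pattern, while /.well-known/matrix/client does not, so it can still be served by its own location.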

ThoreKr avatar May 03 '21 19:05 ThoreKr

Thanks for the follow-up. I tried adding that custom location you sent to my root virtual host, but it didn't seem to make any difference. I'm not too concerned as it's easy enough for friends and family to use matrix.mydomain.org instead of just mydomain.org when setting up Element. Just thought if I could make it easier, I would.

asimons04 avatar May 03 '21 20:05 asimons04

In my case I set _matrix/ and _synapse/ to reverse proxy to 8008 of the docker instance.

JesseKPhillips avatar May 01 '24 18:05 JesseKPhillips

> In my case I set _matrix/ and _synapse/ to reverse proxy to 8008 of the docker instance.

This is not ideal, as it circumvents delegation and causes all client traffic to be proxied by the host that is running the root domain. If it is the same host, that may be acceptable, but if you want to separate them later you'll run into problems.

ThoreKr avatar May 09 '24 13:05 ThoreKr