akka-http

User-Agent fails to parse Facebook in App string

Open garyfeltham opened this issue 2 years ago • 4 comments

A user agent string provided by a Facebook in-app browser causes the parsing functions not to resolve a User-Agent header.

Based on the following user agent

Mozilla/5.0 (Linux; Android 13; SM-G981B Build/TP1A.220624.014; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/115.0.5790.138 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/425.0.0.22.49;]

and subsequent usage of

    headerValueByType(`User-Agent`)

Any HTTP call results in

```
Request is missing required HTTP header 'User-Agent'
```

By removing the appended FB spec, it works: `Mozilla/5.0 (Linux; Android 13; SM-G981B Build/TP1A.220624.014; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/115.0.5790.138 Mobile Safari/537.36`

This appears related to `akka.http.impl.model.parser.SimpleHeaders`

```
// http://tools.ietf.org/html/rfc7231#section-5.5.3
def `user-agent` = rule { products ~ EOI ~> (`User-Agent`(_)) }
```

which feeds into the `HttpRequest` headers. Debugging shows that when the FB user agent string is supplied, the corresponding entry in `headers: immutable.Seq[HttpHeader]` is a `RawHeader('user-agent', '...')`. If the string is cleaned first, the header is indeed a `User-Agent` instance, so the extraction function itself appears to work.

There are some other related bugs such as https://github.com/guardian/support-frontend/issues/213

My current fix is an adaptation filter, installed when the server boots, that strips the `[FB_IAB.*]` suffix from each request. However, the issue appears to lie in the `SimpleHeaders` parsing, which cannot instantiate a `User-Agent` from a value with this pattern.

garyfeltham avatar Aug 02 '23 08:08 garyfeltham

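The adaptation is coded as:

    import akka.NotUsed
    import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
    import akka.http.scaladsl.model.headers.{RawHeader, `User-Agent`}
    import akka.stream.scaladsl.Flow

    // Strip a trailing bracketed suffix such as "[FB_IAB/...;]" from the
    // User-Agent value and re-parse the remainder into a modeled header.
    def adaptUserAgentHeader(request: HttpRequest): HttpRequest = {
      val result = request.headers.map {
        case h @ RawHeader(_, _) if h.lowercaseName == "user-agent" =>
          val adaptedValue = h.value.replaceAll("\\[.*?\\]$", "")
          // Keep the original header if nothing was stripped or re-parsing fails.
          if (adaptedValue != h.value) `User-Agent`.parseFromValueString(adaptedValue).getOrElse(h) else h
        case h => h
      }

      request.withHeaders(result)
    }

    // `http`, `serverConfigNode` and `flow` (the request handler) are not shown here.
    val sb = http.newServerAt(
      serverConfigNode.getString("host"),
      serverConfigNode.getInt("port")
    )

    // Run the adaptation on every request before it reaches the handler flow.
    val adaptHeadersFlow: Flow[HttpRequest, HttpRequest, NotUsed] =
      Flow[HttpRequest].map(adaptUserAgentHeader)

    val combinedFlow: Flow[HttpRequest, HttpResponse, Any] = adaptHeadersFlow.via(flow)
    sb.bindFlow(combinedFlow)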
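
The string clean-up at the core of this workaround can be tried in isolation, without akka-http on the classpath. The following is only a minimal sketch: the object name `UserAgentCleanup` and the helper `stripBracketSuffix` are hypothetical, and the pattern differs slightly from the one above in that it also consumes the whitespace before the bracket, so the cleaned value has no trailing space.

```scala
object UserAgentCleanup {
  // User agent string from the report, including the Facebook in-app suffix.
  val fbUserAgent: String =
    "Mozilla/5.0 (Linux; Android 13; SM-G981B Build/TP1A.220624.014; wv) " +
      "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 " +
      "Chrome/115.0.5790.138 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/425.0.0.22.49;]"

  // Remove one trailing "[...]" group and any whitespace directly before it.
  // Values without such a suffix are returned unchanged.
  def stripBracketSuffix(value: String): String =
    value.replaceAll("\\s*\\[.*?\\]$", "")
}
```

Calling `UserAgentCleanup.stripBracketSuffix(UserAgentCleanup.fbUserAgent)` should yield the cleaned string quoted in the first comment, which is the value that parses into a `User-Agent` instance.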

garyfeltham avatar Aug 02 '23 09:08 garyfeltham

Is the config `akka.http.parsing.modeled-header-parsing` left at its default, `on`? (If it is disabled, only a very small set of essential headers is parsed into model headers and the rest are left as `RawHeader`s.)

johanandren avatar Aug 14 '23 10:08 johanandren

Ah, sorry, you mentioned it works as expected with another user agent string, so it is something about the value then

johanandren avatar Aug 14 '23 10:08 johanandren

OK, reading up a bit more, it seems that the FB user-agent header value is not RFC compliant. We have previously silenced logging warnings for such values (https://github.com/akka/akka-http/issues/687), but in this case you want it parsed.

I'm not sure we want to change our current stance, which is to stay aligned with the RFC rather than handle special cases, especially given that it is possible to work around this (thanks for sharing the workaround, btw).
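
For context on the compliance point: RFC 7231 defines the User-Agent value as a sequence of products (a `token`, optionally followed by `/` and a version token) and parenthesised comments, and RFC 7230 restricts `token` to the `tchar` character set. The following is a minimal, standard-library-only sketch of that character check, not akka-http's parser; the names `RfcTokenCheck`, `isTchar` and `offendingChars` are hypothetical.

```scala
object RfcTokenCheck {
  // Non-alphanumeric characters allowed in a token (tchar, RFC 7230 section 3.2.6).
  private val tcharSymbols = "!#$%&'*+-.^_`|~"

  def isTchar(c: Char): Boolean =
    (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
      (c >= '0' && c <= '9') || tcharSymbols.indexOf(c) >= 0

  // Distinct characters that can appear neither in a token nor as the '/'
  // separating a product name from its version.
  def offendingChars(segment: String): String =
    segment.filterNot(c => isTchar(c) || c == '/').distinct

  // Suffix appended by the Facebook in-app browser in the report.
  val fbSuffix: String = "[FB_IAB/FB4A;FBAV/425.0.0.22.49;]"
}
```

Here `RfcTokenCheck.offendingChars(RfcTokenCheck.fbSuffix)` returns `"[;]"`, while a regular product such as `Chrome/115.0.5790.138` yields an empty string. Since the suffix is not wrapped in parentheses either, it does not qualify as a comment, which is why a rule of the form `products ~ EOI` cannot consume it.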

johanandren avatar Aug 14 '23 11:08 johanandren