logbook Extract client identifier from authorization token

Detailed Description

Logbook should enrich requests with the subject from the JWT token in the Authorization header, if present.

Context

Having the ID as part of the logged requests would make it much easier to identify clients, which in turn helps with:

identifying unauthorized access issues
usage analysis

Possible Implementation

introduce the concept of an attribute
attributes are simple key-value pairs (tbd type of value)
a request/response can have multiple attributes
attributes should be derived/created from requests/responses before any filtering (e.g. obfuscation)
built-in attribute extractor for sub from JWT token
- detect JWT tokens: Bearer prefix + 3x base64 data separated by dots
- remove Bearer prefix
- split at .
- base64 decode payload, i.e. the second element
- parse JSON
- read properties in order and return the first one that is present
  - https://identity.zalando.com/managed-id (Zalando employee tokens)
  - sub
- don't hard code priorities, but rather allow to configure a list of names, defaults to ["sub"]
extend JsonHttpLogFormatter to include attributes (tbd, top level? nested? name clashes?)
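
The extraction steps listed above could be sketched roughly as follows. This is only an illustration, not Logbook code: the class name JwtSubjectSketch and the method extract are hypothetical, no signature validation takes place, and the regex-based claim lookup is a simplification that stands in for a real JSON parser (it only finds plain string values and does not distinguish nested objects).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Logbook.
final class JwtSubjectSketch {

    // "Bearer " prefix followed by three base64url segments separated by dots.
    private static final Pattern BEARER_JWT = Pattern.compile(
            "^Bearer ([A-Za-z0-9_-]+)\\.([A-Za-z0-9_-]+)\\.([A-Za-z0-9_-]*)$");

    // Returns the value of the first configured claim that is present, if any.
    static Optional<String> extract(String authorization, List<String> claimNames) {
        if (authorization == null) {
            return Optional.empty();
        }
        Matcher token = BEARER_JWT.matcher(authorization);
        if (!token.matches()) {
            return Optional.empty();
        }
        String payload;
        try {
            // The payload is the second element; JWTs use unpadded base64url.
            payload = new String(
                    Base64.getUrlDecoder().decode(token.group(2)),
                    StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
        for (String name : claimNames) {
            // Simplified lookup of "name": "value"; a real implementation
            // would parse the JSON properly instead.
            Matcher claim = Pattern
                    .compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"\\\\]*)\"")
                    .matcher(payload);
            if (claim.find()) {
                return Optional.of(claim.group(1));
            }
        }
        return Optional.empty();
    }
}
```

Calling JwtSubjectSketch.extract(header, List.of("https://identity.zalando.com/managed-id", "sub")) would prefer the managed id for employee tokens and fall back to sub for service tokens, while List.of("sub") corresponds to the proposed default.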

Employee Token

{
  "sub": "3b66d47c-d886-4c63-a0b9-9ec3cad7e848",
  "https://identity.zalando.com/realm": "users",
  "https://identity.zalando.com/token": "Bearer",
  "https://identity.zalando.com/managed-id": "wschoenborn",
  "azp": "ztoken",
  "https://identity.zalando.com/bp": "810d1d00-4312-43e5-bd31-d8373fdd24c7",
  "auth_time": 1540188140,
  "iss": "https://identity.zalando.com",
  "exp": 1541411248,
  "iat": 1541407638
}

Service Token

{
  "sub": "stups_sales-order-service",
  "https://identity.zalando.com/realm": "services",
  "https://identity.zalando.com/token": "Bearer",
  "azp": "stups_sales-order-service_389e4e16-0695-45df-9afd-d9be0ffab456",
  "https://identity.zalando.com/bp": "810d1d00-4312-43e5-bd31-d8373fdd24c7",
  "iss": "https://identity.zalando.com",
  "exp": 1541411315,
  "iat": 1541407705,
  "https://identity.zalando.com/privileges": [
    "com.zalando::loyalty_point_account.read_all"
  ]
}

Links

https://jwt.io/
https://tools.ietf.org/html/rfc7519

Your Environment

Version used: 1.11.1

Nov 05 '18 08:11 whiskeysierra

This should probably be configurable and disabled by default, because the subject often contains the user's email address, which you do not want to log for data protection reasons.

Nov 05 '18 10:11 jhorstmann

Good point!

Nov 05 '18 10:11 whiskeysierra

See also #373

Nov 06 '18 11:11 whiskeysierra

An alternative could be to integrate with Spring Security and log SecurityContextHolder.getContext().getAuthentication().getName() to keep it auth-method agnostic.

Mar 04 '19 10:03 AlexanderYastrebov

@AlexanderYastrebov That requires a spring dependency.

Mar 04 '19 10:03 whiskeysierra

We're extracting data from the JWT to the MDC already, but after validation of the token; would not want to log any of the content before that. So some kind of hook for validation would be desirable.

Mar 06 '19 16:03 skjolber

Which JWT parsers are you considering? See https://github.com/skjolber/java-jwt-benchmark for a few.

Mar 06 '19 16:03 skjolber

For Spring, my experience is that little information about authorization is available at request/response logging time, since authorization depends on the implementation, which is called further down the chain. For example, within the same app some REST methods do not require authorization while others do, and this is enforced by @PreAuthorize and/or even Open Policy Agent rules within the RestControllers.

It is perhaps better to add @ControllerAdvice for AccessDeniedException and friends with extra logging (including dumping contents of JWT) if there is a violation.

Mar 06 '19 16:03 skjolber

but after validation of the token; would not want to log any of the content before that

Why not?

Which JWT parsers are you considering?

Tbh, my idea was to just do it by hand using the Java standard library (split + base64) and Jackson (json).

Mar 09 '19 20:03 whiskeysierra

Well, logging details about the JWT is in the same boat as logging request bodies before the authentication/authorization check. But I guess as long as there are no misunderstandings, it is fair to log those details. We're logging JWT details in the MDC; my first impression was that there was potential for mixing up verified and unverified credentials, but technically those should live in separate parts of the log statements.

Mar 11 '19 12:03 skjolber

I guess 'by hand' parsing is fair if there is no validation involved.

Mar 11 '19 12:03 skjolber

I guess 'by hand' parsing is fair if there is no validation involved.

I believe it's even beneficial to log client identifiers, especially for unauthorized requests, since that may give you an indicator of who to talk to (assuming no shady intentions).

Mar 11 '19 12:03 whiskeysierra

So what does the desired output from logging look like? I'm not so sure about these attributes, would it not be more simple to just transform the header presentation, i.e. in the HttpLogFormatter?

Nov 22 '19 16:11 skjolber

I'm not so sure about these attributes, would it not be more simple to just transform the header presentation, i.e. in the HttpLogFormatter?

We obfuscate the Authorization header, so in the formatter there wouldn't be any way to do that.

So what does the desired output from logging look like?

{
  "origin": "remote",
  "type": "request",
  "correlation": "2d66e4bc-9a0d-11e5-a84c-1f39510f0d6b",
  "protocol": "HTTP/1.1",
  "sender": "127.0.0.1",
  "method": "GET",
  "path": "http://example.org/test",
  "headers": {
    "Accept": ["application/json"],
    "Content-Type": ["text/plain"]
  },
  "attributes": {
    "subject": "[email protected]"
  },
  "body": "Hello world!"
}

Nov 22 '19 16:11 whiskeysierra

Looking at some of the headers we're using, a lot of them actually contain structured data, like

X-Shopify-Shop-Api-Call-Limit: 1/80
Strict-Transport-Security: max-age=7889238
Set-Cookie: BIGipServerpool_posten_api.x.com_7460=1622359.9345.0300; path=/; Httponly; Secure

i.e.

{
  "origin": "remote",
  "type": "request",
  "correlation": "2d66e4bc-9a0d-11e5-a84c-1f39510f0d6b",
  "protocol": "HTTP/1.1",
  "sender": "127.0.0.1",
  "method": "GET",
  "path": "http://example.org/test",
  "headers": {
    "Accept": ["application/json"],
    "Content-Type": ["text/plain"],
    "X-Shopify-Shop-Api-Call-Limit": {
       "value": 1,
       "limit": 80
    }
  },
  "attributes": {
    "subject": "[email protected]"
  },
  "body": "Hello world!"
}

Would it be possible to 'capture' some of that, possibly also transforming authorization to

{
  "origin": "remote",
  "type": "request",
  "correlation": "2d66e4bc-9a0d-11e5-a84c-1f39510f0d6b",
  "protocol": "HTTP/1.1",
  "sender": "127.0.0.1",
  "method": "GET",
  "path": "http://example.org/test",
  "headers": {
    "Accept": ["application/json"],
    "Content-Type": ["text/plain"],
    "Authorization": {
       "sub": "stups_sales-order-service", 
       "iss": "https://identity.zalando.com"
    }
  },
  "body": "Hello world!"
}

if so desired?
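
To make the idea of structured header output concrete, here is a minimal sketch of how a value such as X-Shopify-Shop-Api-Call-Limit: 1/80 might be turned into the value/limit pair shown above. The class CallLimitHeaderSketch and its parse method are hypothetical names used only for illustration; nothing like this is assumed to exist in Logbook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Logbook.
final class CallLimitHeaderSketch {

    // Splits a "used/limit" header value into named integer fields.
    static Map<String, Integer> parse(String headerValue) {
        String[] parts = headerValue.trim().split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                    "Expected <value>/<limit> but got: " + headerValue);
        }
        // LinkedHashMap keeps the field order stable for formatted output.
        Map<String, Integer> result = new LinkedHashMap<>();
        result.put("value", Integer.parseInt(parts[0].trim()));
        result.put("limit", Integer.parseInt(parts[1].trim()));
        return result;
    }
}
```

A formatter could then emit the returned map in place of the raw header string, assuming it has a way to register such per-header transformations.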

Nov 22 '19 17:11 skjolber

I guess transforming the Authorization header, rather than filtering it, would not reveal confidential information. Also, was it ever considered to just remove (filter) the token signature instead of the whole value? At least that would prevent someone taking a token from the logs.
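
As a rough illustration of filtering only the signature, something along these lines would do it. The class TokenSignatureFilterSketch, the method stripSignature, and the "XXX" placeholder are assumptions made for this sketch; header and payload (including the subject) stay readable, so this would only make sense as an opt-in.

```java
// Hypothetical helper, not part of Logbook.
final class TokenSignatureFilterSketch {

    private static final String PREFIX = "Bearer ";

    // Replaces the third (signature) segment of a Bearer JWT with a
    // placeholder; anything that does not look like a JWT is returned as-is.
    static String stripSignature(String authorization) {
        if (authorization == null || !authorization.startsWith(PREFIX)) {
            return authorization;
        }
        String token = authorization.substring(PREFIX.length());
        // Limit -1 keeps trailing empty segments, e.g. for unsigned tokens.
        String[] parts = token.split("\\.", -1);
        if (parts.length != 3) {
            return authorization;
        }
        return PREFIX + parts[0] + "." + parts[1] + ".XXX";
    }
}
```

Without the signature the logged value can no longer be replayed as a credential, but the claims remain visible.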

Dec 03 '19 09:12 skjolber

The subject is already confidential, see https://github.com/zalando/logbook/issues/381#issuecomment-435833779

Dec 03 '19 09:12 whiskeysierra

But that is an (application-specific) misuse of the subject. Logging 'who did what' becomes impossible without the subject?

Dec 03 '19 10:12 skjolber

Also, was it ever considered to just remove (filter) the token signature instead of the whole value?

That sounded like a proposal to change the current behavior of obfuscating the Authorization header completely. It would, again by default, expose subjects which is not ideal.

Logging 'who did what' becomes impossible without the subject?

That's totally desired, but it should be opt-in, so users need to make a conscious decision whether to use it or not. I want to be secure by default.

Might be that I misinterpreted your second to last comment.

Dec 03 '19 13:12 whiskeysierra

I agree that an opt-in which does not change the current default behaviour is desired.

Dec 03 '19 13:12 skjolber

So for an opt-in solution, what do you think about a structured header output / transformation approach (like my comment with JSON example above)?

Dec 03 '19 13:12 skjolber

So for an opt-in solution, what do you think about a structured header output / transformation approach (like my comment with JSON example above)?

I believe that's an orthogonal concern and would deserve its own issue/discussion.

Feb 17 '20 21:02 whiskeysierra

Hi everybody,

I am looking for this feature. Is there a way to log the subject only?

Jan 04 '22 10:01 qvdk

Addressed by https://github.com/zalando/logbook/pull/1589.

Sep 20 '23 15:09 msdousti
