
S3 WebID auth failure error message improvement

Open gricey432 opened this issue 2 years ago • 4 comments

If the STS AssumeRoleWithWebIdentity call returns an error response, the XML decoder swallows most of the error message, which makes permission issues difficult to debug.

https://github.com/pgbackrest/pgbackrest/blob/dffc933384e4327e2fd52868dd19668ed9d3ed91/src/storage/s3/storage.c#L418

An error response from STS here results in a log like:

[ERROR: [029]: unable to load info file '/mydb/backup/db/backup.info' or '/mydb/backup/db/backup.info.copy':
FormatError: unable to find child 'AssumeRoleWithWebIdentityResult':0 in node 'ErrorResponse'
FormatError: unable to find child 'AssumeRoleWithWebIdentityResult':0 in node 'ErrorResponse'
HINT: backup.info cannot be opened and is required to perform a backup.
HINT: has a stanza-create been performed?

Could this be improved to log the AWS response in the case of an error?

gricey432, Feb 14 '23 01:02

Yeah, that's not great. However, I am unable to find any information about the error format here: https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html. Any ideas where that would be?

This is the typical response (https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingRESTError.html) but the token does not match.

dwsteele, Feb 15 '23 02:02

The specific error I was debugging looked like this when I sent the same request directly. It was a 400 Bad Request.

<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <Error>
    <Type>Sender</Type>
    <Code>InvalidIdentityToken</Code>
    <Message>OpenIDConnect provider's HTTPS certificate doesn't match configured thumbprint</Message>
  </Error>
  <RequestId>2d8fc0e3-ac00-4bbe-8bff-bf57499e5de2</RequestId>
</ErrorResponse>

I can find similar response structures in the docs for other services, just not for STS, which is not ideal on AWS's part.

E.g. https://docs.aws.amazon.com/Route53/latest/APIReference/requests-rest-responses.html#requests-rest-error-responses
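
As a rough sketch, a response shaped like the sample in this comment could be summarized as follows. The namespace and element names are taken from that sample only, since no STS documentation for the format was found; the function name `describe_sts_error` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the ErrorResponse sample in this comment.
STS_NS = {"sts": "https://sts.amazonaws.com/doc/2011-06-15/"}


def describe_sts_error(body):
    """Return a one-line summary of an STS ErrorResponse, or None for any other document."""
    root = ET.fromstring(body)
    if root.tag != "{%s}ErrorResponse" % STS_NS["sts"]:
        return None
    code = root.findtext("sts:Error/sts:Code", default="unknown", namespaces=STS_NS)
    message = root.findtext("sts:Error/sts:Message", default="", namespaces=STS_NS)
    request_id = root.findtext("sts:RequestId", default="", namespaces=STS_NS)
    return f"STS error {code}: {message} (request id {request_id})"
```

Returning `None` for non-error documents leaves the normal result parsing untouched, so a check like this would only need to run before descending into `AssumeRoleWithWebIdentityResult`.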

gricey432, Feb 16 '23 01:02

We are getting the same error:

ERROR: [103]: unable to find a valid repository:
       repo1: [FormatError] unable to load info file '/pgbackrest/fake-sergas-dev/archive/db/archive.info' or '/pgbackrest/fake-sergas-dev/archive/db/archive.info.copy':
       FormatError: unable to find child 'AssumeRoleWithWebIdentityResult':0 in node 'ErrorResponse'
       FormatError: unable to find child 'AssumeRoleWithWebIdentityResult':0 in node 'ErrorResponse'
       HINT: archive.info cannot be opened but is required to push/get WAL segments.
       HINT: is archive_command configured correctly in postgresql.conf?
       HINT: has a stanza-create been performed?
       HINT: use --no-archive-check to disable archive checks during backup if you have an alternate archiving scheme.

The bucket exists, but I can't tell what the error is. Any ideas?

Thanks

jmvizcainoio, May 15 '24 13:05

The bucket exists, but I can't tell what the error is. Any ideas?

You can try sending the request directly and examining the error message.
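
One way to do that, sketched in Python with only the standard library. It assumes the global STS endpoint, a plain GET with query parameters, and that the role ARN and token file are exposed through the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables; the function names and the `debug-session` session name are made up for the example.

```python
import os
import urllib.error
import urllib.parse
import urllib.request

STS_ENDPOINT = "https://sts.amazonaws.com/"  # assumption: global endpoint, not a regional one


def build_sts_url(role_arn, token, session_name="debug-session"):
    """Build the AssumeRoleWithWebIdentity query URL (session name is arbitrary)."""
    query = urllib.parse.urlencode({
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
    })
    return f"{STS_ENDPOINT}?{query}"


def url_from_environment():
    """Read the role ARN and token from the environment variables assumed above."""
    role_arn = os.environ["AWS_ROLE_ARN"]
    with open(os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]) as token_file:
        token = token_file.read().strip()
    return build_sts_url(role_arn, token)


def fetch_sts_response(url):
    """Return (status, body); HTTP errors are returned, not raised, so the XML can be read."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")
```

Calling `fetch_sts_response(url_from_environment())` from the host or container where pgBackRest runs should print the `ErrorResponse` body on failure. On success the body contains temporary credentials, so it should not be pasted into a public issue.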

dwsteele, May 15 '24 23:05