amazonka
amazonka copied to clipboard
A repro for issue #796
This adds a reproducible test case for issue #796 "PutObject request failing on MinIO because of malformed XML" by adding an integration test checking amazonka-s3
against MinIO. To execute the test just run the following:
$ nix-build tests/s3.nix
...
s3: must succeed:
echo 'Hello World!' > ./some-file
export AWS_ACCESS_KEY_ID="BKIKJAA5BMMU2RHO6IBB"
export AWS_SECRET_ACCESS_KEY="V7f1CwQqAcwo80UEIJEjc5gVQUSSx5ohQ9GSrr12"
amazonka-s3-test-app ./some-file http://s3:9000 my-bucket some-file
s3 # > > > > > [ 11.524753] minio[818]: {
"deploymentid": "0de4a2b3-eaf1-46a0-8ec7-2e706e68d3b4",
"level": "ERROR",
"errKind": "MINIO",
"time": "2022-07-09T18:07:28.335404358Z",
"api": {
"name": "SYSTEM",
"args": {}
},
"error": {
"message": "EOF (*errors.errorString)",
"source": [
"cmd/handler-utils.go:55:cmd.parseLocationConstraint()",
"cmd/auth-handler.go:344:cmd.checkRequestAuthTypeCredential()",
"cmd/auth-handler.go:296:cmd.checkRequestAuthType()",
"cmd/bucket-handlers.go:719:cmd.objectAPIHandlers.PutBucketHandler()",
"net/http/server.go:2047:http.HandlerFunc.ServeHTTP()"
]
}
}
s3 # amazonka-s3-test-app: ServiceError
( ServiceError'
{ _serviceErrorAbbrev = Abbrev {fromAbbrev = "S3"},
_serviceErrorStatus = Status {statusCode = 400, statusMessage = "Bad Request"},
_serviceErrorHeaders =
[ ("Accept-Ranges", "bytes"),
("Content-Length", "371"),
("Content-Security-Policy", "block-all-mixed-content"),
("Content-Type", "application/xml"),
("Server", "MinIO"),
("Strict-Transport-Security", "max-age=31536000; includeSubDomains"),
("Vary", "Origin"),
("Vary", "Accept-Encoding"),
("X-Amz-Bucket-Region", "us-east-1"),
("X-Amz-Request-Id", "17003B77BD220CF2"),
("X-Content-Type-Options", "nosniff"),
("X-Xss-Protection", "1; mode=block"),
("Date", "Sat, 09 Jul 2022 18:07:28 GMT")
],
_serviceErrorCode = ErrorCode "MalformedXML",
_serviceErrorMessage =
Just
( ErrorMessage
{ fromErrorMessage = "The XML you provided was not well-formed or did not validate against our published schema."
}
),
_serviceErrorRequestId =
Just
( RequestId
{ fromRequestId = "17003B77BD220CF2"
}
)
}
)
That ^ commit extends the test with capturing the network packets using tcpdump
to inspect the request amazonka generates. This is the dump I get:
10:20:48.571111 IP6 ::1.59144 > ::1.9000: Flags [S], seq 4201602683, win 65476, options [mss 65476,sackOK,TS val 2587980450 ecr 0,nop,wscale 7], length 0
`.b?.(.@..................................#(.o^{.........0.........
.Ar.........
10:20:48.571164 IP6 ::1.9000 > ::1.59144: Flags [S.], seq 1752750534, ack 4201602684, win 65464, options [mss 65476,sackOK,TS val 2587980450 ecr 2587980450,nop,wscale 7], length 0
`....(.@................................#(..hx...o^|.....0.........
.Ar..Ar.....
10:20:48.571226 IP6 ::1.59144 > ::1.9000: Flags [.], ack 1, win 512, options [nop,nop,TS val 2587980450 ecr 2587980450], length 0
`.b?. .@..................................#(.o^|hx.......(.....
.Ar..Ar.
10:20:48.572635 IP6 ::1.59144 > ::1.9000: Flags [P.], seq 1:461, ack 1, win 512, options [nop,nop,TS val 2587980451 ecr 2587980450], length 460
`.b?...@..................................#(.o^|hx.............
.Ar..Ar.PUT /some-file HTTP/1.1
Accept-Encoding: gzip
Content-Length: 13
Host: s3:9000
X-Amz-Date: 20220710T102048Z
X-Amz-Content-SHA256: 03ba204e50d126e4674c005e04d82e84c21366780af1f43bd54a37816b6ab340
Expect: 100-continue
Authorization: AWS4-HMAC-SHA256 Credential=BKIKJAA5BMMU2RHO6IBB/20220710/us-east-1/s3/aws4_request, SignedHeaders=expect;host;x-amz-content-sha256;x-amz-date, Signature=a8a6fa5ae7a99054f3b00e694b4413c9bee04bbefdd574af84a8987fb7a4807e
10:20:48.572657 IP6 ::1.9000 > ::1.59144: Flags [.], ack 461, win 508, options [nop,nop,TS val 2587980451 ecr 2587980451], length 0
`.... .@................................#(..hx...o`H.....(.....
.Ar..Ar.
10:20:48.573627 IP6 ::1.9000 > ::1.59144: Flags [P.], seq 1:26, ack 461, win 512, options [nop,nop,TS val 2587980452 ecr 2587980451], length 25
`....9.@................................#(..hx...o`H.....A.....
.Ar..Ar.HTTP/1.1 100 Continue
10:20:48.573656 IP6 ::1.59144 > ::1.9000: Flags [.], ack 26, win 512, options [nop,nop,TS val 2587980452 ecr 2587980452], length 0
`.b?. .@..................................#(.o`Hhx.......(.....
.Ar..Ar.
10:20:48.573789 IP6 ::1.59144 > ::1.9000: Flags [P.], seq 461:474, ack 26, win 512, options [nop,nop,TS val 2587980453 ecr 2587980452], length 13
`.b?.-.@..................................#(.o`Hhx.......5.....
.Ar..Ar.Hello World!
10:20:48.573807 IP6 ::1.9000 > ::1.59144: Flags [.], ack 474, win 512, options [nop,nop,TS val 2587980453 ecr 2587980453], length 0
`.... .@................................#(..hx...o`U.....(.....
.Ar..Ar.
10:20:48.581357 IP6 ::1.9000 > ::1.59144: Flags [P.], seq 26:836, ack 474, win 512, options [nop,nop,TS val 2587980460 ecr 2587980453], length 810
`....J.@................................#(..hx...o`U.....R.....
.Ar..Ar.HTTP/1.1 400 Bad Request
Accept-Ranges: bytes
Content-Length: 371
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Bucket-Region: us-east-1
X-Amz-Request-Id: 170070951ABB6B77
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
Date: Sun, 10 Jul 2022 10:20:48 GMT
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>MalformedXML</Code><Message>The XML you provided was not well-formed or did not validate against our published schema.</Message><BucketName>some-file</BucketName><Resource>/some-file</Resource><Region>us-east-1</Region><RequestId>170070951ABB6B77</RequestId><HostId>b31e5451-e212-4657-9db6-a3d4f5f47922</HostId></Error>
10:20:48.581390 IP6 ::1.59144 > ::1.9000: Flags [.], ack 836, win 506, options [nop,nop,TS val 2587980460 ecr 2587980460], length 0
`.b?. .@..................................#(.o`Uhx.
.....(.....
.Ar..Ar.
10:20:48.605507 IP6 ::1.59144 > ::1.9000: Flags [F.], seq 474, ack 836, win 512, options [nop,nop,TS val 2587980484 ecr 2587980460], length 0
`.b?. .@..................................#(.o`Uhx.
.....(.....
.Ar..Ar.
10:20:48.605588 IP6 ::1.9000 > ::1.59144: Flags [F.], seq 836, ack 475, win 512, options [nop,nop,TS val 2587980484 ecr 2587980484], length 0
`.... .@................................#(..hx.
.o`V.....(.....
.Ar..Ar.
10:20:48.605613 IP6 ::1.59144 > ::1.9000: Flags [.], ack 837, win 512, options [nop,nop,TS val 2587980484 ecr 2587980484], length 0
`.b?. .@..................................#(.o`Vhx.......(.....
.Ar..Ar.
So you can see a packet containing the HTTP request PUT /some-file
and later a packet containing the contetns Hello World!
.
You can also see the MinIO response The XML you provided was not well-formed
. Indeed we didn't provide any XML. We just PUT the contents Hello World!
without wrapping it in any XML.
What's at fault here? Amazonka? MinIO? Or am I missing something?
I'm suspicious of the fact that, even though the name of the bucket is specified in the PutObject
request:
PutObject'
{ contentLength = Nothing
, objectLockMode = Nothing
, expires = Nothing
, grantReadACP = Nothing
, sSECustomerAlgorithm = Nothing
, sSECustomerKey = Nothing
, requestPayer = Nothing
, grantWriteACP = Nothing
, bucketKeyEnabled = Nothing
, websiteRedirectLocation = Nothing
, grantRead = Nothing
, storageClass = Nothing
, sSECustomerKeyMD5 = Nothing
, sSEKMSKeyId = Nothing
, grantFullControl = Nothing
, contentEncoding = Nothing
, tagging = Nothing
, contentMD5 = Nothing
, objectLockRetainUntilDate = Nothing
, metadata = fromList []
, sSEKMSEncryptionContext = Nothing
, cacheControl = Nothing
, contentLanguage = Nothing
, objectLockLegalHoldStatus = Nothing
, acl = Nothing
, contentDisposition = Nothing
, expectedBucketOwner = Nothing
, serverSideEncryption = Nothing
, contentType = Nothing
, bucket = BucketName "my-bucket"
, key = ObjectKey "some-file"
, body = Hashed HashedStream
{ sha256 = 03 ba204e50d126e4674c005e04d82e84c21366780af1f43bd54a37816b6ab340
, length = 13
}
}
that bucket name is not included in the HTTP request to MinIO:
PUT /some-file HTTP/1.1
Accept-Encoding: gzip
Content-Length: 13
Host: s3:9000
X-Amz-Date: 20220710T165128Z
X-Amz-Content-SHA256: 03ba204e50d126e4674c005e04d82e84c21366780af1f43bd54a37816b6ab340
Expect: 100-continue
Authorization: AWS4-HMAC-SHA256 Credential=BKIKJAA5BMMU2RHO6IBB/20220710/us-east-1/s3/aws4_request, SignedHeaders=expect;host;x-amz-content-sha256;x-amz-date, Signature=67a8e6a8f495e6bcad4fc8ea7c65180e113f8a0a4c5e44f2565ff365806c1626
Maybe that explains why MinIO is not expecting to receive an object at URL /some-file
but is expecting XML instead because that URL is not for uploading objects.
If I look at examples of how to put an object to MinIO I see the request path needs to begin with the bucket name. So in our example we need the request to be:
PUT /my-bucket/some-file
I'll dive into the amazonka source code to see why the bucket name is omitted.
I found the root cause: the default request of a PutObject
contains the right path with bucketName/objectKey
. However it's passed through the s3vhost
function which strips of the bucket name from the path and moves it to the host.
The right solution is probably to only apply the s3vhost
transformation if we're dealing with an AWS endpoint.
@endgame, @akshaymankar https://github.com/basvandijk/amazonka/compare/put-to-minio-malformed-xml...put-to-minio-malformed-xml-fix-attempt-1 tries to fix #760 by adding the field _serviceRewriteS3VHost :: Bool
to the Service
record which is False
for all services except for S3.
A user can then override this field in the environment by setting it to False
:
defEnv <- Amazonka.newEnv Amazonka.Auth.fromKeysEnv
let env =
flip Amazonka.override defEnv $
(Amazonka.serviceEndpoint .~ endpoint)
. (Amazonka.serviceRewriteS3VHost .~ False)
Then s3vhost
will not rewrite the passed request rq
if not $ rq ^. requestService . serviceRewriteS3VHost
.
This doesn't work unfortunately since the AWSRequest
instance for PutObject
overrides the Service
record in the passed Request
with defaultService
, thereby setting the _serviceRewriteS3VHost
back to True
.
Possible solution
To fix this I think the request
method implementations should stop setting a defaultService
in the Request
like we do here. Instead we need to have a type class:
class AWSService a where
service :: Proxy a -> Service
and have instances for all request types like for example:
instance AWSService PutObject where
service _proxy = Amazonka.S3.Types.defaultService
then the request
method needs to accept a Service
argument:
class AWSRequest a where
...
request :: Service -> a -> Request a
The instance for PutObject
for example will then look like:
instance Core.AWSRequest PutObject where
type AWSResponse PutObject = PutObjectResponse
- request =
+ request srv =
Request.expectHeader
Prelude.. Request.s3vhost
- Prelude.. Request.putBody defaultService
+ Prelude.. Request.putBody srv
configureRequest
will then override the service
with the user supplied overrides
and apply the request
method to it:
- configureRequest :: AWSRequest a => Env' withAuth -> a -> Request a
+ configureRequest :: forall a. (AWSRequest a, AWSService a) => Env' withAuth -> a -> Request a
configureRequest env x =
let overrides = envOverride env
+ srv = appEndo (getDual overrides) $ service (Proxy :: Proxy a)
- in request x & requestService %~ appEndo (getDual overrides)
+ in request srv x
We could then also consider dropping the _requestService
field from Request
but I'm not sure if that's used for other things.
This seems overwrought; can you not just set the rewriteS3VHost
in an override with the rest of the amazonka Env
? The requestService
appears to be modified after the default request is generated by the typeclass instance:
https://github.com/brendanhay/amazonka/blob/26466244b5a23b29ba8894ec89d922e6a6835a05/lib/amazonka/src/Amazonka/HTTP.hs#L157-L160
... can you not just set the rewriteS3VHost in an override with the rest of the amazonka Env? ...
I tried that:
https://github.com/basvandijk/amazonka/compare/put-to-minio-malformed-xml...put-to-minio-malformed-xml-fix-attempt-1 tries to fix https://github.com/brendanhay/amazonka/issues/760 by adding the field _serviceRewriteS3VHost :: Bool to the Service record which is False for all services except for S3.
But:
This doesn't work unfortunately since the AWSRequest instance for PutObject overrides the Service record in the passed Request with defaultService, thereby setting the _serviceRewriteS3VHost back to True.
Hence my suggestion for that "possible solution".
FYI @basvandijk your fix is now implemented over on our branch here: https://github.com/SupercedeTech/amazonka/commits/temp-vhost-fix
I've tested it for our use case (using Moto as an S3 mock) and it works as expected.
One particular issue of note was this line, which I believe is the cause for #760, in part. Because the service overrides are originally applied after the request is built, the call to setEndpoint
resets the service host after the bucket name is prepended in s3vhost
- so the URL which is presigned ends up being incorrect, missing the bucket in both the path and hostname.
The fix involving AWSService
(or, at minimum, passing the Service
in to request
) seems necessary to resolve this, since Service
overrides alone don't seem to accurately manage it.
We're also experiencing the issue with s3vhost
rewriting, as we use localstack instead of real AWS to run our test suites.
What are the next steps regarding wrt. the patch proposed by @ivb-supercede?