AWS S3: Handle Illegal Headers in Copy Part
Currently this is a minimal change that demonstrates the issue so that the strategy can be discussed before it is addressed.
fixes #1245
@mdedetrich / @pjfanning - I made this draft PR to demonstrate the problem and to discuss the path forward for fixing it. I'm going to see whether I can create an integration test tomorrow.
I'm thinking that we maintain an allow list of all valid headers and filter out any header that starts with `x-amz-` but does not belong to that list?
I say starts with `x-amz-` because there might be use cases with S3-compatible stores that use custom headers. This is pure speculation, however.
The main con here is that if AWS introduces new headers in the future, the list would require an update. Alternatively we may wish to simply remove `x-amz-storage-class` and create a block list, which would still need to be maintained.
Open to alternative suggestions if there are any.
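Roughly, the filtering I have in mind would look something like this (just a sketch; `allowedAmzHeaders` and the header names in it are placeholders, not a final list):

```scala
import org.apache.pekko.http.scaladsl.model.HttpHeader

// Placeholder allow list; the real list would be assembled from the S3 docs
// (and possibly from configuration).
val allowedAmzHeaders: Set[String] = Set(
  "x-amz-copy-source",
  "x-amz-copy-source-range",
  "x-amz-server-side-encryption")

// Keep all non-amz headers; drop any x-amz-* header that is not on the allow list.
def filterIllegalHeaders(headers: Seq[HttpHeader]): Seq[HttpHeader] =
  headers.filter { header =>
    val name = header.lowercaseName
    !name.startsWith("x-amz-") || allowedAmzHeaders.contains(name)
  }
```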
I would say, if we do have an allowlist, it should be provided as configuration in reference.conf so people can add more values without needing us to do an additional release. The initial list can be hardcoded, but there can be an additional `pekko.connectors.s3.additional-upload-whitelist-headers` configuration where people can add extra headers in case S3 adds something new.
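Something like this is what I have in mind (a sketch only; the key name is just the proposal above and the defaults shown are made up):

```scala
import com.typesafe.config.{ Config, ConfigFactory }
import scala.jdk.CollectionConverters._

// Hard-coded defaults merged with whatever the user adds in their own config.
// Both the key name and the default entries here are illustrative only.
val defaultAllowedHeaders: Set[String] =
  Set("x-amz-copy-source", "x-amz-copy-source-range")

def allowedHeaders(config: Config): Set[String] = {
  val key = "pekko.connectors.s3.additional-upload-whitelist-headers"
  val extra =
    if (config.hasPath(key)) config.getStringList(key).asScala.toSet
    else Set.empty[String]
  defaultAllowedHeaders ++ extra
}

// e.g. allowedHeaders(ConfigFactory.load())
```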
@pjfanning @raboof thoughts?
Sorry, I don't think I understand S3 well enough to have a meaningful opinion here.
I have no strong opinion about being proactive about headers; it seems to introduce its own risks. If we do introduce a new config, avoid any name that uses colour-based naming, e.g. white or black.
@mdedetrich / @pjfanning - Do I need to worry about source/binary compat? Not sure I can add the configuration handling without breaking compatibility on some level
Additionally, I assume I should be handling the headers the same way for all S3Request types that take headers, and not just CopyPart?
@Luka-J9 it's ok to break binary compatibility on the main branch as it is targeted at a 2.0.0 release. Just add a mima-filters file; you'll find examples in the src code already (.excludes file extension).
After that we can decide whether it is safe to backport to a 1.3.0 release. We would accept some level of change for that, but it would probably need careful thought as to what is acceptable.
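For reference, an entry in one of those .excludes files is a single ProblemFilters line; the problem type and fully qualified member below are placeholders, the real ones come from the mimaReportBinaryIssues output:

```scala
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.pekko.stream.connectors.s3.SomeSettings.apply")
```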
So I am not sure how this change would be binary or even source breaking; it should just be internal behaviour without needing to change any function/method signatures. We already have a MiMa check that will verify if there is a binary-breaking change.
Regarding adding the configuration setting, this is fine as long as it's released in a version with a bumped minor version.
I've updated the PR with something more concrete in terms of implementation. I still need to add tests and docs, but wanted to get this out there for some early feedback about the strategy. If the approach is good I can add tests and move this out of draft for a more detailed and formal review
@mdedetrich / @pjfanning - turns out the only errors when running mimaReportBinaryIssues were with one of the case classes' creation methods, which only required a couple of compatibility functions to be added 🥳
If the general structure is okay, I'll write tests in the next day or two; I'd just like a rough 👍 before investing more time into this approach.
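For illustration, the compatibility functions are roughly of this shape (the HeaderSettings class and its fields are made up here, not the actual case class from the PR):

```scala
// Made-up settings class, just to show the idea: adding a field changes the
// generated apply, so an overload with the old arity is kept so that code
// compiled against the previous release still links.
final case class HeaderSettings(allowList: Set[String], failOnIllegalHeaders: Boolean)

object HeaderSettings {
  // Pre-existing signature retained for binary/source compatibility.
  def apply(allowList: Set[String]): HeaderSettings =
    HeaderSettings(allowList, failOnIllegalHeaders = false)
}
```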
Hi @mdedetrich - I updated the PR with some tests and documentation. Apologies for the slow follow-up, I had a bunch of things come up and this ended up on the back burner. I pulled this out of draft and was able to run the regression tests you pointed out here with no problems.
Also confirmed that MiMa compatibility is maintained.
Let me know if you have any additional requirements, questions or tests you'd like me to add.
Thanks, I will check it out in the next few days; in the meantime I have activated the CI tests.
@Luka-J9 I just ran the CI and some S3 tests failed, would you be able to look into it?
Looks like one of my tests is not compatible with Scala 3 - it uses reflection to get a list of the members of a sealed trait. I'll see if I can work around it.
There are open source libraries that target both Scala 2 and Scala 3 which can do this (e.g. https://github.com/zio/izumi-reflect), albeit this could be considered a bit excessive.
If you do decide to use this library, make sure to add the dependency only to test scope, as we don't need it as part of the main s3 module.
Opted to use exhaustivity checking - it should be good for another CI run now 🤞
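For context, the exhaustivity-checking approach is roughly the following (the S3Request cases here are stand-ins, not the real hierarchy from the connector):

```scala
// Stand-in hierarchy, only to illustrate the technique.
sealed trait S3Request
case object GetObject extends S3Request
case object PutObject extends S3Request
case object CopyPart extends S3Request

// Instead of reflectively listing the subtypes, match on every case: when a new
// S3Request subtype is added the compiler reports a non-exhaustive match
// (an error with fatal warnings enabled), so the test can't silently go stale.
def headerBehaviour(request: S3Request): String = request match {
  case GetObject => "filtered"
  case PutObject => "filtered"
  case CopyPart  => "filtered"
}
```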
Code is not formatted, format with scalafmt
Missed a newline when updating the reflection test 🤦 Sorted imports while I was at it
Awesome, thanks! I'll have a look at this on the weekend.
@mdedetrich - I believe I have addressed your comments, let me know if you need any additional changes!