imageproxy icon indicating copy to clipboard operation
imageproxy copied to clipboard

Issue 250 encoded url param

Open mattpr opened this issue 3 years ago • 5 comments

Supporting url-encoded remote URLs as parameter

You can find more monologue about motivation (real world cases) and the implementation approach for this change in #250.

Summary

  • Options parameter is no longer optional. You can pass //http... or /x/http... or /0x0/http... but not /http....
  • URL encoding of remote urls is now supported (but still optional).
  • Added more test cases.

Before this change, the code would attempt to URL-parse the first param and if that failed then assumed there were options present. That meant in the common case (options and url present) both the options and url would be attempted to be URL parsed in the logic. Now we expect parseable options to always be the first parameter.

The fact that options were optional (before this change) was not documented in the README, only in the code...so hoping this isn't a breaking change for most users. The fix is to add an empty/no-op option before the remote URL parameter (//http... or /x/http... or /0x0/http...).

mattpr avatar Jun 06 '21 10:06 mattpr

Although I haven't worked on this recently, while opening the PR today, it occurred to me that there is an edge case that isn't handled and if this PR is accepted should be fixed.

In particular if the user is providing paths relative to a baseURL.
In these cases URL decoding will not run because we look for the presence of an http prefix. e.g. /x//path/to/remote/image.jpg?evId=123456 with baseUrl https://www.example.com/imagedir.

We shouldn't "always decode" because encoded parameters might be destined for some further "upstream" purpose and decoding these parameters will break requests (same reason we stop decoding as soon as http:// or https:// is discovered as prefix on absolute url).

However this is easy to resolve as we can differentiate between the following:

  1. /x//path/to/remote/image.jpg%3FevId%3D123456 - embedded parameters encoded
  2. %2Fx%2F%2Fpath%2Fto%2Fremote%2Fimage.jpg%3FevId%3D123456 - entire path encoded (leading slashes too)
  • In case 1 we should not decode at all and just use the provided parameter with the configured baseURL.
  • In case 2, the presence of a leading % indicates we should decode until we find a / (or give up and error)...and then append to baseURL. Note we should not key off of %2F prefix as the path may be encoded more than once for various reasons. / -> %2F -> %252F -> %25252F

I would need to check whether a leading slash is currently required on relative URLs or not. If it is not, that would be a breaking change.

I can add this (and tests of course) and update the PR. But I won't bother making further changes unless I get an indication you are open to some version of this PR.

Thanks!

mattpr avatar Jun 06 '21 11:06 mattpr

@willnorris - anything I can do to help move this forward?

mattpr avatar Nov 12 '21 09:11 mattpr

Was thinking about this again today. Regarding the baseURL comment above...

In particular if the user is providing paths relative to a baseURL. In these cases URL decoding will not run because we look for the presence of an http prefix.

The docs example uses baseURL with trailing / and the relative url parameter without a leading slash. So the idea about supporting url-encoded relative URLs based on the presence of a leading slash doesn't work without potentially breaking current users by requiring leading slash on relative urls. This is also a requirement because of the way the go url resolver works (trailing slash on baseURL is required or the last path segment is truncated. I can't think of any reliable way to determine whether a relative url needs to be url decoded or not without the leading slash.

Even a leading slash doesn't help because you can imagine baseURL=https://www.example.com/app?imgPath= expecting parameters like %2Fanother%2Findirection%2Fimage.jpg or https%3A%2F%2Fimg.example.com%2Fpic.jpg which should not be decoded before appending to baseURL.

The solution is to never url-decode the url parameter when baseURL is set. We can't infer what the user wants us to do in the relative url case (whereas with absolute urls we know it should always start with http:// or https://).

I've made this change and updated the tests.

No more open issues from my side that should block this. @willnorris can you review the change when you have a chance? Okay if you don't want to incorporate this change, but would be happy for some feedback nonetheless.

mattpr avatar Mar 14 '22 15:03 mattpr

Codecov Report

Merging #290 (9a8c38e) into main (6022f6a) will decrease coverage by 0.78%. The diff coverage is 70.37%.

:exclamation: Current head 9a8c38e differs from pull request most recent head 872e872. Consider uploading reports for the commit 872e872 to get more accurate results

@@            Coverage Diff             @@
##             main     #290      +/-   ##
==========================================
- Coverage   87.20%   86.42%   -0.79%     
==========================================
  Files           6        6              
  Lines         508      523      +15     
==========================================
+ Hits          443      452       +9     
- Misses         35       38       +3     
- Partials       30       33       +3     
Impacted Files Coverage Δ
data.go 93.22% <70.37%> (-4.84%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 6022f6a...872e872. Read the comment docs.

codecov[bot] avatar Mar 14 '22 16:03 codecov[bot]

Hi @willnorris,

Any chance you can give me some feedback on what we need to do to get url-encoding supported?

I'm not a golang guy, so I won't take any critical feedback on this PR personally.

Would just be happy for some feedback...or better, to be able to stop using my fork for production deployments.

Thanks!

mattpr avatar Dec 12 '22 00:12 mattpr