imageproxy icon indicating copy to clipboard operation
imageproxy copied to clipboard

Allow URI encoding of the URLs in the path

Open beattyml1 opened this issue 8 months ago • 5 comments

Summary Allow URI encoding of the URLs in the path

Motivation: We use URL from all over the internet who's hygine, use of query params, etc we largely don't have control over. These image urls are then also often put into RSS Feeds and other standards based formats that we make available to other crawlers. We often don't have control over the validations these third parties put on feeds including the URLS inside which often leads to validation issues, encoding is often the only ways to get these formats to pass, if there are query parameters or special characters.

Testing and Cases Handled

  • Thorugh Unit Tests Added
    • Options with encode url: /1x2/, /, //
    • Encoded Scheme: with 1, 2, or 3 slashes
    • Query params either on the overall url or in the encoded url or without query params
    • Havinghttp elsewhere in the URL (in parts of code keying off of http might break if it's in multiple times)
    • Having an encoded string in the destination image url but the url itself not being encoded already had a test case but is explicitly handled by code and continues to pass that test case
  • Currently in our QA environment and has been manually tested with a variety of real world URLS
  • Passes ESLint

Open to make changes requested to fit either project style, go best practices, or your personal preferences.

Let me know if there's anything you need from to get this in.

beattyml1 avatar Apr 25 '25 21:04 beattyml1

Would base64 encoding the remote URL work for you? I really thought there was an issue or PR for that at one point, but all I'm finding is #431 where I said basically the same thing :) Maybe it was just an idea I had, but never implemented. I think that would ultimately be cleaner and cover a host of issues like character sets and such. URI encoding could handle it as well, but I think base64 would just be simpler and create cleaner URLs... completely opaque, but at least it's obvious that's the case.

willnorris avatar Apr 28 '25 16:04 willnorris

Take a look at the base64 branch and see if that would work for you

willnorris avatar Apr 28 '25 18:04 willnorris

I think it would solve the issues I described but for us one additional layer is that we've had some version of this in production for a while and have links using the URL encoded version that we want to continue working but only recently got unit test coverage of our solution up to a standard we felt comfortable back contributing. Totally understand if that isn't something you want to support but just likely means we'll be permanently forking.

The handling we added shouldn't ever conflict with un-encoded URLs given the other validations you have on what constitutes a valid URL if that's a concern.

beattyml1 avatar Apr 28 '25 20:04 beattyml1

I'm curious about combining the queries at the end... the URL encoded query and any non-encoded query that was on the original URL. Are you actually using that in practice? When I was working on the base64 implementation, I was taking the stance that if you've encoded the remote URL, then that is expected to be the entire URL (modulo a baseURL if that's in use). Is there a reason to support both encoded and unencoded query parameters in the same URL?

willnorris avatar Apr 30 '25 05:04 willnorris

I went ahead and merged the base64 encoding feature. With that in place, adding support for URI encoding becomes quite minimal, provided you don't need to combine the encoded and un-encoded query strings as noted above.

https://github.com/willnorris/imageproxy/pull/461 has a much simpler implementation built on top of some of the base64 changes, but with all of your original test cases. See if that works.

willnorris avatar Apr 30 '25 08:04 willnorris

I'm going to go ahead and merge #461 since it does pass all of the test cases you've got here, and is a much simpler implementation, based on the previous work to add base64 encoding support. I think merging that commit will end up closing this.

If you do end up needing the combined query string support, I'd definitely love to hear more about the actual use case for that, and we can try to figure something out.

willnorris avatar Jun 06 '25 03:06 willnorris