Add a secure token feature
Hi again,
Here is another feature I need for a CDN I am using:
https://github.com/batiste/imaginary/commit/5cf91b2da6a7c2d9e51c5954cff50be5b9303d4d
MD5 still seems to be acceptable for this low-stakes protection: https://crypto.stackexchange.com/questions/9336/is-hmac-md5-considered-secure-for-authenticating-encrypted-data
I can do a proper PR if you guys pre approve this new feature.
Wouldn't the key flag be enough for you?
Not really unless I misunderstood the key feature. I want to expose the service to the public directly, or just behind a CDN for example.
I tried to use this https://www.keycdn.com/support/secure-token/ but unfortunately the query parameters were not taken into account in the hashing algorithm.
By using this method I can expose the URLs directly to the public e.g.
mycdn.com/crop?width=400&height=399&url=blop.png&token=213123kl211231
So the user cannot change any of the parameters without knowing the secret.
That sounds like a very ad-hoc requirement. I want to keep imaginary as general-purpose and domain-agnostic as possible.
That being said, if more users have a similar requirement and there's an industry standard that can be easily supported, I will revisit the feature request in the future.
Imaginary was never intended to be accessible publicly by a browser or another untrusted client? It seemed to me that would be a quite common need. A malicious client could easily trigger expensive operations.
"Imaginary was never intended to be accessible publicly by a browser or another untrusted client?" Yes, but always by a partially trusted consumer. That's why it has built-in key-based authorization and a throttle mechanism, which are common approaches for API services.
For fully public systems, architects should implement a custom security layer on top of imaginary that fits their domain-specific needs. As you can imagine, that layer can be fairly simple or fairly complex depending on each particular scenario.
As an example, you can implement an API proxy/gateway that handles all the authorization and rate limiting based on the access policy you want (e.g. IP, API token...).
One of the design decisions of imaginary was keeping it simple. That's why it can fit more people's needs: it's not opinionated in any area other than its core responsibility, processing images.
Just as a reference this is the full implementation of this middleware (80 LoC):
https://github.com/batiste/imaginary/blob/master/middleware.go#L104-L186
I am about to put it in production soon. I will report the results.
Great. You're very welcome to use, fork and adapt imaginary for ad-hoc needs. This is a key benefit of the OSS.
@batiste said: Imaginary was never intended to be accessible publicly by a browser or another untrusted client? It seemed to me that would be a quite common need. A malicious client could easily trigger expensive operations.
I don't think this problem should be solved in imaginary. For instance, you could easily rate limit incoming requests with a small proxy like nginx. Also, you should set up imaginary at a hidden-from-public hostname that only whitelists your cdn traffic.
With that said, one feature that would help you against things like these would be limiting the attack surface by only allowing a few permutations of the resizes, for instance 5 sizes. That way, you control how traffic flows and how many variations of the traffic you will get.
@jbergstroem
rate limit incoming requests with a small proxy like nginx
I am unconvinced a simple proxy would work, because you wouldn't need many requests to bring a service down... And in my current use case a client (browser) is allowed to make hundreds of requests per second because we have many images on a page. Everything is behind a cache proxy already, so only the "new" requests go through.
But I will also need this "permutation limit" feature, because I am now faced with a second problem: a mobile app. It would be very insecure to put the secret inside a mobile app (you could easily extract it), so the secret token approach falls apart in this use case.
@batiste said: I am unconvinced a simple proxy would work
It entirely depends on how you choose to scale it. Given how fast vips (and this library) is, and the fact that it's entirely stateless, auto-scaling it behind something like nginx or most other HTTP proxies is pretty much a solved problem.
@batiste said: But I will also need to have this "permutation limit" feature because now I am faced with a second problem: mobile app
I think you're looking at this the wrong way, but yes, a default deny rule is really what you want. Allow and control the funnel from your CDN, including things like permutations. It might also be good to keep in mind that things like DPI come into play here. Explore the Vary support of your CDN and use it to the fullest when planning your caching strategy.
White-listing actions isn't uncommon for this type of service. Pre-heating, if that's possible in your case, is common practise as well.
You have (at least) the following options:
- Pre-heat a CDN (generate all combinations and don't let Imaginary listen on a public interface) and don't allow for dynamic generation
- Harden Imaginary with a reverse-proxy
You could write a thin client that acts as a proxy for Imaginary (I've written one myself: https://github.com/Dynom/proxima) and add the business rules you desire.
Perhaps for some inspiration: using k8s, I've solved it by having my proxy listen on the public interface and imaginary on private interfaces. Both are load-balanced, so it remains an HA setup.
You can use it for inspiration and simply white-list parameter values for width/height or whatever operations you support.
@Dynom pre-heating the CDN is quite difficult in my case: I have about 10 million images to work with, and they are changing daily.
The reverse proxy could help. If I whitelist specific operations and image widths and heights, then that would work nicely! I will probably end up doing something like that.
@batiste said: pre-heating the CDN is quite difficult in my case, I have about 10 millions images to work with, and they are changing daily.
Just prime them when they are changed? I assume if you are indeed changing images you need to prune them anyway.
edit: just adding as a general hint; it's always better to pre-generate your thumbnails – be it priming through a cdn/proxy or generating on upload – if you know the set required.
@jbergstroem they are changing in the sense that there are new ones all the time. The source files don't change per se. I would love to avoid pre-generating; that is what I have been doing so far, and it is working nicely with a good CDN cache. I don't really control the upload pipeline at the moment.
@batiste I had the same requirements due to mobile access. Even with a cache proxy in front, I want to authorize operations only from requesters who know the server key/salt.
I have submitted PR #194, which implements a URL signature mechanism. In my case my API engine generates the URL signature, and requesters can't change any parameter at all.
I suppose this issue can be resolved now that https://github.com/h2non/imaginary/pull/194 is merged in.