
Add lifecycle hooks to run middleware

Open tommysitu opened this issue 6 years ago • 11 comments

At the moment, middleware can only intercept and modify outgoing requests in capture mode. This is inadequate if someone wants to mask sensitive information in both the request and the response when capturing.

tommysitu avatar Jan 26 '18 11:01 tommysitu

It makes sense to extend Hoverfly as there are probably other use-cases, but having to do masking in middleware seems like a leaky solution for larger teams as it has to be applied in real-time. Developers will forget and policy would be difficult to apply.

We could do with more use-cases on the general topic of masking.

JohnFDavenport avatar Feb 12 '18 09:02 JohnFDavenport

After reading this GitHub issue: https://github.com/arquillian/arquillian-organization/issues/10 I get the feeling that this is not necessarily a middleware feature.

Problem: Request/response pairs captured from live endpoints could contain sensitive data. Storing them as plain text on GitHub is a bad idea. Manually removing the sensitive data is error-prone and time-consuming.

Use case: The original use case from @lordofthejars goes like this:

  • In Capture mode, hoverfly should be able to send the original request and receive the original response, but the exported simulation data should have the sensitive data masked.
  • Switching to simulate mode and using the modified data shouldn't break the original tests.

Implications

  1. One should be able to define which fields in the request and response to obfuscate
  2. Data masking should be done before a request/response pair is stored.
  3. If data masking is performed by middleware, Capture mode should support middleware execution similar to Modify mode, but also be able to preserve the original request when sending it to the target service.
  4. Sensitive data in the request can be replaced by GlobMatcher.
  5. Sensitive data in the response can be encrypted (but this requires decryption in Simulate mode, and an encryption key from the user) or replaced by a random value (which might break tests that assert an exact value on this field in Simulate mode).
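
For illustration only, a masking middleware along the lines of implications 1 and 2 might look like the Python sketch below. It assumes the middleware receives the request/response pair as JSON with `request` and `response` objects, each carrying `headers` (a map of name to list of values) and `body`. The names in the masking policy and the helper functions are hypothetical, and the sketch does not address implication 3 (preserving the original request for the target service).

```python
import json
import sys

# Hypothetical masking policy: header names and top-level JSON body fields to redact.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie"}
SENSITIVE_BODY_FIELDS = {"password", "token"}
MASK = "***"


def mask_headers(headers):
    """Return a copy of the header map with sensitive values replaced by MASK."""
    return {
        name: [MASK] * len(values) if name.lower() in SENSITIVE_HEADERS else values
        for name, values in headers.items()
    }


def mask_body(body):
    """Mask sensitive top-level fields when the body is a JSON object; otherwise return it unchanged."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        return body
    if not isinstance(data, dict):
        return body
    for field in data.keys() & SENSITIVE_BODY_FIELDS:
        data[field] = MASK
    return json.dumps(data)


def mask_pair(pair):
    """Apply the masking policy to both sides of a request/response pair, in place."""
    for side in ("request", "response"):
        part = pair.get(side) or {}
        if part.get("headers"):
            part["headers"] = mask_headers(part["headers"])
        if "body" in part:
            part["body"] = mask_body(part["body"])
    return pair


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one pair as JSON from stdin and write the masked pair to stdout."""
    stdout.write(json.dumps(mask_pair(json.loads(stdin.read()))))

# A real middleware script would end by calling main().
```

Note that masking request values this way would also change what gets matched in Simulate mode, which is why implication 4 suggests glob matching for the request side.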

tommysitu avatar Mar 13 '18 11:03 tommysitu

The change of title reflects that we won't be adding explicit support for data masking in Hoverfly itself. Lifecycle hooks - yes. Data masking - no.

JohnFDavenport avatar Jul 18 '18 08:07 JohnFDavenport

The current middleware does not support request modification in simulate mode, nor response modification in capture mode. It probably works this way to handle cases such as:

  • A delay middleware is only intended to be used for SIMULATE mode, and switching to CAPTURE mode should not introduce any latency.
  • A middleware which modifies requests is supposed to be used for CAPTURE mode, and you don't want to apply it again when Hoverfly is switched to SIMULATE mode, otherwise the requests could be modified twice.

Making middleware apply to both requests and responses in simulate/capture/spy mode is desirable for some use cases, but it has the potential to break backward compatibility.

tommysitu avatar Feb 22 '19 17:02 tommysitu

What I propose is that middleware should behave as it does now, but you can pass an additional flag to force it to run at a certain point regardless of the mode. For example:

$ hoverctl middleware --binary python --script intercept_requests.py --hooks "PRE_REQUEST_HANDLE"

This allows middleware to intercept requests before request matching in simulate mode.

Available hooks could be:

  • PRE_REQUEST_HANDLE: before hoverfly processes the requests, i.e. before matching in simulate mode, or before saving the requests in capture mode
  • POST_REQUEST_HANDLE: after the request is saved in capture mode, but before it's forwarded to the destination.
  • PRE_RESPONSE_HANDLE: before hoverfly saves the response for capture mode
  • POST_RESPONSE_HANDLE: after hoverfly saves the response for capture mode
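
As a rough illustration of the ordering described above, the capture-mode flow might look like the Python sketch below. This is not Hoverfly code: `capture`, `forward` and `run_hook` are hypothetical names, and the hooks here only observe the data. Whether a hook's changes should carry over to the forwarded request or the stored pair is a design question this sketch does not settle.

```python
# The four hook names proposed in this thread, in the order they would fire.
HOOKS = (
    "PRE_REQUEST_HANDLE",
    "POST_REQUEST_HANDLE",
    "PRE_RESPONSE_HANDLE",
    "POST_RESPONSE_HANDLE",
)


def capture(request, forward, run_hook):
    """Hypothetical capture-mode flow showing where each proposed hook would fire."""
    stored = {}
    run_hook("PRE_REQUEST_HANDLE", {"request": request})
    stored["request"] = request            # request is saved
    run_hook("POST_REQUEST_HANDLE", {"request": request})
    response = forward(request)            # request goes to the destination
    pair = {"request": request, "response": response}
    run_hook("PRE_RESPONSE_HANDLE", pair)
    stored["response"] = response          # response is saved
    run_hook("POST_RESPONSE_HANDLE", pair)
    return stored
```

Passing a `run_hook` that appends the hook name to a list is enough to check that the four hooks fire in the listed order.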

tommysitu avatar Feb 22 '19 18:02 tommysitu

Looks good to me, except you've got PRE_REQUEST_HANDLE, etc., when I assume you meant PRE_REQUEST_HOOK.

It is probably better if neither HOOK nor HANDLE was added.

Two questions:

  1. Can the same middleware run multiple times if more than one hook is specified?
  2. Can the middleware get or derive any context information to tell it what sort of hook it is?

I assume the answer is No to both these questions, but it's worth checking.

JohnFDavenport avatar Feb 22 '19 19:02 JohnFDavenport

The suffix _HANDLE in the name just makes it clear that the hook runs before/after Hoverfly has processed the request/response.

For question 1, yes. You can pass a comma-separated list of hooks and invoke the same middleware at different hooks.

For question 2, maybe no initially, but it should be possible to pass context information into the middleware.
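
A minimal sketch of how a comma-separated hooks value could be handled, assuming the four hook names proposed earlier in this thread. The function names `parse_hooks` and `should_run` are hypothetical and not part of hoverctl or Hoverfly.

```python
# Hook names as proposed in this thread; the final naming is still under discussion.
VALID_HOOKS = {
    "PRE_REQUEST_HANDLE",
    "POST_REQUEST_HANDLE",
    "PRE_RESPONSE_HANDLE",
    "POST_RESPONSE_HANDLE",
}


def parse_hooks(value):
    """Split a comma-separated hooks string and reject unknown hook names."""
    hooks = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in hooks if name not in VALID_HOOKS]
    if unknown:
        raise ValueError("unknown hook(s): " + ", ".join(unknown))
    return hooks


def should_run(stage, hooks):
    """Return True when the middleware is registered for the given lifecycle stage."""
    return stage in hooks
```

With this shape, the same middleware is simply invoked once per stage that appears in the parsed list.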

tommysitu avatar Feb 25 '19 09:02 tommysitu

OK, Q1 is good, but with regard to Q2, I think that then compromises the feature.

What if the lifecycle stage could be passed in metadata, as that shouldn't affect existing middleware, or, to be 100% sure, passed only if --hooks is stated?
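
To make the suggestion concrete, here is a hedged Python sketch. The `metadata` key and its `hook` field are purely hypothetical names for illustration; nothing in this thread fixes the payload shape.

```python
import json


def build_payload(pair, hook=None):
    """Serialise a pair for middleware; attach the stage only when hooks are in use."""
    payload = dict(pair)
    if hook is not None:
        # Hypothetical field: only present when --hooks was given.
        payload["metadata"] = {"hook": hook}
    return json.dumps(payload)


def current_hook(raw_payload):
    """Middleware side: return the hook name, or None for payloads without metadata."""
    return json.loads(raw_payload).get("metadata", {}).get("hook")
```

Existing middleware that never looks at `metadata` would see the same `request` and `response` objects as before, and without --hooks the payload would be unchanged.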

JohnFDavenport avatar Feb 27 '19 11:02 JohnFDavenport

Currently, we do not need this feature anymore because we manually replaced the sensitive information with fake information. Of course, it might still be interesting to have this feature, but it is not urgent for our use case.

lordofthejars avatar Feb 27 '19 12:02 lordofthejars

@tommysitu if we need to build this feature, then I would like to take this up. You can assign this to me. I can start working on the same.

kapishmalik avatar Jan 15 '23 17:01 kapishmalik

@tommysitu you can assign this to me. I have started working on the same.

kapishmalik avatar Feb 11 '23 10:02 kapishmalik