azure-sdk-for-js icon indicating copy to clipboard operation
azure-sdk-for-js copied to clipboard

[Unified Recorder] Features

Open HarshaNalluru opened this issue 3 years ago • 4 comments

Background

Bare minimum prototype with storage blob client (underlying core-http) https://github.com/Azure/azure-sdk-for-js/pull/15826 node test.

Beyond the investigations and issues encountered so far in building a primitive prototype, a lot more needs to be done, following is the list of features that are required before we can attempt migrating the SDKs.

Features / Bugs to fix

  • [x] #19743
  • [x] #20401
  • [x] Sanitizers
    • [x] ContinuationSanitizer #18220
      • [x] Bug
    • [x] OAuthResponseSanitizer #18221
      • [x] Testing
    • [x] ResetSanitizer #18222
      • [x] Bug
  • [x] Add a check for the sanitizers to not be applied when the variable is undefined
  • [ ] Add logging env var to the docker command
  • [x] Recorder(and proxy tool) should allow tweaking the request bodies or paths(customizing outgoing requests) during playback
    • [x] uuids can be randomly generated within the SDK which makes the requests dynamic in playback, so customizing outgoing requests should be supported
  • [x] #19504
  • [x] #15954
  • [x] #18782
  • [ ] @witemple-msft idea of instrumenting mocha so that we don't have to do any of this beforeEach afterEach stuff to set up the recorder at all.
    • [ ] What if we could just do something like mocha -r @azure-tools/mocha-recorder and it would "just work" by attaching a recorder to every test context?
  • [x] Add transforms provided by the proxy tool #18225
  • [ ] Followups of #18322
    • [x] Allow providing docker run options along with the mocha and karma options for the test commands
    • [ ] Call mocha programmatically
      • [ ] experiment with using dev-tool's own register instead of ts-node/register. That way we can really centralize the concerns in our own config/scripts.
    • [ ] Call karma programmatically
    • [ ] @witemple-msft's comment
      • [ ] All of this is nitpicky, but it's best to be explicit about the mutability of the volume. If we are running in playback mode, we should bind the volume read-only. In record mode, we should bind it read-write. On Linux hosts that have SELInux (Red Hat, Fedora), we'll also need the Z flag to relabel the volume. We can test for SELinux using the selinuxenabled command, and that'll exit 0 if it's enabled on the machine:
    • [x] @scbedd raise a couple of interesting points
      • [x] #18712
      • [x] #19363
        • [x] Assign a name to the container image and stop the container using the name similar to what the powershell script does.
  • [x] Warn users to add replacements/transforms if we detect any secrets being leaked. - (Running credscan on the recordings before generating the reports)
  • [x] #21371
  • [x] Scope sanitization to be allowed to remove false positives from cred scan
  • [x] I(Will) would still like to see a solution where I can import and use one thing to use DefaultAzureCredential in record/live mode and NoOpCredential in playback mode. Maybe we can put that helper in the test-utils package. I(Harsha) was thinking that, but that means, test-utils can't be a dependency of recorder or recorder-new packages.
  • [ ] Drafting an idea(suggested by Will) - to extend mocha with recorder describeWithRecorder attempting - doesn't work yet · Azure/azure-sdk-for-js@133c215 (github.com) - to avoid calling start and stop for each test suite
  • [x] @witemple-msft's comments at #18585
    • [ ] It's clever and the implementation seems very simple. Since the variables are sent as JSON, I wonder why we can't store more than just strings. For example, I might like to store a number or even a nested object.
    • [ ] If you're interested in that idea, we can explore how the consumers of this API can strongly type that.
  • [x] #18241
  • [x] Add a session-level sanitizer #18223
  • [x] #18240
  • [ ] Run the docker command in detached mode and provide the logs
  • [ ] Figure out and make sure fiddler and Wireshark works to inspect conversations
  • [x] Feedback https://github.com/Azure/azure-sdk-for-js/pull/15826#discussion_r697009476
  • [ ] Make new recordings smaller by removing unnecessary properties.
  • [x] #19141
  • [x] #18691
    • #18580
  • [x] Migration guide
    • [x] Show the old and new recorder equivalents
    • [x] Dev-tool extension commands migration
    • [x] Mention the docker prerequisite
    • [x] Update the dev-tool readme with the new commands
  • [x] Handle https://github.com/Azure/azure-sdk-for-js/issues/19500
  • [x] #19140
  • [x] Is it an error to start twice without calling stop?
  • [x] Relationship/interaction between recorder and recorder-new - Proposal
    • [x] Eventually, we'll remove recorder-new and put everything in @azure-tools/test-recorder. (mainly because @azure-tools/test-recorder has been already published). So, the users will only need to import @azure-tools/test-recorder.
    • [x] Move the recorder-new code to the @azure-tools/test-recorder package after #18817 is merged and we lock the design from the user's perspective.
      • [x] Before this, release the recorder to publish the unreleased changes
  • [x] https://github.com/Azure/azure-sdk-for-js/issues/19560
  • [x] https://github.com/Azure/azure-sdk-for-js/issues/19559
  • [x] #19364
  • [x] Clean up ts-inputs, js-inputs node test commands - support all the test commands that we currently have
  • [x] #18897
  • [x] Add matchers provided by the proxy tool #18224
  • [x] Logs from proxy-tool in the CI https://github.com/Azure/azure-sdk-tools/pull/2252#discussion_r747749442
  • [x] Prototype with a package that uses core-http - [ongoing with storage node test #15826]
  • [x] Fix rollup config problem with the recorder v2 https://github.com/Azure/azure-sdk-for-js/issues/15931
  • [x] Recorder should support the browser tests as well - meaning, running the proxy tool must be handled.
    • [x] #16356
  • [x] #16345
  • [x] Not all clients that use core-http expose httpClient as an option, need to figure out a way to support such libraries
  • [x] https://github.com/Azure/azure-sdk-for-js/issues/17014
  • [x] #17013
  • [x] #17019
  • [x] #17384
  • [x] #15953
  • [x] Proxy tool should support regex replacements
  • [x] Proxy tool should understand connection string format and do the replacements accordingly
  • [x] Allow tweaking the recording for the end user programmatically
  • [x] Skipping the tests should be allowed (Choosing not to support this, users will have to rely on this.skip() and the if checks)
  • [x] Recordings have a lot of headers that can be dynamic, could be better if they are stripped out or updated with defaults
  • [x] Generate an application for each OS and let users install the application instead of the docker.
  • [x] #17070
  • [x] Understanding connection strings to do the replacements in recordings
  • [x] #17750
    • [x] a separate server to run a fake service and return responses
    • [x] to be able to test the proxy-tool's features
    • [x] should work for both node and browser
    • [x] server is supposed to run in parallel to the tests
    • [x] to be able to start the proxy-tool in parallel too
  • [x] Right now, the replacements/transforms only support one entry per request, can we do multiple to be sent at once?
    • [x] Proxy-tool - yet to support this directly
  • [x] #18580
  • [x] https://github.com/Azure/azure-sdk-for-js/pull/18585
  • [x] #17042
    • [x] Current prototype installs, runs, and stops the docker container manually, we need a script to make it black box and run when the test script is triggered

Resources

HarshaNalluru avatar Jun 17 '21 21:06 HarshaNalluru

Hey @HarshaNalluru ! Thanks for the details!

Couple comments!

Current prototype installs, runs and stops the docker container manually, we need a script to make it black box and run when the test script is triggered

Will have this for ya very soon. Messing about with permissions on our ACR to make it truely anonymous access.

Proxy tool should support regex replacements

We absolutely support this for header/URI quite well. I will finish updating the README with a whole bunch of detail and examples and forward it to ya. Should have that for you later today.

Proxy tool should support plain string replacements

Same as above.

uuids can be randomly generated within the SDK which makes the requests dynamic in playback, so customizing outgoing requests should be supported

Which test can I work against to repro? I've got an early version of this already implemented, but don't want to put you through debugging it against a docker container :P

Access tokens need to be masked automatically before saving the recordings

Should already be there, let's chat about where it's not happening!

After a brief discussion, I'll use the branch feature/unified-recorder to repro issues as soon as it's present on the azure-sdk-for-js repo.

For any questions I have that require additional detail, I'll send you a teams meeting request! Thanks!

scbedd avatar Jun 17 '21 22:06 scbedd

uuids can be randomly generated within the SDK which makes the requests dynamic in playback, so customizing outgoing requests should be supported Which test can I work against to repro? I've got an early version of this already implemented, but don't want to put you through debugging it against a docker container :P

Assume a test generates a uuid, the saved request has a different uuid from what will be the request during playback. So, this needs customization of the request in playback with a regex of sorts.

Access tokens need to be masked automatically before saving the recordings Should already be there, let's chat about where it's not happening!

Sure, I just logged this issue with all the details so they are being traked somewhere, I'll strikeoff one-by-one as I test them. :)

After a brief discussion, I'll use the branch feature/unified-recorder to repro issues as soon as it's present on the azure-sdk-for-js repo.

The PR #15826 isn't merged yet, feature branch can be used once that is merged. Until then, the only way would be to use my branch.

HarshaNalluru avatar Jun 17 '21 22:06 HarshaNalluru

Output of our discussion on 6/21.

Recorder should pass the test path to the proxy tool to be able to save the recording at the desired location

Sitting on this one given the new startup scripts.

Recorder(and proxy tool) should allow tweaking the request bodies or paths(customizing outgoing requests) during playback

Need to think about this more. The test pr has exact context for example.

Access tokens need to be masked automatically before saving the recordings

This is done.

Additional q not in set above. Need to ensure ContinuationSanitizer works appropriately. I will work through an example. Pass me an exact test I can work through. Here is an example recording

Scope sanitization to be allowed to remove false positives from cred scan

The bodies of the responses have a string that's NOT a secret, but still trips credscan. Here's an example. Check for scope argument.. Write a sanitizer to clean it up.

Recordings have a lot of headers that can be dynamic, could be better if they are stripped out or updated with defaults

I will add a RemoveHeader sanitizer. You call it before your test run.

Allow tweaking the recording for the end user programmatically

Use a transform or a sanitizer.

Filing issues for:

  • OS Compiled Executable
  • Add RemoveHeader Sanitizer
  • ScopeURL request body replacement <-- this fits well into BodyRegexSanitizer implementation.
  • MSAL Ignores body matching on specific request only. Not on all request/responses within a recording.

scbedd avatar Jun 21 '21 20:06 scbedd

Hey @harshanalluru, got some updates for ya.

General Discussion and Advice

To begin, you don't need to install a specific tag of the docker image anymore. Use the latest tag when accessing the docker image. Additionally, local builds are perfectly functional now.

When you're running in live mode, before any tests invoke, ensure that you hit up the Admin/AddSanitizer to add appropriate sanitization.

I believe there should be a base set or what-have-you that should be present generically, with the ability to add specific items for your service requirements.

In Python, I'd handle this with a session level fixture, with additional customization provided as part of test configuration per specific service.

For all of the below examples, feel free to hit /Info/Available for additional detail. I also have unit tests that exercise these sanitizers if you need an example of the actual regex as well. I used your scope replacement for one of them.

Random TableNames, other resources that are randomly generated.

I did give some thought to your original question WRT the whole:

Table name is randomly generated, so we retrieve what the last random name from the recording prior to invoking the test.

I don't really think there is a way to do this with a remote server. At least, not one that would be performant. I think the better way to do this is one of two options

  1. During record, add a sanitizer that updates stuff like table name (or whatever else is randomized) to a understood "filler". So table1249 changes to targettable. During playback mode, if you activate that same sanitizer, the incoming request will be sanitized to the same table name value and match the recording.
  2. During record, use the same sanitizer as in #1, and just directly use those filler values when you're running in playback.
  3. (As of yet uncoded), we create a generalized matcher that just ignores certain fields when matching your request.

Of course, I'm open to further feedback and discussion here.

Remove-Header Sanitizer

This one is pretty straightforward, pass along a comma separated list of key or keys for headers that you want gone from the saved recording.

POST
url: <proxyURL>/Admin/AddSanitizer
headers: {
    "x-abstraction-identifier": "BodyRegexSanitizer"
}
body: {
    "headersForRemoval": "Location,<your-other-header>",
}

Spaces are trimmed, so you don't need to worry about whitespace around the commas. You also aren't limited to a single one of these, so you could add multiple Remove-Header that each removes one. It's really up to you.

Body-Key Sanitizer

Use this sanitizer if you want to update a specific member w/ a regex or whole value replace. The document string in /Info/Available does link here, but the documentation for available search through your body is documented here

POST
url: <proxyURL>/Admin/AddSanitizer
headers: {
    "x-abstraction-identifier": "BodyKeySanitizer"
}
body: {
    "value": "<your-new-value>",
    "regex": ".*",
    "jsonPath": "$.TableName"
}

Note that the above example is just a "complete" replacement. You can do a scoped replacement just like I have for the BodyRegex example.

Body Regex Sanitizer

Use this sanitizer if you aren't dealing with a JSON body. It does free-text regex replace. Your scope replacement ask is an example of when you'd use this boyo.

POST
url: <proxyURL>/Admin/AddSanitizer
headers: {
    "x-abstraction-identifier": "BodyRegexSanitizer"
}
body: {
    "value": "<your-new-value>",
    "regex": "scope\\=(?<scope>[^&]*)",
    "groupForReplace": "scope"
}

In Summary

I believe the above should meet the vast majority of the needs you have. I have a ContinuationToken written up, but before I ask you to use that, I'd prefer to eat the bullet on debugging the inevitable issues :)

Additionally, I believe that providing an superceding sanitizer that does URI, Body, and HEAD regex replacement with the same regex may be useful, but that's not present currently. LMK what you think about this one.

I expect more issues will arise as this get's coded into real tests, but I'll tackle those when they occur. Easiest way to get through it is to hit the failure.

scbedd avatar Jul 07 '21 00:07 scbedd