aries-cloudagent-python Add startup flag --light-weight-webhook to trim down outbound webhook payload

When we create a credential offer with image, the image is repeated in following objects (dict) "credential_request", "cred_request", "credential_proposal", "cred_proposal", "credential_offer", "cred_offer", "credential_preview", "cred_preview", "values", "credentials~attach", "offers~attach"

a Credential offer with 30KB image resulted in a webhook sized at 227KB in my use case and resulted HTTP 413 as well as unnecessary stress on WAF rule regex processes on those encrypted string and base64 image string.

This startup flag will remove those objects only on outbound webhook and size went down to 7KB at most. I did preserve 1 object "cred_issue" which has the "rev_reg_id" and "cred_rev_id".

To me this is a more like a nice to have flag if someone ever encounter problem with image and webhook.

Previous related discussion PR #725 Issue #1222

Signed-off-by: Victor Lee [email protected]

Sep 13 '22 20:09 victorlee0505

Interesting problem. Looking forward to folks reviewing this and the best approach to solving the very real issue. I don't know nearly enough to know what is the right way to handle this, so adding @ianco @dbluhm @frostyfrog @andrewwhitehead to the discussion.

It would be a huge change, but perhaps the webhooks should only send new information to the controller (vs. the full state) and leave it to the controller to hold the data it needs to make decisions as the protocol proceeds. Or do that, and add in an endpoint that says "send all state for protocol X" whenever it is needed by a controller?

Thanks for the PR!

Sep 14 '22 10:09 swcurran

I'll agree that this is an interesting problem and I'm certain it will only get worse when implementors don't take file size into consideration, especially if upper management requests higher resolution imagery. For example, an image from an iPhone that controls embedded 3D data for biometric systems.

This issue, in terms of credentials, is prevalent during not only credential issuance, but verification/proof and endorsement. There are also plugins implementing features (such as the Q and A protocol) which may also cause increased webhook payloads if the system chooses to accept binary or non-fixed width payloads (I'm sure some malicious user is just waiting to send a script from Shakespeare as an answer to "why are you visiting the country").

I believe that what you've got here is a great first step towards solving the issue, but I believe that there's a lot more that can be done overall. Making sure to decrease unnecessarily duplicated data is one thing. Limiting the size of the payload in the face of untrusted data is another. I recommend that, for lightweight payloads, we send the minimal amount of data needed to request the full data where possible. Such as a presentation exchange ID during proof and allowing the system to retrieve the full record manually if they need it. This may be considered a bit extreme, but since we can't quite trust that the third party agent won't send us something massive, I feel as though it's the best compromise to prevent errors.

P.S. I hope that's all readable. Woke up and sent it from my phone. Really interesting problem we've got here :)

Sep 14 '22 13:09 TheTechmage

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
0.0% Duplication

Sep 14 '22 16:09 sonarqubecloud[bot]

Thanks for the comment and attention with great recommendations. I personally didn't like my approach either which I feel like I might removed something useful (although my invitation to offer to issue cred and revoke still works flawlessly, can't say about proofing yet ) and I am new to python, so this is the least destructive mod I can come up with before a deep dive.

Currently I only use webhook for state update and my workaround to get the full record is to manually fetch from rest API with cred_ex_id which give me the exact same thing back when needed.

Sep 15 '22 01:09 victorlee0505

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
No Duplication information

Jan 05 '23 21:01 sonarqubecloud[bot]

Hi @swcurran @frostyfrog,

I have reverted my previous approach (which remove stuff from the webhook payload, this is destructive and not safe IMO).

I have make new changes that is cleaner and safe. A new LightWeightWebhook object is created that only target to use on V10CredentialExchange and V20CredExRecord (so all other webhooks are not affected)

The new LightWeightWebhook object contain only state update and essential information. (basically the 1st level of the state protocol object)

This is inline with the idea send the minimal amount of data needed to request the full data where possible.

here is one of the example with the flag on topic=issue_credential_v2_0, state=credential-issued

{
    "auto_issue": false,
    "auto_offer": false,
    "connection_id": "0f75b936-1f0b-4db5-aa80-72e85cc08fce",
    "cred_ex_id": "7b48e9c2-dd31-4205-87c6-fdbb10efaf17",
    "initiator": "self",
    "role": "issuer",
    "state": "credential-issued",
    "thread_id": "b4fd1123-d8ef-4eae-ade9-6ef7b3ea7af8"
}

Feb 08 '23 15:02 victorlee0505

@ianco could you take a look at this one? Seems valuable to me.

Feb 08 '23 16:02 swcurran

@ianco could you take a look at this one? Seems valuable to me.

ok will do.

Feb 08 '23 16:02 ianco

Would it make sense to make this a breaking change with a “—useOldStyleCredExchangeProtocolStateObject” (or something equally weird) for backwards compatibility? It seems this is something we would rather have than not.

We are doing a 0.8.0 breaking release change Real Soon Now, so this would fit there.

Feb 09 '23 14:02 swcurran

I think the default is the old style verbose webhooks, no? so not a breaking change ... (unless you want the default to be the more terse format)

Feb 09 '23 16:02 ianco

PS failing test: FAILED aries_cloudagent/messaging/models/light_webhook.py::FLAKE8

Feb 09 '23 16:02 ianco

I’m suggesting that we want the default to be the terse form, and only if needed use the long form. AFAIK, all of the data is available in the long form — it’s just the extra copies of the data that are in the short form.

Put another way, we could be smarter about not repeating content in the Protocol State Object by blindly adding in each message, where the message repeats data from a previous message.

Or is that a bad idea? I guess things could be lost if the content changes across messages — then we would only have the latest version vs. all versions. How is that handled in this case? Is there checking that the content hasn’t changed between messages, and if so what is done?

Feb 09 '23 16:02 swcurran

I’m suggesting that we want the default to be the terse form, and only if needed use the long form.

I think this would be fine, however if we do this we should probably implement a terse form for the presentation exchange as well (this PR only addresses the cred exchange). (I think all the other exchanges are prob ok as is)

Feb 09 '23 17:02 ianco

So maybe go with this now, and add an Issue to change the presentation model as well, and then change to terse by default in the next breaking release? In theory, we could get it in for the final 0.8.0.

Feb 09 '23 17:02 swcurran

I am on it now should be done shortly. I will put 2 separate classes for the v1 and v2 cred exch in message package.

When we have a plan to make it default to be the terse form, maybe in future release we can tied long form to other flag like --debug or ---trace.

Feb 10 '23 14:02 victorlee0505

sorry my bad, still getting use to black and flake8...ha...

Feb 14 '23 19:02 victorlee0505

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
No Duplication information

Feb 15 '23 17:02 sonarqubecloud[bot]

@ianco @swcurran Thank you very much! I think next step is to create new issue on proof exchange webhook and terse form by default so that we can carry the discussion further.

Feb 15 '23 20:02 victorlee0505

Thanks @victorlee0505 — agree on that as the next step.

Feb 15 '23 20:02 swcurran

aries-cloudagent-python aries-cloudagent-python copied to clipboard

Add startup flag --light-weight-webhook to trim down outbound webhook payload

aries-cloudagent-python
aries-cloudagent-python copied to clipboard