
Signing strings in RFC 0017 Attachments

Open swcurran opened this issue 3 years ago • 31 comments

We have a need to sign a string using the RFC 0017. We can't use the base64 of the data and the data is not JSON. It's also not appropriate to go into a link. Specifically, we're signing an Indy transaction to be submitted to an Indy ledger.

@kdenhartog @dhh1128 -- is the right thing to do to add a "string" type analogous to the base64 data?

swcurran avatar Oct 15 '20 21:10 swcurran

@ianco @jadhavajay FYI

swcurran avatar Oct 15 '20 21:10 swcurran

That proposal doesn't sound crazy to me. However, why can't you encode a string as base64?

dhh1128 avatar Oct 15 '20 23:10 dhh1128

I think adding a string property would be a bit odd, and it would set a precedent where each new type of data adds another optional property. It could be done that way, I suppose, but I tend to agree with Daniel: why not just base64url-encode it?

The alternative is to omit the json/link/base64 properties entirely and provide only the jws; the content would then be extracted from the JWS payload.

kdenhartog avatar Oct 16 '20 01:10 kdenhartog

That proposal doesn't sound crazy to me. However, why can't you encode a string as base64?

Indy requires the signature be on the string, not on the base64 encoding of the string. Or more generally, the signature needs to be on what the signature needs to be on for the given application.

swcurran avatar Oct 16 '20 01:10 swcurran

@kdenhartog I'm not sure how many more data types there are that can be signed that can be represented in a message. I don't think binary data can be used. Or is that possible?

swcurran avatar Oct 16 '20 01:10 swcurran

This gets into rich schema country: consider en-CA $0.05 = 0,05$ fr-CA. Quaternions, complex numbers, floating points, the list goes on.

sklump avatar Oct 16 '20 12:10 sklump

Why would base64-encoding the string be wrong? We do it already with JSON dumps, which are strings. The application must know what to do with the signed content in any case. I must be missing some context here.

sklump avatar Oct 16 '20 12:10 sklump

@swcurran : I may have misunderstood something. My understanding (apparently not true?) was that a signature is always over or against raw bytes. Base64 encoding is just a way to guarantee that those raw bytes can be conveyed without data loss in a string -- but we should never be signing the output of base64 encoding, but rather the bytes that are encoded by base64:

decode_to_raw_bytes(Base64(bytes)) --> bytes; sign(bytes) NOT sign(Base64(bytes))

If my understanding were true, then utf8 strings are already raw bytes, so we can just sign them. But we can also encode them as base64, and then run them through the same base64 process:

decode_to_raw_bytes(Base64(string)) --> bytes which = utf8 string; sign(bytes)

if we are actually doing sign(Base64(bytes)), I think that's a bug and we need to fix it.
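
A minimal Python sketch of the distinction, using only the standard library. The sample string is a made-up placeholder and no actual signing is performed; the point is only which bytes would be handed to a signer under each reading.

```python
import base64

# Hypothetical placeholder for the string to be signed.
text = "indy transaction to be signed"

# A UTF-8 string is already a sequence of raw bytes.
raw = text.encode("utf-8")

# Base64 is only a transport encoding for those bytes.
encoded = base64.urlsafe_b64encode(raw)
decoded = base64.urlsafe_b64decode(encoded)

# Decoding recovers exactly the original bytes...
same_bytes = decoded == raw

# ...whereas the encoded form is a different byte sequence, so
# sign(raw) and sign(encoded) would be signatures over different inputs.
encoded_differs = encoded != raw
```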

dhh1128 avatar Oct 16 '20 14:10 dhh1128

Ah....that's my misunderstanding - I had thought the signing was on the base64 data itself. Thanks for clarifying. As you can tell, this is not my specialty :-).

I'll close this.

swcurran avatar Oct 16 '20 15:10 swcurran

Before we close this, @swcurran , let's confirm which algorithm we're using. I only thought it worked a particular way; I'm no longer confident that my thinking was accurate.

dhh1128 avatar Oct 16 '20 15:10 dhh1128

There are two verifications we need:

  1. Verify the semantics required by the RFC. It's possible that the verbiage is unclear or inconsistent.
  2. Verify that implementations all understand those semantics the same way.

Regarding #1, I'll take a task to check carefully.

Regarding #2, I can take a task to check Evernym's impl. Can we have a few other impls checked by their respective experts -- e.g., acapy by @sklump or @andrewwhitehead , Aries FW Go by @troyronda or @llorllale , Trinsic by @tmarkovski ?

dhh1128 avatar Oct 16 '20 15:10 dhh1128

Aca-py signs message=(b64_protected + "." + b64_payload).encode("ascii"), not raw bytes, as per (JWS) RFC 7515 - see https://tools.ietf.org/html/rfc7515#page-15: 5. Compute the JWS Signature in the manner defined for the particular algorithm being used over the JWS Signing Input ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload)). ...
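
As a rough illustration of that RFC 7515 step (this is not the actual aca-py code), the signing input could be assembled as follows. The helper names `b64url` and `jws_signing_input`, the sample header, and the sample payload are assumptions made for the example.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWS requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jws_signing_input(protected_header: dict, payload: bytes) -> bytes:
    """Build ASCII(BASE64URL(UTF8(header)) || '.' || BASE64URL(payload))."""
    header_json = json.dumps(protected_header, separators=(",", ":"))
    b64_protected = b64url(header_json.encode("utf-8"))
    b64_payload = b64url(payload)
    return (b64_protected + "." + b64_payload).encode("ascii")


# Illustrative values only.
header = {"alg": "EdDSA"}
payload = b"example payload"
signing_input = jws_signing_input(header, payload)
```

Note that `signing_input` is not the payload itself: it includes the encoded protected header and a dot, which is what the signature algorithm actually sees.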

sklump avatar Oct 16 '20 17:10 sklump

If the intent is to produce JWS in compact form for a payload of Indy transaction, which is string/json, I believe we are aligned as the method output is dictated by the JWS/JOSE library used per RFC 7515.

tmarkovski avatar Oct 16 '20 20:10 tmarkovski

@sklump -- I don't know how to interpret what you have said. Is the signature on the base64 string itself, or on the data that has been base64-ized?

@tmarkovski -- I don't know what you mean in your statement above, but could you answer what it is the aries-framework-dotnet implementation is signing?

swcurran avatar Oct 16 '20 20:10 swcurran

So I checked with Evernym folks. We have several signing algorithms separate from RFC 0017 that are not JWS-centric, and we are inconsistent; in at least one place, we sign the raw bytes, and in another we sign the base64-encoded representation instead. However, this data point is mostly irrelevant, because it's not about RFC 0017 specifically.

After reading RFC 7515, I concur with @tmarkovski and @sklump that if we are using JWS in Aries RFC 0017, we have to sign the stream of bytes that consists of:

ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload))

This is different from my assumption that I articulated above, where I said I believed we needed to sign the raw bytes. I was incorrect.

I believe the text of RFC 0017 may benefit from a slight clarification, as follows.

Old text:

Embedded and appended attachments support signatures by the addition of a data.jws field containing a signature in JWS (RFC 7515) format with Detached Content. The payload of the JWS is the raw data of the attachment, whether externally referenced or encoded in base64 format, and is not contained within the signature itself.

Proposed new text:

Embedded and appended attachments support signatures by the addition of a data.jws field containing a signature in JWS (RFC 7515) format with Detached Content. The payload of the JWS is the raw bytes of the attachment, appropriately base64url-encoded per JWS rules. If these raw bytes are incorporated by value in the DIDComm message, they are already base64url-encoded in data.base64 and are thus directly substitutable for the missing data.jws.payload field; if they are externally referenced, then the bytes must be fetched via the URI in data.links and base64url-encoded before the JWS can be fully reconstituted.

If people agree with this clarification, I am happy to raise a PR.

All of this is somewhat of a distraction from @swcurran 's original question, which was about strings. I would prefer that string values appear inside data.base64 just like the raw bytes of files or other data types, and that we not create a data.string representation. I don't see any reason that a base64'ed string inside data.base64 would not work, except that it feels a bit clumsy. But an attachment that's a simple string feels a bit odd, anyway...

However, I'm not stuck on this answer. If the consensus of the group is that we want data.string, I could go along. We would just have to update my clarification above so that we explain how to deal with a JWS payload in an additional possible location.
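
To make the substitution concrete, here is a hedged Python sketch of how a verifier might rebuild the JWS signing input from an attachment whose bytes are carried by value. The function names are hypothetical, the `protected` member name follows the JWS JSON serialization in RFC 7515, and normalizing `data.base64` to unpadded base64url is an assumption about how a receiver might tolerate either base64 alphabet.

```python
import base64


def to_b64url_unpadded(b64_value: str) -> str:
    """Normalize base64 or base64url text to unpadded base64url."""
    s = b64_value.rstrip("=").replace("+", "-").replace("/", "_")
    raw = base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def reconstitute_signing_input(attachment_data: dict) -> bytes:
    """Rebuild the detached-JWS signing input from data.base64 and data.jws."""
    jws = attachment_data["jws"]
    # data.base64 stands in for the missing jws.payload member.
    payload_b64url = to_b64url_unpadded(attachment_data["base64"])
    return (jws["protected"] + "." + payload_b64url).encode("ascii")


# Illustrative attachment data; the signature value is elided on purpose.
example_data = {
    "base64": base64.b64encode(b"hi").decode("ascii"),
    "jws": {
        "protected": to_b64url_unpadded(
            base64.b64encode(b'{"alg":"EdDSA"}').decode("ascii")
        ),
        "signature": "<omitted>",
    },
}
signing_input = reconstitute_signing_input(example_data)
```

For an externally referenced attachment, the same function would apply after fetching the bytes via data.links and base64-encoding them.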

dhh1128 avatar Oct 16 '20 22:10 dhh1128

aries-framework-dotnet doesn't support attachment signing via the jws field. A future implementation would conform to the JWS spec represented in JSON form (as opposed to compact).

tmarkovski avatar Oct 16 '20 22:10 tmarkovski

@dhh1128 -- please go ahead and PR.

Back to my problem. Since I want to sign exactly the bytes of the transaction string, the use of JWS is of no value to me without an update to indy-node, which I would prefer not to do. More generally, there are very likely to be other cases where an application that we don't control wants a non-JWS signature over a chunk of data.

Is the recommendation to make an indy-specific protocol for our use case or to add a mechanism to use other than JWS signature schemes?

swcurran avatar Oct 16 '20 22:10 swcurran

@swcurran Would it help to add a text field to the attachment spec that contains a plain-text string? We could then easily produce the JWS payload using base64(ascii(text)). If so, I'm not opposed to adding data.text to the spec as a new content format, in addition to data.json, data.base64, and data.links.

Edit: wrote my reply too fast. I see @dhh1128 proposed data.string, which is exactly what I meant by data.text.

tmarkovski avatar Oct 16 '20 22:10 tmarkovski

I don't think it will work.

It looks like JWS signs the data plus some other information: ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload)) implies to me that more than just the "JWS Payload" is signed. For the Indy "Endorser" use case, I need only the specific data I indicate to be signed.

So even if the signature algorithm is the same for JWS and Indy, JWS is not signing what I need signed.

swcurran avatar Oct 16 '20 22:10 swcurran

@swcurran :

@dhh1128 -- please go ahead and PR.

Done. See https://github.com/hyperledger/aries-rfcs/pull/558.

dhh1128 avatar Oct 16 '20 23:10 dhh1128

Is the recommendation to make an indy-specific protocol for our use case or to add a mechanism to use other than JWS signature schemes?

I am not confident in suggesting an answer because I don't understand how DIDComm intersects with Indy ledger transactions; those have always been entirely independent problem domains in my mind. However, if I had this requirement, I might be tempted to introduce a new field into data -- perhaps data.rawsig or data.indysig -- that would allow an Indy-style signature over the bytes in data.base64. It seems to me that what you really need to change here is not the location of the signed data, but the semantics/convention of the signature itself.

When I say "introduce a new field", I imagine that such fields could either be standardized or not. If you simply started using it, without updating the RFC, then only parties who know what the new field means would pay any attention; everybody else would be required to ignore it. If you standardize it by raising a PR against the RFC to document the field, I wouldn't have a strong vote against it, but I'd want to know how many different signature types we imagine, because a big proliferation would feel yucky to me.

dhh1128 avatar Oct 16 '20 23:10 dhh1128

For more on the use case.

We want an ACA-Py agent that is a DID Author (but not an Endorser) to be able to write their own objects to an Indy ledger. To do that:

  • the Author constructs the transaction,
  • the Author sends it to an Endorser they know about,
  • the Endorser returns a signature for the transaction,
  • the Author adds the signature to the transaction, and
  • the Author submits the transaction for execution.

We're trying to do this with a generalized "Please Sign This" protocol that makes use of existing decorators and protocols.

We can then update the Admin API for something like "Publish Schema" to optionally take an Endorser connection ID.

HackMD document about this: https://hackmd.io/5LzMhfsMQBevB5V2tKz4hA?view

swcurran avatar Oct 16 '20 23:10 swcurran

For more on the use case

Cool idea. Makes sense.

I think the "Please sign this" protocol needs to be able to tell parties to provide a signature in one or more formats. JWS would be one, and the one expected by Indy would be another. This leads me to think that adding data.indysig or similar would make sense.

dhh1128 avatar Oct 16 '20 23:10 dhh1128

Rethinking this. I don't think we need to use a signed attachment to return the signature. We need to send the data to be signed (perhaps as an unsigned attachment), but the response message need only be a signature of the requested data - e.g. just a string (or maybe a JWS if requested). There is no need to send back the data used to construct the signature.

We still have the issue of sending canonicalized data to be signed and for the signee to know what to sign. For example, we send a base64 encoding of the data to be signed, and a flag to indicate if signature should be on the base64 data or if it should be decoded first. We might also send an indicator to say the type of signature, with an enumeration of the types that could be supported - JWS, INDYTXN for now.

@ianco -- perhaps we adjust the proposed protocol accordingly?
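
A possible shape for such a request, sketched in Python. All names here (`SignatureType`, `SignRequest`, `decode_before_signing`, `bytes_to_sign`) are hypothetical illustrations of the idea and are not taken from the design doc.

```python
import base64
from dataclasses import dataclass
from enum import Enum


class SignatureType(Enum):
    """Enumeration of the signature types that could be supported for now."""
    JWS = "JWS"
    INDYTXN = "INDYTXN"


@dataclass
class SignRequest:
    data_b64: str                  # base64 encoding of the data to be signed
    signature_type: SignatureType  # which kind of signature is requested
    decode_before_signing: bool    # True: sign decoded bytes; False: sign the base64 text


def bytes_to_sign(request: SignRequest) -> bytes:
    """Return the exact bytes the signer should sign for this request."""
    if request.decode_before_signing:
        return base64.b64decode(request.data_b64)
    return request.data_b64.encode("ascii")


# Illustrative request: an Indy-style signature over the decoded bytes.
request = SignRequest(
    data_b64=base64.b64encode(b"example txn").decode("ascii"),
    signature_type=SignatureType.INDYTXN,
    decode_before_signing=True,
)
to_sign = bytes_to_sign(request)
```

The response would then only need to carry the signature over `to_sign`, not the data itself.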

swcurran avatar Oct 17 '20 17:10 swcurran

This could also be a good candidate for a custom message family. It would allow the implementation to use the existing message handler pipeline to fulfill the request. Optionally, threading can be used to correlate this with credential issuance flows. Both the payload and signed_payload fields are directly usable in the indy-sdk APIs for append_endorser_request, which produces the request with the signature appended.

{
   "type": "https://didcomm.org/indy/1.0/endorsment-request",
   "body": {
        "payload": "<indy request to be signed>",
        "payload_checksum": "<some integrity check>"
    }
}

{
   "type": "https://didcomm.org/indy/1.0/endorsment-response",
   "body": {
        "signed_payload": "<signed endorser request>",
        "endorser_key": "<key used to sign the payload>"
    }
}

Just a suggestion.
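
For illustration only, the request side could be built and checked roughly like this in Python. Using SHA-256 as the integrity check is an assumption (the suggestion above leaves the check unspecified), and the function names are hypothetical. The message type string is copied as written above.

```python
import hashlib

# Type URI copied verbatim from the suggestion above.
REQUEST_TYPE = "https://didcomm.org/indy/1.0/endorsment-request"


def sha256_hex(text: str) -> str:
    """Assumed integrity check: hex SHA-256 of the UTF-8 payload."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_endorsement_request(indy_request: str) -> dict:
    """Wrap an Indy request string in the suggested message shape."""
    return {
        "type": REQUEST_TYPE,
        "body": {
            "payload": indy_request,
            "payload_checksum": sha256_hex(indy_request),
        },
    }


def checksum_matches(message: dict) -> bool:
    """Check that the payload was not altered in transit."""
    body = message["body"]
    return sha256_hex(body["payload"]) == body["payload_checksum"]


# Illustrative payload; a real one would be an Indy request.
message = build_endorsement_request('{"example": "indy request"}')
```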

tmarkovski avatar Oct 17 '20 18:10 tmarkovski

FYI the new protocol/message family design doc is here (w.i.p. of course): https://hackmd.io/5LzMhfsMQBevB5V2tKz4hA?view

Agree with @swcurran 's latest comment, I'll update the design doc accordingly.

ianco avatar Oct 18 '20 18:10 ianco

@sklump -- I don't know how to interpret what you have said. Is the signature on the base64 string itself, or on the data that has been base64-ized?

The signature is on two dot-delimited base64 blurbs, encoded to bytes via ASCII (note that base64 characters and the dot fall into the ASCII subset of UTF-8, so there are no i18n problems here). This applies only to aca-py signatures in Aries RFC 17 signed attachments, as the standard here dictates the use of IETF RFC 7515 (JWS) and JWS dictates what to sign.

sklump avatar Oct 19 '20 10:10 sklump

I've updated the "please sign me" doc here: https://hackmd.io/5LzMhfsMQBevB5V2tKz4hA?view

ianco avatar Oct 19 '20 16:10 ianco

@kdenhartog I'm not sure how many more data types there are that can be signed that can be represented in a message. I don't think binary data can be used. Or is that possible?

I was thinking more of basically any MIME type that might get embedded. There are ways to solve that problem if people want to (encoding it in the data.base64 property, as @dhh1128 described above, is what I had in mind as well), but that doesn't address your original question, so I'm fine leaving it out of scope and focusing on your problem for now.

Coming back to your original question, I could see an additional way to handle this.

Rather than using the JWS functionality to sign this, you could pass the serialized, signed data in the base64 property, without the jws property, and indicate that it carries an Indy-formatted signature via the mime-type field. An example would look like this:

{
  "@type": "https://didcomm.org/to/be/defined",
  "comment": "Here's a signed indy transaction",
  "report~attach": {
    "mime-type": "application/indy-txn-sig",
    "filename": "example-txn.json",
    "data": {
      "base64": "eyJ0eXAiOiJKV1QiLA0KICJhbGciOiJIUzI1NiJ... (bytes omitted to shorten)"
    }
  }
}

Then the recipient would receive the attachment, decode the data and they've got a signed transaction without needing to add any new properties and still allowing the signature format to stay as is.

However, from the looks of what @ianco proposed, you guys have found a pretty good route for this.
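
On the receiving side, the mime-type approach might be handled along these lines. This is a sketch under assumptions: the function name is hypothetical, the `report~attach` field and the `application/indy-txn-sig` MIME type are taken from the example above rather than from any registered standard, and the sample transaction bytes are invented.

```python
import base64

# MIME type as used in the example above (not a registered type).
INDY_SIG_MIME = "application/indy-txn-sig"


def extract_signed_txn(message: dict, field: str = "report~attach") -> bytes:
    """Return the decoded signed-transaction bytes from the attachment."""
    attachment = message[field]
    if attachment.get("mime-type") != INDY_SIG_MIME:
        raise ValueError("attachment is not an Indy-formatted signed transaction")
    return base64.b64decode(attachment["data"]["base64"])


# Illustrative message with a made-up signed transaction.
signed_txn = b'{"signed": "txn"}'
message = {
    "@type": "https://didcomm.org/to/be/defined",
    "report~attach": {
        "mime-type": INDY_SIG_MIME,
        "data": {"base64": base64.b64encode(signed_txn).decode("ascii")},
    },
}
recovered = extract_signed_txn(message)
```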

kdenhartog avatar Oct 20 '20 05:10 kdenhartog

So I checked with Evernym folks. We have several signing algorithms separate from RFC 0017 that are not JWS-centric, and we are inconsistent; in at least one place, we sign the raw bytes, and in another we sign the base64-encoded representation instead. However, this data point is mostly irrelevant, because it's not about RFC 0017 specifically.

After reading RFC 7515, I concur with @tmarkovski and @sklump that if we are using JWS in Aries RFC 0017, we have to sign the stream of bytes that consists of:

ASCII(BASE64URL(UTF8(JWS Protected Header)) || '.' || BASE64URL(JWS Payload))

This is different from my assumption that I articulated above, where I said I believed we needed to sign the raw bytes. I was incorrect.

I believe the text of RFC 0017 may benefit from a slight clarification, as follows.

Old text:

Embedded and appended attachments support signatures by the addition of a data.jws field containing a signature in JWS (RFC 7515) format with Detached Content. The payload of the JWS is the raw data of the attachment, whether externally referenced or encoded in base64 format, and is not contained within the signature itself.

Proposed new text:

Embedded and appended attachments support signatures by the addition of a data.jws field containing a signature in JWS (RFC 7515) format with Detached Content. The payload of the JWS is the raw bytes of the attachment, appropriately base64url-encoded per JWS rules. If these raw bytes are incorporated by value in the DIDComm message, they are already base64url-encoded in data.base64 and are thus directly substitutable for the missing data.jws.payload field; if they are externally referenced, then the bytes must be fetched via the URI in data.links and base64url-encoded before the JWS can be fully reconstituted.

If people agree with this clarification, I am happy to raise a PR.

All of this is somewhat of a distraction from @swcurran 's original question, which was about strings. I would prefer that string values appear inside data.base64 just like the raw bytes of files or other data types, and that we not create a data.string representation. I don't see any reason that a base64'ed string inside data.base64 would not work, except that it feels a bit clumsy. But an attachment that's a simple string feels a bit odd, anyway...

However, I'm not stuck on this answer. If the consensus of the group is that we want data.string, I could go along. We would just have to update my clarification above so that we explain how to deal with a JWS payload in an additional possible location.

I like what you're proposing here to reduce bloat and redundancy in the message. To further contribute to the language, we could directly reference the detached JWS format which essentially is what you've described.

kdenhartog avatar Oct 20 '20 05:10 kdenhartog