
aead_aes256_gcm encryption modes

Open Zipdox2 opened this issue 1 year ago • 9 comments

Description

The documentation is missing information about the aead_aes256_gcm encryption modes. There was an issue opened about this before but it was closed as these modes were "experimental". The desktop client has been using these modes for a while now so I assume they aren't experimental anymore. Using AES256 encryption could speed up cryptography in bots significantly as many processors have acceleration for it.

Steps to Reproduce

Read the docs https://discord.com/developers/docs/topics/voice-connections#establishing-a-voice-udp-connection-encryption-modes

Expected Behavior

The table lists the aead_aes256_gcm based modes as well.

Current Behavior

It only lists the xsalsa20_poly1305 based modes.

Screenshots/Videos

No response

Client and System Information

N/A

Zipdox2 avatar Apr 07 '23 11:04 Zipdox2

More specifically this is the Ready payload that I receive:

{
  "op": 2,
  "d": {
    "streams": [
      {
        "type": "video",
        "ssrc": 256235,
        "rtx_ssrc": 256236,
        "rid": "",
        "quality": 0,
        "active": false
      }
    ],
    "ssrc": 256234,
    "port": 50010,
    "modes": [
      "aead_aes256_gcm_rtpsize",
      "aead_aes256_gcm",
      "aead_xchacha20_poly1305_rtpsize",
      "xsalsa20_poly1305_lite_rtpsize",
      "xsalsa20_poly1305_lite",
      "xsalsa20_poly1305_suffix",
      "xsalsa20_poly1305"
    ],
    "ip": "redacted",
    "experiments": [
      "fixed_keyframe_interval"
    ]
  }
}

If we could get documentation on all modes, that'd be great. The currently undocumented modes are:

  • aead_aes256_gcm_rtpsize
  • aead_aes256_gcm
  • aead_xchacha20_poly1305_rtpsize
  • xsalsa20_poly1305_lite_rtpsize

OoLunar avatar Apr 08 '23 03:04 OoLunar

I began work around a week ago mapping these encryption modes and voice API v7. I have not yet had success using any of the aead-labeled modes (technically xsalsa20 is also an aead mode, though), as I assume Discord is passing something in the AAD field during encryption. Going by RFC 7714 section 8.2, the AAD should be a repeat of the header we generate for the payload, but that didn't work for me during testing, so I could be entirely wrong. I was able to both encrypt and decrypt arbitrary data, and I assume the RTP header hasn't changed. I'm also working under the assumption that the new ciphers use the same nonce generation as xsalsa20_poly1305_lite (xsalsa20_poly1305_lite_rtpsize does, by the way, and the RFCs suggest the others should as well). That leaves the optional AAD field as the only one I haven't addressed. As far as I've been able to tell, _rtpsize doesn't change anything, but I've also not been able to find a reference to it in any RFC or cipher mode, so it may just be an internal label.

Given that work on voice was on the back burner as of issue #2125, it's possible some part of it was never finalized, or that a public release wasn't a priority: xsalsa20 support may be around for quite some time, and the cipher used for VC encryption doesn't matter much as long as xsalsa20 hasn't been cracked. Obsolete certainly, but not broken.

I've plans this week to reach out to someone on the voice team, if I can find one that's willing to speak, for clarification of what they're expecting to receive in the AAD field. If I'm correct, and this is what I am missing, I should have all aead modes working with my library once I know what's expected. I'm primarily interested in xchacha20_poly1305 encryption as it is the successor to the obsolete and aging xsalsa20_poly1305 cipher. I'd ask other library maintainers, but it doesn't seem like much is going on in the exploratory field, and haven't had luck receiving answers to my other queries, so I've been on my own with my faithful team for the most part.

My plan is/was to address all features my library (and others) lack, primarily voice API v7, webRTC support (which requires no extra encryption), and support for the new ciphers, then update Discord's dated voice documentation with my findings. It does seem, though, that I've gotten the furthest in my endeavors, as I've been unable to find reference to these ciphers in a working state in any other place.

Perhaps it's a placebo effect, or I fixed a bug in my library while testing, but voice API v7 sounds great compared to my experience with v4. I always had an issue with artifacting in my audio stream, which does not exist in v7.

I'll update you with my findings when I have more to work with. I've also sent some of my findings to the userdoccers documentation.

So TL;DR, operationally xsalsa20_poly1305_lite_rtpsize is the same as xsalsa20_poly1305_lite, and I'm missing the final piece to get the other modes working. Reaching out directly this week for more info.

elderlabs avatar Apr 10 '23 04:04 elderlabs

Hello,

Thanks for reaching out and for the detailed review. I'm a software engineer on the team responsible for these encryption modes.

We do intend to document these soon. We are currently phasing out some older modes and will likely not document deprecated modes. The two I expect we will document are aead_aes256_gcm_rtpsize and aead_xchacha20_poly1305_rtpsize. Because we know that changing encryption modes is likely to be significant work, we wanted to avoid asking devs to change modes repeatedly if it could be avoided, but it looks like we have likely settled on these modes for the foreseeable future. I don't have an exact date of when to expect updated documentation but it is something we are aware of. The existing modes will continue to function for now.
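For library authors reading along, here is a minimal sketch of how a client might pick one of those two modes from the "modes" array in the Ready payload shown earlier in the thread. The function name `choose_mode` and the preference order (AES first, then XChaCha20) are assumptions for illustration only and are not something Discord has specified here.

```python
import json

# The two modes named above as the ones expected to be documented.
# The ordering is an assumption for this sketch, not a documented preference.
PREFERRED_MODES = (
    "aead_aes256_gcm_rtpsize",
    "aead_xchacha20_poly1305_rtpsize",
)


def choose_mode(ready_payload: str) -> str:
    """Return the first preferred mode offered in a voice Ready (op 2) payload."""
    offered = json.loads(ready_payload)["d"]["modes"]
    for mode in PREFERRED_MODES:
        if mode in offered:
            return mode
    raise ValueError("server offered none of the preferred modes: %r" % (offered,))
```

A caller would pass the raw JSON text of the op 2 message and send the returned string back when selecting the protocol. Whether to fall back to an older mode instead of raising is a choice left to the library.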

brian-armstrong-discord avatar Apr 11 '23 21:04 brian-armstrong-discord

Regarding voice websocket versioning

Perhaps it's a placebo effect, or I fixed a bug in my library while testing, but voice API v7 sounds great compared to my experience with v4. I always had an issue with artifacting in my audio stream, which does not exist in v7.

The versioning is purely limited to the control plane and will not have any effect on the data plane/rtp streams. There is also actually very little difference between v4 and v7. I've briefly reviewed the changes and it looks like all changes relate either to video or to internal instrumentation. I suspect there is likely nothing relevant here for bot devs but we will review again when we release documentation.

brian-armstrong-discord avatar Apr 11 '23 22:04 brian-armstrong-discord

You are amazing, thank you for communicating with us.

OoLunar avatar Apr 11 '23 22:04 OoLunar

Thank you Brian for the prompt update. Is there any chance of getting a brief idea of what's expected in the aead payloads ahead of time, or is it a simple "wait for the documentation update?" My hope was to make an attempt at documenting the new modes in a PR (though seems you guys may already be on that), but I'm missing something that the RFC doesn't lend solid definition to. I'm thinking it's something in the AAD field, as I'm providing the header, nonce, and encrypted data, but haven't figured out what it is. RFC 7714 suggested it can be the contents of the header a second time, but that didn't work in my testing.

I'm also curious if there are plans to document connecting with webRTC and supported video modes, if any. I know of a few meeting recording/sharing bots that would love the ability to decode video data if they knew what layout the data's in.

I expect it'll be a patience/"wait and see" response, but figured I'd ask.

Thanks again for the update. Let's hope things move along smoothly.

elderlabs avatar Apr 12 '23 00:04 elderlabs

Hello, are there any updates on the timeline for this? Thank you.

techchrism avatar Nov 05 '23 04:11 techchrism

https://git.kaydax.xyz/w/algos/src/branch/main/doc/crypt.md

Zipdox2 avatar Apr 14 '24 02:04 Zipdox2

Thanks @Zipdox2. My work here is complete. https://github.com/elderlabs/BetterDisco/commit/d988d6a8

EDIT: for clarity, and to answer my post above from a year ago, we were nearly there and were missing one key detail: the nonce size. AES256-GCM takes a 12-byte nonce, whereas all the other new modes take a 24-byte nonce. All new modes follow the same nonce format as xsalsa20_poly1305_lite. xchacha20 and AES256-GCM also take additional authenticated data (AAD), which is the full RTP header, as noted above from RFC 7714 section 8.2. I would expect a number of formats to be deprecated at some point in the future, particularly most of the xsalsa20 modes, for the reasons I noted above.
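To make the nonce and AAD details concrete, here is a rough standard-library sketch of the packet framing only, with no actual encryption. It assumes the xsalsa20_poly1305_lite nonce format is a 32-bit incrementing counter appended to the packet and zero-padded to the cipher's nonce length, and it assumes big-endian byte order and the header bytes 0x80 0x78 from the existing voice documentation. The helper names are made up for illustration.

```python
import struct
from typing import Tuple

# Nonce lengths as stated above: 12 bytes for AES256-GCM, 24 for XChaCha20.
NONCE_LEN = {
    "aead_aes256_gcm_rtpsize": 12,
    "aead_xchacha20_poly1305_rtpsize": 24,
}


def rtp_header(sequence: int, timestamp: int, ssrc: int) -> bytes:
    """Build a fixed 12-byte RTP header; this is also what gets passed as AAD."""
    # 0x80 = RTP version 2 with no padding/extension/CSRC; 0x78 = payload type
    # (both values are assumptions taken from the existing voice docs).
    return struct.pack(">BBHII", 0x80, 0x78,
                       sequence & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def lite_nonce(counter: int, mode: str) -> Tuple[bytes, bytes]:
    """Return (full_nonce, suffix): the padded nonce and the 4 bytes sent on the wire."""
    suffix = struct.pack(">I", counter & 0xFFFFFFFF)  # byte order is an assumption
    return suffix.ljust(NONCE_LEN[mode], b"\x00"), suffix


def build_packet(header: bytes, ciphertext_and_tag: bytes, suffix: bytes) -> bytes:
    """Frame a packet as header || ciphertext+tag || 4-byte nonce suffix."""
    return header + ciphertext_and_tag + suffix
```

The cipher call itself is left out so the sketch has no third-party dependency. With the `cryptography` package, for example, the AES case would be roughly `AESGCM(key).encrypt(nonce, opus_frame, header)`, with the header passed as the associated data. Packets that carry an RTP header extension are not covered here.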

elderlabs avatar Apr 15 '24 04:04 elderlabs

The documentation has been updated with all current encryption modes.

birarda avatar Aug 16 '24 14:08 birarda