
Support to decode application/vnd.syncml.dm+wbxml

Open nutterthanos opened this issue 7 months ago • 3 comments

Note: as an example, Samsung uses it for software updates on their devices.


Does this affect you too? Click below and add a :+1: to vote for this and help decide where HTTP Toolkit goes next, or go vote on the other most popular ideas so far.

nutterthanos avatar Jun 03 '25 11:06 nutterthanos

Ooooh fun, that's an unusual case. It looks like wbxml was part of the WAP specification years ago (https://www.w3.org/1999/06/NOTE-wbxml-19990624/) - it's just XML in a binary format.

We do have reasonable support for XML already, so all this would require is adding support for WBXML by converting it to an XML string, and then just reusing the normal XML behaviour. Unfortunately that doesn't look easy: because the format is quite rare & unusual, there don't seem to be many existing implementations or libraries we could easily use (wbxml seems the best, but it may be Node-only - we'd need to use it in a browser - and it hasn't been maintained in 12 years).

The best approach would be for somebody to write a WBXML decoder from scratch (decode only - we don't need to be able to encode this I think).

I'm very unlikely to pick this up myself in the short term since I'm super busy, and I doubt this comes up often (although if many other people vote/comment here then I'm willing to be proven wrong). I would happily accept a PR for an implementation though, if anybody else is interested. I would guess that publishing a standalone modern WBXML decoder JS library would be quite a useful open source project in general, so it could be an interesting project, since there don't seem to be any good, actively maintained ones available.

pimterry avatar Jun 03 '25 12:06 pimterry

In fact WBXML seems a bit more complicated than I had imagined: the way it works is that it starts with a public identifier (an FPI) which references a DTD, and for various DTDs WBXML defines codepages. Codepages link ids to tag & attribute names.
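To make the header layout concrete, here is a minimal sketch of reading it, assuming the layout from the WBXML spec linked above: a version byte, the public identifier, a charset and a string table, with the multi-byte integers stored 7 bits per byte. The names `readMbUint32`, `parseWbxmlHeader` and `WbxmlHeader` are hypothetical and not part of HTTP Toolkit or any existing library.

```typescript
// Result of reading a multi-byte integer: the value and the next offset.
interface Cursor {
  value: number;
  next: number;
}

// WBXML multi-byte integer (mb_u_int32): 7 bits per byte, with the high bit
// set on every byte except the last one.
function readMbUint32(bytes: Uint8Array, offset: number): Cursor {
  let value = 0;
  let pos = offset;
  while (pos < bytes.length) {
    const byte = bytes[pos++];
    value = value * 128 + (byte & 0x7f);
    if ((byte & 0x80) === 0) return { value, next: pos };
  }
  throw new Error("Truncated multi-byte integer");
}

// Hypothetical shape for the parsed header (illustration only).
interface WbxmlHeader {
  version: string;
  publicId: number | null; // numeric public ID, or null if given as a string
  publicIdStringIndex: number | null; // string table offset of the FPI, if any
  charset: number; // IANA MIBenum, e.g. 106 for UTF-8
  stringTable: Uint8Array;
  bodyOffset: number; // where the tokenised body starts
}

function parseWbxmlHeader(bytes: Uint8Array): WbxmlHeader {
  if (bytes.length < 4) throw new Error("Too short to be WBXML");

  // Upper four bits hold the major version minus one, lower four the minor.
  const version = `${(bytes[0] >> 4) + 1}.${bytes[0] & 0x0f}`;

  let cursor = readMbUint32(bytes, 1);
  let publicId: number | null = cursor.value;
  let publicIdStringIndex: number | null = null;
  if (publicId === 0) {
    // A zero public ID means the FPI is a literal string in the string table.
    cursor = readMbUint32(bytes, cursor.next);
    publicIdStringIndex = cursor.value;
    publicId = null;
  }

  cursor = readMbUint32(bytes, cursor.next);
  const charset = cursor.value;

  cursor = readMbUint32(bytes, cursor.next);
  const tableEnd = cursor.next + cursor.value;
  if (tableEnd > bytes.length) throw new Error("Truncated string table");

  return {
    version,
    publicId,
    publicIdStringIndex,
    charset,
    stringTable: bytes.subarray(cursor.next, tableEnd),
    bodyOffset: tableEnd,
  };
}

// Example: version 1.2, an arbitrary two-byte public ID, UTF-8, empty table.
const header = parseWbxmlHeader(new Uint8Array([0x02, 0xa4, 0x01, 0x6a, 0x00]));
console.log(header.version, header.publicId, header.charset);
```

The public ID (or the FPI string it points to) is what a decoder would then use to pick the right set of codepages.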

In practice, that means the binary XML will effectively be <1><2 5="hello"></2></1> and you will need to look up 1/2/5 in the corresponding codepage for the document to get the actual names. That makes this quite a bit harder - we'd need to embed lots of codepages and know how to link them to FPIs somehow, or find (or host ourselves) a service to retrieve these on demand. Definitely possible, but that makes this harder and probably moves it further down the todo list unless there's clear demand for this.
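As a rough illustration of the codepage lookup, here is a simplified sketch of decoding a WBXML body into an XML string. It assumes the token layout from the WBXML spec (low 6 bits are the tag id, bit 6 means the element has content, bit 7 means it has attributes) and only handles tags, inline strings, END and codepage switches; attributes, string table references, opaque data and entities are deliberately left out. `decodeBody`, `CodePages` and the tag names in `examplePages` are all hypothetical, not a real DTD's codepages.

```typescript
// Code page number -> tag id -> tag name. In a real decoder these tables
// would be selected by the document's public ID.
type CodePages = Record<number, Record<number, string>>;

const SWITCH_PAGE = 0x00; // next byte selects the active code page
const END = 0x01; // closes the current element
const STR_I = 0x03; // inline, null-terminated string

function escapeText(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function decodeBody(body: Uint8Array, pages: CodePages): string {
  const decoder = new TextDecoder("utf-8"); // assumes a UTF-8 document
  const openTags: string[] = [];
  let page = 0;
  let xml = "";
  let i = 0;

  while (i < body.length) {
    const token = body[i++];
    if (token === SWITCH_PAGE) {
      if (i >= body.length) throw new Error("Truncated SWITCH_PAGE");
      page = body[i++];
    } else if (token === END) {
      const name = openTags.pop();
      if (name === undefined) throw new Error("Unbalanced END token");
      xml += `</${name}>`;
    } else if (token === STR_I) {
      const end = body.indexOf(0x00, i);
      if (end === -1) throw new Error("Unterminated inline string");
      xml += escapeText(decoder.decode(body.subarray(i, end)));
      i = end + 1;
    } else if ((token & 0x3f) < 5) {
      // Remaining global tokens (LITERAL, OPAQUE, EXT_*, ...) are not handled.
      throw new Error(`Unsupported global token 0x${token.toString(16)}`);
    } else if (token & 0x80) {
      throw new Error("Attributes are not supported in this sketch");
    } else {
      const id = token & 0x3f;
      // Fall back to a placeholder name when the code page has no entry.
      const name = pages[page]?.[id] ?? `unknown-${page}-${id}`;
      if (token & 0x40) {
        openTags.push(name);
        xml += `<${name}>`;
      } else {
        xml += `<${name}/>`;
      }
    }
  }
  return xml;
}

// Hypothetical code pages, for illustration only.
const examplePages: CodePages = {
  0: { 5: "Outer", 6: "Inner" },
  1: { 5: "Meta" },
};

const exampleBody = new Uint8Array([
  0x45, // tag 5 on page 0, with content
  0x46, // tag 6 on page 0, with content
  0x03, 0x68, 0x69, 0x00, // inline string "hi"
  0x01, // END (closes tag 6)
  0x00, 0x01, // switch to code page 1
  0x05, // tag 5 on page 1, no content
  0x01, // END (closes tag 5 from page 0)
]);

console.log(decodeBody(exampleBody, examplePages));
// <Outer><Inner>hi</Inner><Meta/></Outer>
```

Note that the same id (5 here) maps to different names depending on the active codepage, which is why the tables have to come from somewhere before the output is readable.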

pimterry avatar Jun 03 '25 12:06 pimterry

There is a more recently updated library that can more or less do what's required, but sadly it isn't Node.js: https://github.com/libwbxml/libwbxml

Again, I assume this will be, like you said, near the bottom of the to-do list.

Note that the code seems to be in C.

nutterthanos avatar Jun 03 '25 13:06 nutterthanos