Wireshark dissectors as target language

Open GreyCat opened this issue 8 years ago • 26 comments

A few people have already asked for compiling .ksy into Wireshark dissector.

This will be an umbrella issue to track progress toward this goal.

GreyCat avatar Nov 14 '16 09:11 GreyCat

Hey, guess what I've found - https://github.com/joushx/kaitai-to-wireshark And it's 2 months old :)

GreyCat avatar Nov 23 '16 10:11 GreyCat

Hi, I'm the author of the tool mentioned. (I think it was @GreyCat who contacted me via Twitter).

I'm not sure if a Wireshark dissector is really suitable to be treated as a target language. It does not require a runtime to extract values, but rather needs a transpiler from the Kaitai description language to the description language of Wireshark (in C++ or Lua).

I had a look at the compiler interface and would not know how to implement these methods, as you do not deal with classes but construct a single dissect function in Lua.

joushx avatar Nov 24 '16 15:11 joushx

First of all, thanks for coming, @joushx :)

I believe that we can find some parallels between the KS reference compiler architecture and what needs to be done for Wireshark Lua output.

For example, I see that one is expected to declare fields in this manner:

  logical_screen:add(buffer(0,2), f.image_width)
  logical_screen:add(buffer(2,2), f.image_height)
  logical_screen:add(buffer(4,1), f.flags)

While it's possible to count these offsets from the beginning of the structure in the compiler, maybe we can add an internal counter inside Lua and do something like this (ok, disclaimer: I don't know anything about Lua syntax)?

  local ofs = 0
  logical_screen:add(buffer(ofs,2), f.image_width)
  ofs = ofs + 2
  logical_screen:add(buffer(ofs,2), f.image_height)
  ofs = ofs + 2
  logical_screen:add(buffer(ofs,1), f.flags)
  ofs = ofs + 1

This way it will work even if we encounter a variable-length structure (i.e. a string whose length is determined by some previously parsed data). Could we use your Lua and Wireshark expertise to suggest such pieces of code to us?
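To illustrate the running-offset idea from the compiler's side, here is a minimal Python sketch of a generator that emits such Lua statements. The function name `emit_lua_fields` and the `(name, size)` tuples are hypothetical and not part of the Kaitai Struct compiler. A size may be a fixed byte count or a Lua expression computed from earlier fields. The sketch writes the increment as `ofs = ofs + n` because Lua has no compound assignment, and it uses the `tree:add(field, buffer(...))` argument order seen in the generated dissector later in this thread.

```python
from typing import List, Tuple, Union

# A size is either a fixed byte count or a Lua expression that is
# evaluated at dissect time (e.g. a previously parsed length).
Size = Union[int, str]


def emit_lua_fields(tree: str, fields: List[Tuple[str, Size]]) -> str:
    """Emit Lua statements that add each field at a running offset."""
    lines = ["local ofs = 0"]
    for name, size in fields:
        lines.append(f"{tree}:add(f.{name}, buffer(ofs, {size}))")
        # Lua has no `+=`, so the increment is spelled out.
        lines.append(f"ofs = ofs + {size}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_lua_fields(
        "logical_screen",
        [("image_width", 2), ("image_height", 2), ("flags", 1)],
    ))
```

Because the offset is advanced at run time, the generator never needs to know absolute positions, which is what makes variable-length fields workable.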

Even if our default compiler model doesn't fit Wireshark dissectors well enough, there's a plan B. The generator doesn't necessarily have to be derived from ClassCompiler; it can be a pretty much standalone thing like GraphvizClassCompiler.

GreyCat avatar Nov 24 '16 15:11 GreyCat

It really looks like it isn't that hard to create the Lua output using the AbstractCompiler. :)

This would work of course. (The syntax is correct)

Maybe I could even have a look myself as I'm really interested to have a single binary description format I can use to debug protocols in wireshark.

joushx avatar Nov 25 '16 10:11 joushx

I just made a little test and implemented a few field types: https://github.com/joushx/kaitai_struct_compiler/commit/4ae163e0cca79e3621e4b681b4463956cbcbff99 (Please ignore the ugly code; I don't know Scala btw.)

joushx avatar Nov 25 '16 13:11 joushx

Wow, that was fast! I'm testing it now and will add some comments, if you'd like to see them ;)

GreyCat avatar Nov 25 '16 15:11 GreyCat

It is not quite ready for a review to be honest. Just wanted to know if it works like I thought.

joushx avatar Nov 25 '16 15:11 joushx

Could you lend me a hand on how to test it in Wireshark? I've read that one needs to copy the resulting lua script to ~/.wireshark/plugins - but what then? How do I force Wireshark to use that dissector?

GreyCat avatar Nov 25 '16 15:11 GreyCat

At the bottom of the generated file there is a <port> placeholder, which for now should be replaced with some arbitrary port. I tested it like this: listen on a loopback interface in Wireshark and shuffle data around with netcat or something similar.

cat foo.gif | nc 127.0.0.1 9874 and nc -l 9874

If it does not work, please make sure your copy of Wireshark is compiled with Lua support (that should be the default). Under Help > About you can also get a list of loaded dissectors.

joushx avatar Nov 25 '16 15:11 joushx

Ok, got that. Actually, I have a very good example right from my workshop:

That's a very simple UDP-based protocol that is basically used to transfer lines to show at tram transport signs.

GreyCat avatar Nov 25 '16 15:11 GreyCat

I wonder if it's possible to write some automated tests for running Wireshark dissectors? Is it possible, for example, to load stuff into Wireshark and do a text dump (that can be diffed against some expected output) from the command line?

GreyCat avatar Nov 25 '16 15:11 GreyCat

Then you might want to right-click, select Decode As, and change the port.

I don't know, but it might work somehow with tshark.

joushx avatar Nov 25 '16 15:11 joushx

Simple seq structures (like hello_world.ksy) work like a charm ;) A slightly more complex one with a single substructure, like transport_sign.ksy, fails on the following:


transport_sign_proto = Proto("transport_sign","TransportSign")

local f = transport_sign_proto.fields
f.line_num_len = ProtoField.uint8("line.line_num_len", "line_num_len")
f.dest_len = ProtoField.uint8("line.dest_len", "dest_len")

function transport_sign_proto.dissector(buffer, pinfo, tree)
	pinfo.cols.protocol = "TransportSign"
	main = tree:add(transport_sign_proto, "TransportSign")

	local offset = 0
	main:add(f.num_lines, buffer(offset,4))
        -- ^^^ right on this line

GreyCat avatar Nov 25 '16 15:11 GreyCat

Can you give me an example of how to use substructures in a protocol dissector in Lua?

GreyCat avatar Nov 25 '16 15:11 GreyCat

I'm currently studying your other project. It looks like we're supposed to somehow map all possible field combinations in advance, like this:

f.typecode = ProtoField.uint8("adsb.typecode", "Format type code", base.DEC, {[1] = "Aicraft identification", [2] = "Aicraft identification", [3] = "Aicraft identification", [4] = "Aicraft identification",[5] = "Surface position", [6] = "Surface position", [7] = "Surface position", [8] = "Surface position", [9] = "Airborne position (Baro Alt)", [10] = "Airborne position (Baro Alt)", [11] = "Airborne position (Baro Alt)", [12] = "Airborne position (Baro Alt)", [13] = "Airborne position (Baro Alt)", [14] = "Airborne position (Baro Alt)", [15] = "Airborne position (Baro Alt)", [16] = "Airborne position (Baro Alt)", [17] = "Airborne position (Baro Alt)", [18] = "Airborne position (Baro Alt)", [19] = "Airborne velocity", [20] = "Airborne position (GNSS Height)", [21] = "Airborne position (GNSS Height)", [22] = "Airborne position (GNSS Height)", [23] = "Test message" }, 0xf8)
f.ident = ProtoField.string("adsb.ident", "Aircraft Identification")
f.i_first = ProtoField.uint8("adsb.ident.first", "First letter", base.DEC, {[1] = "A", [2] = "B", [3] = "C", [4] = "D", [5] = "E", [6] = "F", [7] = "G", [8] = "H", [9] = "I", [10] = "J", [11] = "K", [12] = "L", [13] = "M", [14] = "N", [15] = "O", [16] = "P", [17] = "Q", [18] = "R", [19] = "S", [20] = "T", [21] = "U", [22] = "V", [23] = "W", [24] = "X", [25] = "Y", [26] = "Z", [32] = "_", [48] = "0", [49] = "1", [50] = "2", [51] = "3", [52] = "4", [53] = "5", [54] = "6", [55] = "7", [56] = "8", [57] = "9"}, 0xfc)
f.i_second = ProtoField.uint8("adsb.ident.second", "Second letter", base.DEC, {[1] = "A", [2] = "B", [3] = "C", [4] = "D", [5] = "E", [6] = "F", [7] = "G", [8] = "H", [9] = "I", [10] = "J", [11] = "K", [12] = "L", [13] = "M", [14] = "N", [15] = "O", [16] = "P", [17] = "Q", [18] = "R", [19] = "S", [20] = "T", [21] = "U", [22] = "V", [23] = "W", [24] = "X", [25] = "Y", [26] = "Z", [32] = "_", [48] = "0", [49] = "1", [50] = "2", [51] = "3", [52] = "4", [53] = "5", [54] = "6", [55] = "7", [56] = "8", [57] = "9"}, 0x3f0)
-- ...

f.v_subtype = ProtoField.uint8("adsb.velocity.subtype", "Subtype", base.DEC, {[1] = "Subsonic ground speed", [2] = "Supersonic ground speed", [3] = "Subsonic air speed", [4] = "Supersonic air speed"}, 0x7)
f.v_intchange = ProtoField.bool("adsb.velocity.intentchange", "Intent change flag", 8, nil, 0x80)
f.v_uncertainty = ProtoField.uint8("adsb.velocity.uncertainty", "Velocity uncertainty", nil, nil, 0x38)

What do we do if we need to dissect some infinitely nested structure, i.e. something like a TIFF (or RIFF) file?

GreyCat avatar Nov 25 '16 15:11 GreyCat

Hey, tshark -T json -j hello_world -r pkts.pcap actually works pretty well:

[
  {
    "_index": "packets-2016-11-25",
    "_type": "pcap_file",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "filtered": "frame"
        },
        "eth": {
          "filtered": "eth"
        },
        "ip": {
          "filtered": "ip"
        },
        "udp": {
          "filtered": "udp"
        },
        "_ws.lua.fake": "",
        "hello_world": {
          "hello_world.one": "4"
        }
      }
    }
  }
]

GreyCat avatar Nov 25 '16 16:11 GreyCat

It fails because I have not yet implemented the u4 type. ;) I'll need to go through all available types to make sure I do not forget any.

Nested fields are indeed a problem. Maybe I can have a look at some other projects and see how it is solved there.

Looks good. :)

tshark -r pkts.pcap -X lua_script:transport_sign.lua -d udp.port==1-65535,transport_sign -T json (where -d forces it to use the given dissector)

{
    "_index": "packets-2016-11-25",
    "_type": "pcap_file",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "frame.encap_type": "1",
          "frame.time": "Sep 19, 2016 15:40:08.576182000 CEST",
          "frame.offset_shift": "0.000000000",
          "frame.time_epoch": "1474292408.576182000",
          "frame.time_delta": "95.869584000",
          "frame.time_delta_displayed": "95.869584000",
          "frame.time_relative": "1966.222449000",
          "frame.number": "21",
          "frame.len": "200",
          "frame.cap_len": "200",
          "frame.marked": "0",
          "frame.ignored": "0",
          "frame.protocols": "eth:ethertype:ip:udp:transport_sign"
        },
        "eth": {
          "eth.dst": {
            "eth.dst_resolved": "Tp-LinkT_78:74:a2",
            "eth.addr": "64:70:02:78:74:a2",
            "eth.addr_resolved": "Tp-LinkT_78:74:a2",
            "eth.lg": "0",
            "eth.ig": "0"
          },
          "eth.src": {
            "eth.src_resolved": "Cisco_28:c1:98",
            "eth.addr": "00:19:e7:28:c1:98",
            "eth.addr_resolved": "Cisco_28:c1:98",
            "eth.lg": "0",
            "eth.ig": "0"
          },
          "eth.type": "0x00000800"
        },
        "ip": {
          "ip.version": "4",
          "ip.hdr_len": "20",
          "ip.dsfield": {
            "ip.dsfield.dscp": "0",
            "ip.dsfield.ecn": "0"
          },
          "ip.len": "186",
          "ip.id": "0x000073d1",
          "ip.flags": {
            "ip.flags.rb": "0",
            "ip.flags.df": "1",
            "ip.flags.mf": "0"
          },
          "ip.frag_offset": "0",
          "ip.ttl": "64",
          "ip.proto": "17",
          "ip.checksum": "0x00003fd6",
          "ip.checksum.status": "2",
          "ip.src": "192.168.1.65",
          "ip.addr": "192.168.1.65",
          "ip.src_host": "192.168.1.65",
          "ip.host": "192.168.1.65",
          "ip.dst": "192.168.3.250",
          "ip.addr": "192.168.3.250",
          "ip.dst_host": "192.168.3.250",
          "ip.host": "192.168.3.250"
        },
        "udp": {
          "udp.srcport": "41340",
          "udp.dstport": "9730",
          "udp.port": "41340",
          "udp.port": "9730",
          "udp.length": "166",
          "udp.checksum": "0x00001660",
          "udp.checksum.status": "2",
          "udp.stream": "20"
        },
        "_ws.lua.fake": "",
        "transport_sign": {
          "transport_sign.num_lines": "50331648",
          "_ws.lua.text": {
            "line.line_num": "134217728",
            "line.line_num_len": "1",
            "line.dest_len": "56",
            "line.code1": "504493787",
            "line.timestamp": "3663160681"
          }
        }
      }
    }
  }

[...]

joushx avatar Nov 25 '16 20:11 joushx

Great job! I guess we can extend our testing framework to support Wireshark dissectors too, working like this:

  • We compile all tests as usual (./build-formats); if the compiler supports Wireshark output, it will generate a compiled/wireshark dir with tons of dissectors (that already works).
  • We create ./run-wireshark that will perform "testing" by running that tshark ... command, saving the JSON output, and diffing it against the expected output. Ideally, that should create a JUnit-like report (or basically any structured report, I'll parse it, no problem) that could be integrated into our CI system.
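The compare-and-report half of such a runner could look roughly like the Python sketch below. The names `compare_json` and `junit_report` are hypothetical, the report contains only a bare `testsuite` / `testcase` / `failure` skeleton, and actually invoking tshark is left out. One caveat: tshark's JSON may repeat keys within one object (as `ip.addr` does in the dump above), and `json.loads` keeps only the last value for a repeated key.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def compare_json(actual_text: str, expected_text: str) -> Optional[str]:
    """Return None if both JSON documents are equal, else a failure message."""
    actual = json.loads(actual_text)
    expected = json.loads(expected_text)
    if actual == expected:
        return None
    return (
        f"expected {json.dumps(expected, sort_keys=True)}, "
        f"got {json.dumps(actual, sort_keys=True)}"
    )


def junit_report(results: Dict[str, Optional[str]]) -> str:
    """Build a JUnit-like XML report from {test name: failure message or None}."""
    failures = sum(1 for msg in results.values() if msg is not None)
    suite = ET.Element(
        "testsuite",
        name="wireshark",
        tests=str(len(results)),
        failures=str(failures),
    )
    for name, msg in results.items():
        case = ET.SubElement(suite, "testcase", name=name)
        if msg is not None:
            ET.SubElement(case, "failure", message=msg)
    return ET.tostring(suite, encoding="unicode")


if __name__ == "__main__":
    results = {
        "hello_world": compare_json(
            '{"hello_world.one": "80"}', '{"hello_world.one": "80"}'
        ),
    }
    print(junit_report(results))
```

Comparing parsed JSON rather than raw text keeps the check insensitive to whitespace and key order in the saved expected files.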

Right now, the Wireshark dissector target would probably pass only a few tests, but eventually we can build it up so that it supports more and more.

GreyCat avatar Nov 25 '16 21:11 GreyCat

Hey, I've played with tshark a bit and created a simple script tshark-one:

#!/bin/sh

BIN_FILE=$1
LUA_DISSECTOR=$2

od -Ax -tx1 -v "$BIN_FILE" | text2pcap -q -l 147 - tmp.pcap
tshark \
	-r tmp.pcap \
	-X "lua_script:compiled/wireshark/$LUA_DISSECTOR.lua" \
	-o "uat:user_dlts:\"User 0 (DLT=147)\",\"$LUA_DISSECTOR\",\"0\",\"\",\"0\",\"\"" \
	-j "$LUA_DISSECTOR" \
	-T json |
	jq ".[0]._source.layers.$LUA_DISSECTOR"

It is supposed to be used like this:

git/kaitai_struct/tests $ ./tshark-one src/fixed_struct.bin hello_world                                       
{
  "hello_world.one": "80"
}

Internally, it does the following:

  1. Wraps the given binary file into a valid pcap using text2pcap. It uses link_type = 147, which is the first link type reserved for private use, i.e. a "user-defined type", which is perfect for our experiments.
  2. Runs tshark on that pcap with special command line arguments which invoke the dissector. Note that the generated .lua file should not contain lines like these, which bind a particular dissector into the protocol dissection tree:
     tcp_table = DissectorTable.get("tcp.port")
     tcp_table:add(12345, hello_world_proto)
  3. Removes lots of extra noisy lines by using jq, essentially cutting out only the one branch of the JSON tree that we actually need.
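For anyone who would rather not depend on od, text2pcap and jq, the first and last steps can be approximated in Python, as sketched below. This swaps in a different technique than the script uses: it writes the classic pcap file layout directly (a 24-byte global header followed by one 16-byte record header and the payload) instead of going through text2pcap. The helper names `wrap_in_pcap` and `pick_layer` are hypothetical, and the tshark invocation itself is unchanged and not shown.

```python
import json
import struct

LINKTYPE_USER0 = 147  # first link type reserved for private use


def wrap_in_pcap(payload: bytes, linktype: int = LINKTYPE_USER0) -> bytes:
    """Wrap raw bytes into a single-packet classic pcap file image."""
    snaplen = max(65535, len(payload))
    # Global header: magic, version 2.4, thiszone, sigfigs, snaplen, link type.
    global_header = struct.pack(
        "<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, snaplen, linktype
    )
    # Record header: ts_sec, ts_usec, captured length, original length.
    record_header = struct.pack("<IIII", 0, 0, len(payload), len(payload))
    return global_header + record_header + payload


def pick_layer(tshark_json: str, layer: str) -> dict:
    """Keep only one layer of the first packet, like the jq filter does."""
    packets = json.loads(tshark_json)
    return packets[0]["_source"]["layers"][layer]


if __name__ == "__main__":
    # Write a tiny two-byte capture that tshark can then read with -r.
    with open("tmp.pcap", "wb") as out:
        out.write(wrap_in_pcap(b"\x50\x41"))
```

The timestamp is fixed at zero, so repeated runs over the same input produce byte-identical captures, which is convenient when outputs are diffed.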

GreyCat avatar Nov 26 '16 02:11 GreyCat

Very nice! However, I don't really know how to integrate these tests into the project files.

joushx avatar Nov 27 '16 12:11 joushx

Was there any movement on this? I recently discovered Kaitai and I have been playing around with Wireshark dissectors and this would be an amazing feature.

tmcnag avatar Jun 28 '17 20:06 tmcnag

Unfortunately, no. There is a (pretty outdated) branch that supports compilation into Lua dissectors, and some work has been done on getting testing started, but that's all.

Generally, we need to follow this checklist, as with any other new target language. If you're willing to help, you're most welcome to join :)

GreyCat avatar Jun 28 '17 21:06 GreyCat

I simply did not have enough time to finish it, so feel free to use my edits so far. :)

joushx avatar Jun 29 '17 07:06 joushx

Just chiming in, as I'm very interested in Wireshark dissector support.

cryptocode avatar Jan 09 '18 23:01 cryptocode

@joushx, @GreyCat I'd like to explore using the Lua language output to build a Wireshark dissector. I see that the generated file depends on the kaitaistruct Lua runtime library. How feasible is it, and what steps would be required, to parlay this Lua code into a usable dissector?

winstonwolfe06 avatar Jun 14 '18 14:06 winstonwolfe06

I've made a rudimentary version of this (it also needs forked runtime files), if anyone's interested. It's based on the existing Lua backend.

It only produces working code with the --read-pos option (for example kaitai-struct-compiler --read-pos -t wireshark test.ksy).

From my initial testing, conditionals, switches, and loops seem to work.

Any help or guidance is welcome; I don't actually know Scala.

indivisible avatar Mar 27 '23 12:03 indivisible