kaitai_struct
Wireshark dissectors as target language
A few people have already asked for compiling .ksy into Wireshark dissector.
This would be an umbrella issue to track progress for this goal.
Hey, guess what I've found - https://github.com/joushx/kaitai-to-wireshark And it's 2 months old :)
Hi, I'm the author of the tool mentioned. (I think it was @GreyCat who contacted me via Twitter).
I'm not sure if a Wireshark dissector is really suitable to be treated as a target language. It does not require a runtime to extract values but rather needs a transpiler from the Kaitai description language to the description language of Wireshark (in C++ or Lua).
I had a look at the compiler interface and would not know how to implement these methods, as you do not deal with classes but construct a single dissect function in Lua.
First of all, thanks for coming, @joushx :)
I believe that we can find some parallels between KS reference compiler architecture and what's needed to be done for Wireshark lua output.
For example, I see that one is expected to declare fields in such a manner:
logical_screen:add(buffer(0,2), f.image_width)
logical_screen:add(buffer(2,2), f.image_height)
logical_screen:add(buffer(4,1), f.flags)
While it's possible to count these offsets from the beginning of the structure in a compiler, maybe we can add some internal counter inside Lua and do something like this (ok, disclaimer: I don't know anything about Lua syntax)?
local ofs = 0
logical_screen:add(buffer(ofs,2), f.image_width)
ofs = ofs + 2
logical_screen:add(buffer(ofs,2), f.image_height)
ofs = ofs + 2
logical_screen:add(buffer(ofs,1), f.flags)
ofs = ofs + 1
This way it will work even if we encounter a variable-length structure (i.e. a string whose length is determined by some previously parsed data). Could we use your Lua and Wireshark expertise to suggest such pieces of code to us?
Even if our default compiler model doesn't fit Wireshark dissectors well enough, there's a plan B. Actually, it doesn't necessarily have to be derived from ClassCompiler; it can be pretty much standalone, like GraphvizClassCompiler.
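To make the running-offset idea concrete, here is a minimal sketch in Python (not the actual Scala compiler code) of a generator that emits such Lua lines. The function name emit_fields, the spec format, the len_ prefix, and the title_len/title fields are all made up for illustration. The emitted Lua is written as `ofs = ofs + n`, since Lua has no compound assignment, and it assumes the length field is an unsigned integer that can be read with :uint().

```python
def emit_fields(tree, fields):
    """Return Lua lines that add each field at a running offset.

    fields is a list of (name, size) pairs, where size is either an int
    (fixed width in bytes) or the name of an earlier field holding the length.
    """
    # Fields whose parsed value is needed later as a length.
    length_sources = {size for _, size in fields if isinstance(size, str)}
    lines = ["local ofs = 0"]
    for name, size in fields:
        size_expr = str(size) if isinstance(size, int) else f"len_{size}"
        if name in length_sources:
            # Remember the value so a later field can use it as its size.
            lines.append(f"local len_{name} = buffer(ofs, {size_expr}):uint()")
        lines.append(f"{tree}:add(f.{name}, buffer(ofs, {size_expr}))")
        lines.append(f"ofs = ofs + {size_expr}")
    return lines


# Hypothetical spec: two fixed-size fields, then a length-prefixed string.
spec = [("image_width", 2), ("flags", 1), ("title_len", 1), ("title", "title_len")]
print("\n".join(emit_fields("logical_screen", spec)))
```

Because every size is an expression rather than a precomputed constant, the same loop handles fixed and variable-length fields without the generator having to know absolute offsets.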
It really looks like it isn't that hard to create the Lua output using the AbstractCompiler. :)
This would work of course. (The syntax is correct)
Maybe I could even have a look myself as I'm really interested to have a single binary description format I can use to debug protocols in wireshark.
I just made a little test and implemented a few field types: https://github.com/joushx/kaitai_struct_compiler/commit/4ae163e0cca79e3621e4b681b4463956cbcbff99 (Please ignore the ugly code; I don't know Scala btw.)
Wow, that was fast! I'm testing it now and will add some comments, if you'd like to see them ;)
It is not quite ready for a review to be honest. Just wanted to know if it works like I thought.
Could you lend me a hand on how to test it in Wireshark? I've read that one needs to copy the resulting lua script to ~/.wireshark/plugins - but what then? How do I force Wireshark to use that dissector?
At the bottom of the generated file there is <port>, which for now should be replaced with some random port. I tested it like this: listen on a loopback interface in Wireshark and shuffle data around with netcat or something similar.
cat foo.gif | nc 127.0.0.1 9874
and nc -l 9874
Please make sure your Wireshark copy is compiled with Lua if it does not work (that should be the standard). Using Help > About you can also get a list of loaded dissectors.
Ok, got that. Actually, I have a very good example right from my workshop:
That's a very simple UDP-based protocol that is basically used to transfer lines to show at tram transport signs.
I wonder if it's possible to write some automated tests for running Wireshark dissectors? Is it possible, for example, to load stuff into Wireshark and do a text dump (that can be diffed against some expected output) from the command line?
Then you might want to right click, select "Decode As", and change the port.
I don't know, but it might work somehow with tshark.
Simple seq structures (like hello_world.ksy) work like a charm ;) A slightly more complex one with a single substructure, like transport_sign.ksy, fails on the following:
transport_sign_proto = Proto("transport_sign","TransportSign")
local f = transport_sign_proto.fields
f.line_num_len = ProtoField.uint8("line.line_num_len", "line_num_len")
f.dest_len = ProtoField.uint8("line.dest_len", "dest_len")
function transport_sign_proto.dissector(buffer, pinfo, tree)
pinfo.cols.protocol = "TransportSign"
main = tree:add(transport_sign_proto, "TransportSign")
local offset = 0
main:add(f.num_lines, buffer(offset,4))
-- ^^^ right on this line
Can you give me some example of how to use substructures in protocol dissector in lua?
I'm currently studying your other project. It looks like we're supposed to somehow map all possible field combinations in advance, like this:
f.typecode = ProtoField.uint8("adsb.typecode", "Format type code", base.DEC, {[1] = "Aicraft identification", [2] = "Aicraft identification", [3] = "Aicraft identification", [4] = "Aicraft identification",[5] = "Surface position", [6] = "Surface position", [7] = "Surface position", [8] = "Surface position", [9] = "Airborne position (Baro Alt)", [10] = "Airborne position (Baro Alt)", [11] = "Airborne position (Baro Alt)", [12] = "Airborne position (Baro Alt)", [13] = "Airborne position (Baro Alt)", [14] = "Airborne position (Baro Alt)", [15] = "Airborne position (Baro Alt)", [16] = "Airborne position (Baro Alt)", [17] = "Airborne position (Baro Alt)", [18] = "Airborne position (Baro Alt)", [19] = "Airborne velocity", [20] = "Airborne position (GNSS Height)", [21] = "Airborne position (GNSS Height)", [22] = "Airborne position (GNSS Height)", [23] = "Test message" }, 0xf8)
f.ident = ProtoField.string("adsb.ident", "Aircraft Identification")
f.i_first = ProtoField.uint8("adsb.ident.first", "First letter", base.DEC, {[1] = "A", [2] = "B", [3] = "C", [4] = "D", [5] = "E", [6] = "F", [7] = "G", [8] = "H", [9] = "I", [10] = "J", [11] = "K", [12] = "L", [13] = "M", [14] = "N", [15] = "O", [16] = "P", [17] = "Q", [18] = "R", [19] = "S", [20] = "T", [21] = "U", [22] = "V", [23] = "W", [24] = "X", [25] = "Y", [26] = "Z", [32] = "_", [48] = "0", [49] = "1", [50] = "2", [51] = "3", [52] = "4", [53] = "5", [54] = "6", [55] = "7", [56] = "8", [57] = "9"}, 0xfc)
f.i_second = ProtoField.uint8("adsb.ident.second", "Second letter", base.DEC, {[1] = "A", [2] = "B", [3] = "C", [4] = "D", [5] = "E", [6] = "F", [7] = "G", [8] = "H", [9] = "I", [10] = "J", [11] = "K", [12] = "L", [13] = "M", [14] = "N", [15] = "O", [16] = "P", [17] = "Q", [18] = "R", [19] = "S", [20] = "T", [21] = "U", [22] = "V", [23] = "W", [24] = "X", [25] = "Y", [26] = "Z", [32] = "_", [48] = "0", [49] = "1", [50] = "2", [51] = "3", [52] = "4", [53] = "5", [54] = "6", [55] = "7", [56] = "8", [57] = "9"}, 0x3f0)
-- ...
f.v_subtype = ProtoField.uint8("adsb.velocity.subtype", "Subtype", base.DEC, {[1] = "Subsonic ground speed", [2] = "Supersonic ground speed", [3] = "Subsonic air speed", [4] = "Supersonic air speed"}, 0x7)
f.v_intchange = ProtoField.bool("adsb.velocity.intentchange", "Intent change flag", 8, nil, 0x80)
f.v_uncertainty = ProtoField.uint8("adsb.velocity.uncertainty", "Velocity uncertainty", nil, nil, 0x38)
What do we do if we need to dissect some infinitely nested structure, i.e. something like a TIFF (or RIFF) file?
Hey, tshark -T json -j hello_world -r pkts.pcap actually works pretty well:
[
{
"_index": "packets-2016-11-25",
"_type": "pcap_file",
"_score": null,
"_source": {
"layers": {
"frame": {
"filtered": "frame"
},
"eth": {
"filtered": "eth"
},
"ip": {
"filtered": "ip"
},
"udp": {
"filtered": "udp"
},
"_ws.lua.fake": "",
"hello_world": {
"hello_world.one": "4"
}
}
}
}
]
It fails because I have not yet implemented the u4 type. ;) I'll need to go through all available types to make sure I do not forget any.
Nested fields are indeed a problem. Maybe I can have a look at some other projects and how it is solved there.
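As a rough illustration of going through the types systematically, here is a sketch in Python (again, not the real compiler code) that maps Kaitai Struct primitive types to ProtoField constructors and refuses to emit anything for a type it does not know. The table and the function name emit_protofields are assumptions for this example; the emitted line follows the same shape as the f.line_num_len declaration above.

```python
# Assumed mapping from Kaitai Struct primitive types to ProtoField constructors.
PROTOFIELD_BY_TYPE = {
    "u1": "uint8", "u2": "uint16", "u4": "uint32", "u8": "uint64",
    "s1": "int8", "s2": "int16", "s4": "int32", "s8": "int64",
    "f4": "float", "f8": "double", "str": "string",
}


def emit_protofields(proto_name, seq):
    """Return Lua ProtoField declarations for a list of {"id", "type"} attrs."""
    lines = []
    for attr in seq:
        ks_type = attr["type"]
        # Drop an explicit endianness suffix such as "u4le" or "u2be".
        base = ks_type[:-2] if ks_type.endswith(("le", "be")) else ks_type
        if base not in PROTOFIELD_BY_TYPE:
            # Fail at generation time instead of producing a nil field in Lua.
            raise NotImplementedError(f"unsupported type: {ks_type}")
        ctor = PROTOFIELD_BY_TYPE[base]
        name = attr["id"]
        lines.append(f'f.{name} = ProtoField.{ctor}("{proto_name}.{name}", "{name}")')
    return lines


print("\n".join(emit_protofields("transport_sign", [{"id": "num_lines", "type": "u4le"}])))
```

Raising on an unknown type means a missing mapping shows up when the dissector is generated, rather than as a failure on the first tree:add call inside Wireshark.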
Looks good. :)
tshark -r pkts.pcap -X lua_script:transport_sign.lua -d udp.port==1-65535,transport_sign -T json
(where -d forces it to use the given dissector)
{
"_index": "packets-2016-11-25",
"_type": "pcap_file",
"_score": null,
"_source": {
"layers": {
"frame": {
"frame.encap_type": "1",
"frame.time": "Sep 19, 2016 15:40:08.576182000 CEST",
"frame.offset_shift": "0.000000000",
"frame.time_epoch": "1474292408.576182000",
"frame.time_delta": "95.869584000",
"frame.time_delta_displayed": "95.869584000",
"frame.time_relative": "1966.222449000",
"frame.number": "21",
"frame.len": "200",
"frame.cap_len": "200",
"frame.marked": "0",
"frame.ignored": "0",
"frame.protocols": "eth:ethertype:ip:udp:transport_sign"
},
"eth": {
"eth.dst": {
"eth.dst_resolved": "Tp-LinkT_78:74:a2",
"eth.addr": "64:70:02:78:74:a2",
"eth.addr_resolved": "Tp-LinkT_78:74:a2",
"eth.lg": "0",
"eth.ig": "0"
},
"eth.src": {
"eth.src_resolved": "Cisco_28:c1:98",
"eth.addr": "00:19:e7:28:c1:98",
"eth.addr_resolved": "Cisco_28:c1:98",
"eth.lg": "0",
"eth.ig": "0"
},
"eth.type": "0x00000800"
},
"ip": {
"ip.version": "4",
"ip.hdr_len": "20",
"ip.dsfield": {
"ip.dsfield.dscp": "0",
"ip.dsfield.ecn": "0"
},
"ip.len": "186",
"ip.id": "0x000073d1",
"ip.flags": {
"ip.flags.rb": "0",
"ip.flags.df": "1",
"ip.flags.mf": "0"
},
"ip.frag_offset": "0",
"ip.ttl": "64",
"ip.proto": "17",
"ip.checksum": "0x00003fd6",
"ip.checksum.status": "2",
"ip.src": "192.168.1.65",
"ip.addr": "192.168.1.65",
"ip.src_host": "192.168.1.65",
"ip.host": "192.168.1.65",
"ip.dst": "192.168.3.250",
"ip.addr": "192.168.3.250",
"ip.dst_host": "192.168.3.250",
"ip.host": "192.168.3.250"
},
"udp": {
"udp.srcport": "41340",
"udp.dstport": "9730",
"udp.port": "41340",
"udp.port": "9730",
"udp.length": "166",
"udp.checksum": "0x00001660",
"udp.checksum.status": "2",
"udp.stream": "20"
},
"_ws.lua.fake": "",
"transport_sign": {
"transport_sign.num_lines": "50331648",
"_ws.lua.text": {
"line.line_num": "134217728",
"line.line_num_len": "1",
"line.dest_len": "56",
"line.code1": "504493787",
"line.timestamp": "3663160681"
}
}
}
}
}
[...]
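One detail in this dump: num_lines and line.line_num come out as 50331648 and 134217728, which are 0x03000000 and 0x08000000. That pattern would be consistent with little-endian fields being read as big-endian (in Wireshark's Lua API, tree:add reads big-endian and tree:add_le reads little-endian), though this is only a guess, since transport_sign.ksy itself is not shown here. A quick check of the arithmetic in Python, with a helper name made up for the example:

```python
def reinterpret(value, width=4):
    """Re-read the bytes of a big-endian integer as little-endian."""
    return int.from_bytes(value.to_bytes(width, "big"), "little")


print(reinterpret(50331648))   # num_lines from the dump
print(reinterpret(134217728))  # line.line_num from the dump
```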
Great job! I guess we can extend our testing framework to support Wireshark dissectors too, working like this:
- We compile all tests as usual (./build-formats); if the compiler supports wireshark output, it will generate a compiled/wireshark dir with tons of dissectors (that already works).
- We create ./run-wireshark that will perform "testing" by running that tshark ... command, saving the JSON output, and diffing it against the expected output. Ideally, that should create a JUnit-like report (or basically any structured report, I'll parse it, no problem) that could be integrated into our CI system.
Right now, the Wireshark dissector target would probably pass only a few tests, but eventually we can build it up so that it supports more and more.
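The diff-and-report part of such a runner could look roughly like the Python sketch below. It is only an outline under assumptions: the function names, the report layout, and the idea of keeping expected outputs as JSON are all hypothetical, and the step that actually runs tshark and parses its -T json output is left out.

```python
import difflib
import json
import xml.etree.ElementTree as ET


def diff_json(expected, actual):
    """Return unified-diff lines between two JSON-like objects (empty if equal)."""
    exp = json.dumps(expected, indent=2, sort_keys=True).splitlines()
    act = json.dumps(actual, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(exp, act, "expected", "actual", lineterm=""))


def junit_report(results):
    """Build a JUnit-like XML string from {test_name: diff_lines}."""
    failures = sum(1 for diff in results.values() if diff)
    suite = ET.Element(
        "testsuite", name="wireshark", tests=str(len(results)), failures=str(failures)
    )
    for name, diff in sorted(results.items()):
        case = ET.SubElement(suite, "testcase", name=name)
        if diff:
            # Attach the diff so the CI report shows what changed.
            failure = ET.SubElement(case, "failure", message="output differs")
            failure.text = "\n".join(diff)
    return ET.tostring(suite, encoding="unicode")


# Hypothetical results for two tests: one matching, one not.
results = {
    "hello_world": diff_json({"hello_world.one": "80"}, {"hello_world.one": "80"}),
    "transport_sign": diff_json({"num_lines": "3"}, {"num_lines": "50331648"}),
}
print(junit_report(results))
```

Dumping both sides with sort_keys=True keeps the diff stable even if tshark changes the order of keys between versions.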
Hey, I've played with tshark a bit and created a simple script, tshark-one:
#!/bin/sh
BIN_FILE=$1
LUA_DISSECTOR=$2
od -Ax -tx1 -v "$BIN_FILE" | text2pcap -q -l 147 - tmp.pcap
tshark \
-r tmp.pcap \
-X "lua_script:compiled/wireshark/$LUA_DISSECTOR.lua" \
-o "uat:user_dlts:\"User 0 (DLT=147)\",\"$LUA_DISSECTOR\",\"0\",\"\",\"0\",\"\"" \
-j "$LUA_DISSECTOR" \
-T json |
jq ".[0]._source.layers.$LUA_DISSECTOR"
This one is supposed to be used like that:
git/kaitai_struct/tests $ ./tshark-one src/fixed_struct.bin hello_world
{
"hello_world.one": "80"
}
Internally, it does the following:
- Wraps the given binary file into a valid pcap using text2pcap. It uses link_type = 147, which is the first link type reserved for private use, i.e. a "user-defined type", which is perfect for our experiments.
- Runs tshark on that pcap with special command line arguments which invoke the dissector. Note that the generated .lua file should not contain lines like these, which bind a particular dissector into the protocol dissection tree:
tcp_table = DissectorTable.get("tcp.port")
tcp_table:add(12345, hello_world_proto)
- Removes lots of extra noisy lines by using jq, essentially cutting only one branch in JSON tree that we actually need.
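For reference, the wrapping step can also be done without od and text2pcap by writing the classic pcap header by hand. The Python sketch below is an alternative to what the script does, not a description of it; the function and constant names are made up for the example, and it assumes a single packet that fits within the snapshot length.

```python
import struct

LINKTYPE_USER0 = 147  # first link type reserved for private use


def wrap_in_pcap(payload, linktype=LINKTYPE_USER0):
    """Return a classic little-endian pcap file holding payload as one packet."""
    # Global header: magic, version 2.4, timezone offset, timestamp accuracy,
    # snapshot length, link type.
    global_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, linktype)
    # Record header: timestamp seconds, microseconds, captured and original length.
    # Payloads longer than the snapshot length are not handled in this sketch.
    record_header = struct.pack("<IIII", 0, 0, len(payload), len(payload))
    return global_header + record_header + payload


# Hypothetical one-byte payload.
pcap = wrap_in_pcap(b"\x50")
print(len(pcap))
```

The result is a 24-byte file header, a 16-byte record header, and then the raw bytes, which tshark can read with -r just like the text2pcap output.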
Very nice! However, I don't really know how to integrate these tests into the project files.
Was there any movement on this? I recently discovered Kaitai and I have been playing around with Wireshark dissectors and this would be an amazing feature.
Unfortunately, no. There is a (pretty outdated) branch that supports compilation into Lua dissectors, and some work has been done on starting testing, but that's all.
Generally, we need to follow this checklist, as with any other new target language. If you're willing to help — you're most welcome to join :)
I simply did not have enough time to finish it, so feel free to use my edits so far. :)
Just chiming in, as I'm very interested in Wireshark dissector support.
@joushx , @GreyCat I'd like to explore using the Lua language output to build a Wireshark dissector. I see that the generated file is dependent on the kaitaistruct Lua runtime library. What is the feasibility or potential steps required to parlay this Lua code into a usable dissector?
I've made a rudimentary version of this (it also needs forked runtime files) if anyone's interested. It's based on the existing Lua backend.
It only compiles working code with the --read-pos option (for example, kaitai-struct-compiler --read-pos -t wireshark test.ksy).
From my initial testing, conditionals, switches, and loops seem to work.
Any help or guidance is welcome, as I don't actually know Scala.