Zeek reader "(empty)" handling on string data type
Based on black-box testing, zq (commit a7522c2 at the time of writing) seems to treat the #empty_field value (empty) as indicating emptiness only for set and vector types. It looks like it should do the same for the string type. I originally stumbled onto this while reading in Zeek TSV and shaped Zeek NDJSON and comparing the ZSON-format output; the difference showed up in the rdp event types of the zq-sample-data. Here's proof in the form of a simple Zeek script that outputs an empty string "":
$ cat mine.zeek
module Mine;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        my_str: string &log;
    };
}

event zeek_init()
    {
    Log::create_stream(Mine::LOG, [$columns=Mine::Info, $path="mine"]);
    Log::write(Mine::LOG, [$my_str=""]);
    }
When this is run with Zeek v4.0.0, the empty string shows up in the Zeek TSV log as (empty). zq then reads it in as that literal string rather than turning it back into an empty string:
$ /usr/local/zeek-4.0.0/bin/zeek local mine.zeek
WARNING: No Site::local_nets have been defined. It's usually a good idea to define your local networks.
$ cat mine.log
#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path mine
#open 2021-03-17-11-31-40
#fields my_str
#types string
(empty)
#close 2021-03-17-11-31-40
$ zq -version
Version: v0.29.0-132-ga7522c26
$ zq -z mine.log
{_path:"mine",my_str:"(empty)" (bstring)} (=0)
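To make the expected behavior concrete, here is a minimal Python sketch of a TSV decoder that maps the #empty_field value back to an empty string for string columns as well as to an empty container for set and vector columns. It is not zq's actual reader code; the function names `decode_value` and `parse_zeek_tsv` are hypothetical, and the sketch only handles string, set, and vector columns.

```python
def decode_value(raw, zeek_type, empty_field, unset_field, set_sep):
    """Decode one TSV cell (hypothetical helper, not zq's implementation)."""
    if raw == unset_field:
        return None
    is_container = zeek_type.startswith(("set[", "vector["))
    if raw == empty_field:
        if is_container:
            return []
        if zeek_type == "string":
            # The behavior this issue asks for: (empty) means "".
            return ""
    if is_container:
        return raw.split(set_sep)
    return raw


def parse_zeek_tsv(text):
    """Parse a minimal Zeek TSV log into a list of dicts."""
    # Defaults match the header shown in the log above.
    sep, set_sep = "\t", ","
    empty_field, unset_field = "(empty)", "-"
    fields, types, records = [], [], []
    for line in text.splitlines():
        if line.startswith("#separator"):
            # The separator is written escaped, e.g. \x09 for a tab.
            sep = line.split(" ", 1)[1].encode().decode("unicode_escape")
        elif line.startswith("#"):
            key, *values = line[1:].split(sep)
            if key == "set_separator":
                set_sep = values[0]
            elif key == "empty_field":
                empty_field = values[0]
            elif key == "unset_field":
                unset_field = values[0]
            elif key == "fields":
                fields = values
            elif key == "types":
                types = values
        elif line:
            cols = line.split(sep)
            records.append({
                name: decode_value(raw, typ, empty_field, unset_field, set_sep)
                for name, typ, raw in zip(fields, types, cols)
            })
    return records
```

Fed the mine.log contents above, a decoder along these lines would yield a record whose my_str is the empty string, matching what the NDJSON path produces.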
Repeating the same with the Zeek NDJSON log, Zeek renders the value as the empty string, so there's no problem:
$ /usr/local/zeek-4.0.0/bin/zeek local "LogAscii::use_json=T" mine.zeek
WARNING: No Site::local_nets have been defined. It's usually a good idea to define your local networks.
$ cat mine.log
{"my_str":""}
$ zq -z mine.log
{my_str:""}
This can be filed under "You learn something new every day!" For the past couple of years of staring at Zeek logs, I thought I'd only ever seen (empty) used in contexts where it described an empty set or vector. Indeed, @henridf and I had agreed with each other on this observation in an internal chat.

Somehow despite all the variations I've tested, I guess I never actually checked that one! :facepalm: