miller
DSL: add functions for base64 encoding & decoding
I tried to rewrite my jq script but got stuck immediately.
Great idea @b97tsk ! :)
ldap2json would certainly benefit from that! :)
@b97tsk @onoraba one question that needs to be addressed: suppose we do decode and the data is non-ASCII. I'm not sure Miller strings (per se) would suffice anymore.
In Python there are "foo" and b"foo", and the types str and bytes. I wonder if Miller would need these as well.
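To make the str versus bytes question concrete, here is a minimal Python sketch (not Miller code). The two payloads are made up for illustration: one is UTF-8 text, the other is arbitrary binary. Base64 decoding always yields bytes, and only some byte sequences can then be turned into a string.

```python
import base64

# Text payload (illustrative): decodes to bytes that happen to be valid UTF-8.
text_b64 = base64.b64encode("мир".encode("utf-8")).decode("ascii")
raw_text = base64.b64decode(text_b64)       # type is bytes, not str
decoded_text = raw_text.decode("utf-8")     # succeeds, gives a str

# Binary payload (illustrative): 0xFF can never start a UTF-8 sequence.
binary_b64 = base64.b64encode(bytes([0xFF, 0xFE, 0x00, 0x01])).decode("ascii")
raw_binary = base64.b64decode(binary_b64)
try:
    raw_binary.decode("utf-8")
    binary_is_text = True
except UnicodeDecodeError:
    binary_is_text = False                  # cannot be represented as a str
```

The second case is the one a string-only decode function would have to handle somehow.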
Can either of you perhaps share some examples of what some of your data might look like post-decode?
In the LDAP case, base64 is used for Cyrillic strings. The exec variant is too slow:
$ echo "b64==YT3Qv9GA0LjQstC10YIsYj3QvNC40YA=" | mlr --idkvp --ips '==' --ojson --no-jlistwrap put 'func base64d(s) { return exec("openssl", ["enc", "-base64", "-d"], {"stdin_string": s . "\n" }); }; for(k,v in $*) { substr1(v,-1,-1) == "=" && k !=~ "binary" { $*[k] = base64d(v); }; };'
{ "b64": "a=привет,b=мир" }
Maybe return only a string: the decoded text as is when it is valid UTF-8, or something like hexfmt otherwise? See https://pkg.go.dev/unicode/utf8#ValidString
@onoraba I like that!
- In the shorter term (very easy), decode string to string
  - If ValidString: return as is
  - else return a hexfmt
- In the longer term (a bit more work)
  - Create a bytes type in the Miller DSL
  - Support b"..." literals in the DSL
  - Extend some other functions to operate on bytes, e.g. md5()
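The shorter-term option can be sketched in Python as follows. This is only an illustration of the idea, not Miller's implementation: the function name base64_decode_to_string is hypothetical, a UTF-8 decode attempt stands in for Go's utf8.ValidString, and the plain hex rendering in the fallback is an assumption about what "a hexfmt" would look like.

```python
import base64

def base64_decode_to_string(encoded: str) -> str:
    """Decode base64 and always return a string (hypothetical helper)."""
    raw = base64.b64decode(encoded)
    try:
        # Valid UTF-8: return the decoded text as is.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to a hex rendering of the raw bytes.
        return raw.hex()

# The Cyrillic LDAP sample from this thread decodes to text.
print(base64_decode_to_string("YT3Qv9GA0LjQstC10YIsYj3QvNC40YA="))
# Two bytes that are not valid UTF-8 come back as hex.
print(base64_decode_to_string("//4="))
```

The drawback of this shape is that the caller cannot tell from the return type which branch was taken, which is part of the case for the longer-term bytes type.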
@johnkerl
Sir, adding bytes would bring much joy to parsing MS LDAP DNS entries, etc.
Right now it is an OS exec with a base64 | dd | iconv pipe, which could be much neater.