Improve F# Interactive formatting of byte arrays
Working with byte arrays in F# Interactive is hard, because you see neither the content as proper bytes nor its text representation. Working with byte arrays is a common scenario that comes up with networks, files, etc.
Example byte array:
let myBytes = "this is a test, this is a test, this is a test.."B
The current (non-standard) string presentation:
val myBytes: byte array =
[|116uy; 104uy; 105uy; 115uy; 32uy; 105uy; 115uy; 32uy; 97uy; 32uy; 116uy;
101uy; 115uy; 116uy; 44uy; 32uy; 116uy; 104uy; 105uy; 115uy; 32uy; 105uy;
115uy; 32uy; 97uy; 32uy; 116uy; 101uy; 115uy; 116uy; 44uy; 32uy; 116uy;
104uy; 105uy; 115uy; 32uy; 105uy; 115uy; 32uy; 97uy; 32uy; 116uy; 101uy;
115uy; 116uy; 46uy; 46uy|]
The standard way (used by Git, hex editors, etc.) would be to render the array in a more visual way (hex offset, tab, 8 hex bytes + space + 8 hex bytes, tab, string representation):
0000 74 68 69 73 20 69 73 20 61 20 74 65 73 74 2c 20 this is a test,
0010 74 68 69 73 20 69 73 20 61 20 74 65 73 74 2c 20 this is a test,
0020 74 68 69 73 20 69 73 20 61 20 74 65 73 74 2e 2e this is a test..
The good news is that this can be done with a fairly simple F# function (open source, original thanks to Christian Steinert):
let getBytesTable (bytes : byte array) =
    // Cap the rendered input at 1 MiB.
    let maxLen = if bytes.LongLength > 1048576L then 1048576 else bytes.Length
    let sb = System.Text.StringBuilder ()
    bytes
    |> Array.take maxLen
    |> Array.iteri (fun i c ->
        // Start of a row: emit the byte offset in hex.
        if i % 16 = 0 then
            if i > 0 then sb.AppendLine () |> ignore
            sb.AppendFormat("{0:x4}\t", i) |> ignore
        sb.AppendFormat("{0:x2}", c) |> ignore
        if i % 16 = 7 then sb.Append ' ' |> ignore
        elif i % 16 = 15 then
            // End of a full row: append the text column, with non-printable
            // bytes shown as a middle dot. A trailing row shorter than 16
            // bytes never reaches this branch, so it gets no text column.
            sb.Append " " |> ignore
            for i = i - 15 to i do
                if bytes.[i] >= 32uy && bytes.[i] <= 126uy
                then bytes.[i] |> char
                else '\183'
                |> sb.Append
                |> ignore
        if i + 1 < maxLen then sb.Append ' ' |> ignore
    )
    sb.ToString ()
But this function should be part of F# itself, as the default presentation format for byte arrays.
We're unlikely to take it into the standard library (FSharp.Core); it's too specific and opinionated. It belongs in some sort of helper library/module. The question now is whether we want to ship something like it alongside fsi. Fsi printing is extensible, so it shouldn't be a problem to just have a DLL for it?
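To illustrate the extensibility point, here is a minimal sketch of how a script or helper DLL could register a custom printer for byte arrays through `fsi.AddPrinter`. The `formatHexDump` helper is hypothetical (it is not part of FSharp.Core or fsi), and its exact layout (two-space column gaps, `.` for non-printable bytes) is just an assumption for this example, not the format proposed above.

```fsharp
open System.Text

/// Hypothetical helper: offset, 16 hex bytes in two groups of 8, then printable ASCII.
let formatHexDump (bytes: byte[]) =
    let sb = StringBuilder()
    for rowStart in 0 .. 16 .. bytes.Length - 1 do
        let row = bytes.[rowStart .. min (rowStart + 15) (bytes.Length - 1)]
        let hex =
            row
            |> Array.mapi (fun i b ->
                // An extra space before the 9th byte splits the row into two groups.
                (if i = 8 then " " else "") + b.ToString("x2"))
            |> String.concat " "
        let ascii =
            System.String(row |> Array.map (fun b ->
                if b >= 32uy && b <= 126uy then char b else '.'))
        // Pad the hex column to 48 characters so short final rows still line up.
        sb.AppendLine(sprintf "%04x  %-48s  %s" rowStart hex ascii) |> ignore
    sb.ToString()

#if INTERACTIVE
// Only meaningful inside F# Interactive, where the `fsi` object exists.
fsi.AddPrinter(fun (bytes: byte[]) -> "\n" + formatHexDump bytes)
#endif
```

With the printer registered, evaluating a `byte[]` value in fsi would show the table instead of the `[|...|]` form, while leaving the default untouched for everyone who does not load the helper.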
It's not opinionated, it's just the standard format for hex bytes. Anything else would be opinionated and specific. If there's an option for binary-formatted bytes, then why not a hex format?
It's not opinionated, it's just the standard format for hex bytes.
Right, in a hex editor, where you can render it in columns and synchronise the cursor between them, so it's actually useful there.
In the F# REPL I prefer to see something closer to how it would have been defined, so I can easily copy it and do something with it.
Does any other REPL have this as standard output?
What I'm saying is that viewing bytes in hex is not opinionated; it's a very sensible and standard way of viewing bytes. In my emulator code I use this too:
let mpf68901 = 0xFFFA00u
etc. It would be extremely unnatural to use ints there.
What I'm saying is that viewing bytes in hex is not opinionated; it's a very sensible and standard way of viewing bytes.
What I meant is that it can be confusing as the standard view option in the REPL, but fine as a function/method. However, we don't really have a standardised way of shipping such helpers with fsi.
Indeed the above getBytesTable is a nice helper function for building specific tools. But the standard fsi output for any array should definitely stay [|...; ...; ...|].
fssnip.net is a good place to post / search such functions.
Would it be possible to do something like .ToString("H"), or is that all coming from the .NET side?
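For context on the .NET side: arrays do not accept a format string, and calling `.ToString()` on a `byte[]` just returns the type name. The closest built-in calls I am aware of are `System.BitConverter.ToString` and `System.Convert.ToHexString` (the latter assumes .NET 5 or later). A small sketch of what they give today:

```fsharp
let sample = "test"B

// Dash-separated uppercase hex pairs.
let dashed = System.BitConverter.ToString(sample)

// Compact uppercase hex without separators (assumes .NET 5 or later).
let compact = System.Convert.ToHexString(sample)

printfn "%s" dashed
printfn "%s" compact
```

Neither call produces the offset and text columns of a hex dump, so a table view would still need a helper function on top.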
When you are, e.g., debugging the content of a byte array, you need something that is already there, rather than having to trust that a random internet site still has a possibly working snippet that you then have to paste into your source code.
fsi output for any array should definitely stay [|...; ...; ...|].
A string is a seq of characters, and still the fsi output for it is not like [| 72; 101; 108; 108; 111; 32; 87; 111; 114; 108; ... |].
Why? Because no one finds that useful, exactly like bytes shown as [|116uy; 104uy; ... |].