Binary Parsing - Null Terminated ASCII with Max bytes
I am trying to parse a field that holds an ASCII string of at most 64 bytes, optionally null-terminated. If no null terminator appears within the first 64 bytes, I would like the string to end at the 64th byte automatically.
This is what I have for a null-terminated ASCII string, and it appears to work fine. I just need a way to stop parsing after 64 bytes if there is no null terminator, ideally in an efficient manner.
static Parser<byte, IEnumerable<byte>> NullTerminatedStringBytes =
    SingleByte.Until(Token(b => b == 0x00).Labelled("Null Terminated"));
public static Parser<byte, string> NullTerminatedString = Map((nullString) =>
{
    var buffer = nullString.ToArray();
    var result = Encoding.ASCII.GetString(buffer);
    return result;
}, NullTerminatedStringBytes);
Thoughts?
I don't think there's a combinator which'll do this for-break type of loop out of the box. You can code it up yourself using recursion:
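// Needs System.Linq. Byte is assumed to be a parser that accepts any single byte
// (like SingleByte in the question).
Parser<byte, IEnumerable<byte>> NullTerminated()
    => NullTerminated(64);
Parser<byte, IEnumerable<byte>> NullTerminated(int count)
{
    // Byte budget exhausted: end the string without requiring a terminator.
    if (count == 0)
    {
        // new[] { } does not compile (no element type to infer), so use an explicitly typed empty sequence.
        return Return(Enumerable.Empty<byte>());
    }
    return Byte.Then(b =>
        b == 0
            // Terminator consumed: end the string here.
            ? Return(Enumerable.Empty<byte>())
            // Otherwise keep b and parse at most count - 1 more bytes.
            : NullTerminated(count - 1).Select(bs => new[] { b }.Concat(bs))
    );
}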
Of course, that won't be very efficient, because you're building parsers at runtime and storing the output bytes in a linked-list-shaped structure (each Concat wraps the previous sequence). You could perhaps tweak this by, e.g., putting the bytes in a List or caching the recursive NullTerminated calls, but that's messy.
One way to design an API for this might be some sort of "bounded Many", which runs a parser in a loop until it fails without consuming input or some maximum number of repetitions is reached. Then your code would look like this:
Parser<byte, IEnumerable<byte>> NullTerminated
=> Token(b => b != 0).ManyUpTo(64); // name TBD
I'd be happy to accept a PR implementing this ManyUpTo combinator.
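For anyone picking this up, here is a rough sketch of the loop semantics in plain C#, independent of Pidgin: take bytes until a 0x00 is seen or the maximum is reached. The class and method names (BoundedRead, ReadNullTerminatedAscii) are made up for illustration and are not part of any library. It assumes the terminator, when present, counts toward the 64-byte limit, which matches the recursive version above.

```csharp
using System;
using System.Text;

static class BoundedRead
{
    // Hypothetical helper, not a Pidgin API.
    // Reads ASCII bytes from `input` until a 0x00 byte is seen or `maxBytes`
    // bytes have been examined. `consumed` reports how many input bytes were
    // used, including the terminator when one was found.
    public static string ReadNullTerminatedAscii(
        ReadOnlySpan<byte> input, int maxBytes, out int consumed)
    {
        int limit = Math.Min(maxBytes, input.Length);

        // Advance while we are within the limit and have not hit a terminator.
        int length = 0;
        while (length < limit && input[length] != 0x00)
        {
            length++;
        }

        // Consume the terminator only if the loop actually stopped on one.
        bool terminated = length < limit;
        consumed = terminated ? length + 1 : length;

        return Encoding.ASCII.GetString(input.Slice(0, length));
    }
}
```

With a 64-byte limit, a field with no terminator yields a 64-character string and consumes exactly 64 bytes, while a field with an early terminator stops right after it. A ManyUpTo combinator would need the same "stop on failure or on the cap" rule, plus Pidgin's usual check that the inner parser failed without consuming input.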