write new docs

Open goto-bus-stop opened this issue 6 years ago • 4 comments

goto-bus-stop, Mar 18 '19 15:03

Thanks for writing awestruct - I like the neat way you describe parsing schemas here!

Regarding how to improve your documentation (it's actually quite decent already...): I saw jsBinarySchemaParser, but it had pretty much no documentation. One nice thing they did, though, was give a real-world example of parsing a GIF file. binary-parser has nice documentation and describes schemas in a similar way (chaining functions rather than using array notation), if you want some inspiration.

A few questions:

  1. What would be the best way to write a loop in array notation? For example, I have a section in a binary file with readings from a sensor that are delimited by a string (i.e. ^reading#time20191013#value32145^reading...), and I want to parse this section, putting each chunk into its own array entry or object key. Basically, parse a section and create a list of entries from it.

  2. (tl;dr: probably yes - rtfm!) Is it possible to define small Structs as building blocks and reference them in another Struct? For example, the main Struct parses the file and, when it encounters a section for [person], it calls a Struct_Person to parse the section and output an object userid: {name: xx, surname: yy, gender: F, dob: yymmdd}. Or, from question 1, it outputs an array of readings like [{date: yymmddhhmmss, value: 32145}, {date: yymmddhhmmss, value: 987123}, {...}]. Hmm, looking at your Readme examples, it looks like it should be possible, since you're declaring Structs in place. I'd say for complex files it might be better to declare Structs elsewhere and just reference them, following DRY principles. Ah, you've got Struct.type, yaay! I'll give it a try!

  3. BigInt/int64 is with us in Node 12 and the V8 engine! I might have a go at writing a type for it at some point, but I'm just starting with buffers/typed arrays, so I'm not there yet. :: nudge nudge ::

  4. Value Paths is a huge thing that will help with creating a parsing script. Thanks a lot for this!

  5. For custom types, Struct.Type(type): when you write "// always 1 byte, could also write as { size: 1 }", do you mean that the return value here should always be 1, or that if we declare 1 it will always be 1? How would I dynamically declare this value when invoking this type during parsing?

  6. Might be in the docs, but I didn't see it and at the moment I'm not sure how to do this: create a Struct/Type that scans the file for a particular string (text or raw bits/hex) and skips through the buffer/stream until it finds the trigger sequence. Something like (pseudocode) LOOP (for buffer index i) if (Int8AtIndex(i) != 'AF0CBB7D') {skip(i), i++} else {foundIt => parseAtIndex(foundIt), loop(end, return foundIt)} . The reason is that I have a file with many sections whose contents I don't know and am not interested in figuring out, but there are a few sections with data I want to extract. So I want to skip the junk, go to a section that starts with a keyword (e.g. sample_names), parse that section until it ends with a keyword, then jump to another section further down (e.g. reading_values), parse that section until its end keyword, and finish.

  7. Is it possible to use offsets to jump through the file from one location to another? Not for my current use case, but I know there are file formats where the structure is not linear and uses pointers instead. Or, actually, in my file I have a table of contents with names of entries and pointers to a later table with sections of values corresponding to the entries in the table of contents. I was planning to parse the sections separately in sequence and then combine them, but if there is a way to use pointers, that would sometimes help. Something like [TOC: {Jane:c, Andy:a, Nick:b}] [......] [VALUES: {a:321, b:765, c: 984}]
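
As a rough illustration of question 1, the looping could be sketched in plain Node.js, outside of awestruct. This is only a sketch under assumptions: `parseReadings` is a hypothetical helper, the second reading in the sample data is made up, and the section is assumed to be plain text shaped like the example string in question 1.

```javascript
// Hypothetical helper, not part of awestruct: split a '^'-delimited section
// into a list of { time, value } entries using only built-in APIs.
function parseReadings (section) {
  return section
    .toString('latin1') // one character per byte, so no bytes are lost
    .split('^')
    .filter((chunk) => chunk.startsWith('reading#'))
    .map((chunk) => {
      const fields = {}
      // 'reading#time20191013#value32145' -> ['time20191013', 'value32145']
      for (const part of chunk.split('#').slice(1)) {
        const match = /^([a-z]+)(\d+)$/.exec(part)
        if (match) fields[match[1]] = match[2]
      }
      return { time: fields.time, value: Number(fields.value) }
    })
}

// Sample data: the first reading is from the question, the second is invented.
const section = Buffer.from(
  '^reading#time20191013#value32145^reading#time20191014#value987123^'
)
console.log(parseReadings(section))
```

Each chunk between the delimiters becomes one array entry, which is the "list of entries" shape described in question 1.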

Thanks for reading!

Zireael, Oct 13 '19 10:10

Thanks for the feedback! I don't have much time today, so I'll just post two very quick points in response to 2/6/7 that I hope are relevant to you:

Structs can be nested like this:

const assert = require('assert')
const Struct = require('awestruct')

// A small building-block struct with a single int8 field
const A = Struct([
  ['someValue', Struct.types.int8]
])
const B = Struct([
  ['key', A], // reads an `A` and puts it at property `b.key`
  A, // reads an `A` and merges its properties into the resulting `b`
])
assert.deepEqual(
  B(Buffer.from([1, 2])),
  {
    key: { someValue: 1 },
    someValue: 2
  }
)

Struct.types.skip allows you to jump ahead by a certain number of bytes, but there is no built-in type to find a sequence and jump to it at the moment.
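
Until such a type exists, one possible workaround is to locate the markers with Node's built-in Buffer#indexOf and hand only the bytes in between to a struct. This is a minimal sketch, not an awestruct feature: `findSection`, the `sample_end` end marker, and the sample bytes are all hypothetical; only the AF0CBB7D marker comes from the question above.

```javascript
// Sketch only: `findSection` is a hypothetical helper, not an awestruct API.
// It uses Node's built-in Buffer#indexOf to locate start and end markers.
function findSection (buffer, startMarker, endMarker, from = 0) {
  const start = buffer.indexOf(startMarker, from)
  if (start === -1) return null // start marker not found
  const bodyStart = start + startMarker.length
  const end = buffer.indexOf(endMarker, bodyStart)
  if (end === -1) return null // section is never closed
  return {
    body: buffer.subarray(bodyStart, end), // bytes between the two markers
    next: end + endMarker.length // offset to resume scanning from
  }
}

// Made-up file contents: junk, start marker, payload, end marker.
const file = Buffer.concat([
  Buffer.from([0x00, 0xff, 0x13]),
  Buffer.from('AF0CBB7D', 'hex'),
  Buffer.from([1, 2, 3]),
  Buffer.from('sample_end')
])
const found = findSection(
  file,
  Buffer.from('AF0CBB7D', 'hex'),
  Buffer.from('sample_end')
)
console.log(found.body, found.next)
```

The returned `body` could then be passed to a struct the same way `B(Buffer.from([1, 2]))` is called above, and `next` gives the offset from which to look for the following section.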

I made a bunch of modules to read binary file formats using this package, for example https://github.com/genie-js/genie-dat/blob/master/src/object.js, which uses nesting extensively.

goto-bus-stop, Oct 13 '19 10:10

Thanks a lot for the fast reply, but no need to rush - my questions are for your consideration when convenient (I hope they will help other people as well). I'll play with the code and see how far I get, but I've also got other work to do in between ^^ I'm happy, though, as it looks like writing my parsing script will become that much easier.

Zireael, Oct 13 '19 10:10

An alternative to new docs could be to convert to TypeScript...

Rechdan, Oct 18 '20 00:10