kaitai_struct
kaitai_struct copied to clipboard
Serialization merge
Following up #27, @generalmimon done tremendous job at implementing serialzation for some languages. However, it's not yet merged into master
, thus so far these live into separate "serialization" branches:
- https://github.com/kaitai-io/kaitai_struct_compiler/tree/serialization
- https://github.com/kaitai-io/kaitai_struct_tests/tree/serialization
- https://github.com/kaitai-io/kaitai_struct_java_runtime/tree/serialization
- https://github.com/kaitai-io/kaitai_struct_python_runtime/tree/serialization
It's documented in https://doc.kaitai.io/serialization.html, but ultimately our goal obviously would be not supporting two separate forks, but having one that does everything.
I wanted to start making itemized list of tasks of what exactly needs to be done to merge these back to master.
@generalmimon, can I ask you to share your thoughts on this?
It's documented in https://doc.kaitai.io/serialization.html, but ultimately our goal obviously would be not supporting two separate forks, but having one that does everything.
Of course. I'd love to do this - generally speaking, serialization in Python and Java is pretty much finished at this point and I believe the serialization branches are mostly ready for the merge. Let me assess the situation in individual repos:
-
[x] kaitai_struct_tests: The
aggregate/convert_to_json
script (and related XML "parsers") has to be changed to read the test name not only from the test method name, but to take the combination of test class name + test method name as a unique test.This is needed mainly because in both Java and Python, serialization is tested primarily via a generic test method called
testReadWriteRoundtrip
/test_read_write_roundtrip
, which is defined in theCommonSpec
abstract base class. Unfortunately, from what I've seen, theconvert_to_json
script currently takes only the test method name as the name of the test, so differenttestReadWriteRoundtrip
tests from all test formats are mistaken for just one test in the outputci.json
file. -
kaitai_struct_tests: Another thing to note here (but there's probably no need to do anything about it before the merge, it can be done after that) is that I've added ~60 new test formats during the serialization work, so these should eventually be generated from KST / ported to other languages. But I think our testing system doesn't care about additional test formats / KST specs without language-specific specs, so this can be done gradually for individual languages (no need to do it at once).
-
kaitai_struct_compiler: Since the
serialization
branch is up-to-date with the0.10
tag, I don't anticipate almost any problems with the merge. The only feature I'm personally afraid of are the "zero-copy" substreams (https://github.com/kaitai-io/kaitai_struct/issues/44), which I haven't thought of at all in the context of serialization. I think if we want to getserialization
merged tomaster
soon and avoid problems, for now I'd let the--read-write
option imply--zero-copy-substream false
(https://github.com/kaitai-io/kaitai_struct_compiler/commit/690af1b36d6fea4266d06dc58f1b5482524bdfe7).Or perhaps (but I don't know much about the substream implementation), the substreams could be only used for parsing, but serialization would still use the old way with copying for now.
-
kaitai_struct_java_runtime, kaitai_struct_python_runtime: Probably the least problematic. They could be perhaps merged right now because they actually only add APIs and the changes are independent on compiler changes, so the existing functionality should work as it did (at least I think).
Also I've just extended the description of --read-write
a bit to match the current situation, hopefully that's OK (JavaMain.scala:60-62
):
opt[Unit]('w', "read-write") action { (x, c) =>
c.copy(runtime = c.runtime.copy(readWrite = true, autoRead = false))
} text("generate read-write support in classes (implies --no-auto-read, Java and Python only, default: read-only)")
Are there some guidances how to support serialization for other languages? Especially interested in the JS one... Happy to contribute where possible.
I'm taking a look at
- [ ] kaitai_struct_tests: The aggregate/convert_to_json script ...
Slowly getting familiar with the tests
submodule. Limiting myself to the 'java'
flows.
Propagating class and method names into TestResult
is trivial and we can get this kind of ci.json
(actual output):
...
"TestBcdUserTypeBe#testBcdUserTypeBe": {
"status": "passed",
"elapsed": 0.001
},
"TestBcdUserTypeBe#testReadWriteRoundtrip": {
"status": "passed",
"elapsed": 0.006
},
...
However, tagging the results with is_kst: true
is not so trivial. kst_adoption.log
currently relies on congruence between filename and test name. Since serialization
departs from this congruence, we'll have to tweak some kst_adoption
machinery.
I was thinking we could generate kst_adoption_spec.log
and kst_adoption_specwrite.log
, instead of a single kst_adoption.log
, by doing Dir.glob('spec/java/src/**/specwrite/Test*.java')
and Dir.glob('spec/java/src/**/spec/Test*.java')
in kst-adoption-report
. Then we give TestResult
an is_specwrite: Boolean
field that is derived from the presence of specwrite
in the FQN of the class name, and now convert_to_json#add_kst_adoption
can tag accordingly.
What do you think?
I haven't looked at the full pipeline, but since serialization
affects test names in several ways, we may want to anticipate the implications for the order in which they appear on https://ci.kaitai.io/.
Moreover, perhaps I could start working on kaitai_ci_ui
to prepare it for adding the write test results (@GreyCat mentioned this on Gitter). I've created a ticket to refine the layout in https://github.com/kaitai-io/kaitai_ci_ui/issues/44.
I've took liberty at merging kaitai_struct_java_runtime — see https://github.com/kaitai-io/kaitai_struct_java_runtime/commit/e34e5a9b4d3e02a323fdbb000b61d0dd0fb9f776. A few very minor conflicts, otherwise all read-only tests work just as before.
Also merged https://github.com/kaitai-io/kaitai_struct_python_runtime — see https://github.com/kaitai-io/kaitai_struct_python_runtime/commit/704995acdb2f114e15809cf5f469e3a80a2dddad.
No changes in tests, no conflicts.
Also merged https://github.com/kaitai-io/kaitai_struct_python_runtime — see kaitai-io/kaitai_struct_python_runtime@704995a.
Seems a bit weird to me to do this as a fast-forward merge, but OK.
I've just resolved the merge conflicts on branch https://github.com/kaitai-io/kaitai_struct_compiler/tree/serialization.
How is it going? I wanner to do sth for go.
There is a question: we should imply function to calc length?
meta:
id: value_instances
seq:
- id: len_data_raw
type: u1
- id: data
size: len_data
instances:
len_data: # for read
value: len_data_raw - 3
calc_length: # for write
value: data.length + 3
calc_length
for write.
ValueInstances r = new ValueInstances();
r.setData(new byte[] { 1, 2, 3, 4, 5 });
r.setLenDataRaw(r.calcLength());
System.out.println(r.lenData()); // => 5
- array length reduce: we cannot range array.
seq:
- id: length
type: u2be
- id: params
type: params
size: len_params
instances:
len_params:
value: length - 2
calc_length:
value: params.calc_length + 2
types:
params:
seq:
- id: params
type: parameter
repeat: eos
instances:
calc_length:
value: params.reduce(0, r + i.calc_length)
parameter:
seq:
- id: length
type: u2be
- id: data
size: len_data
instances:
len_data:
value: length - 2
calc_length:
value: data.length + 2
The implementation of Golang is almost complete, but there have been some breaking changes and some areas that do not meet my requirements. If no one responds to my questions, then I will have to implement it according to my own ideas. As a result, there is a high probability that it won't be merged into the master branch.
What is the status here? Thanks for everyone working on this. I would like to use Kaitai for numerous Gadgetbridge gadgets, and serialization is paramount there.
Is there any chance of serialization support in the main java package? Maybe even in the web IDE?