Design of the p4c-model compiler
While reviewing code for multiple backends I have realized that writing a backend is onerous and requires many manual checks that could be fully automated. In particular, each backend encodes a lot of information about the model in C++ code that checks that the model is used in a well-formed way and extracts information from the model.
This is an initial PR for a compiler that would automate much of the task of validating a model and would simplify writing a backend. Similar to the ir-generator compiler, which generates a lot of C++ boilerplate code for p4c, this compiler would generate boilerplate code for a specific model. All backends that compile for that model would be able to reuse this code. For example, this compiler could generate code for psa.p4 which could be used in all backends for PSA, including psa-ebpf and psa-dpdk.
This initial PR contains only a README.md, which briefly describes the design. I expect that this compiler will not be too difficult to implement. Migrating existing backends to use this infrastructure may be more complicated, but at least this should make writing new backends significantly easier.
Please comment on this design @ChrisDodd @osinstom @rst0git @fruffy @hanw et al. Of course, contributions are welcome.
Makes perfect sense. I like the idea of re-using model-specific code among different backends.
Out of curiosity, does this use of the word "model" correspond to a P4_16 architecture?
The input to this compiler will be a *model.p4 file.
I liked the idea.
If we can simplify the way to visit the instantiated blocks, that would be nice. For example, multiple backends have code written like lines 55 to 61 in this file: https://github.com/p4lang/p4c/blob/c794fb4aa8968f976bb0842b45543fdc5fe9fab1/backends/ebpf/psa/backend.cpp#L55
To get access to the instantiated control and parser blocks, a backend often needs to do the following:
- apply an evaluator
- get access to toplevel
- get access to main with toplevel->getMain()
- iterate through all the constantValue members in main
The boilerplate code is more complex if the architecture defines a main package that has a hierarchy of packages. Hopefully the model compiler can make this simpler.
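To make the steps above concrete, here is a minimal self-contained sketch of that traversal, including the nested-package case. The `Block` and `ToplevelBlock` structs, the helper functions, and the block names are hypothetical stand-ins that only mimic the shape of what the evaluator produces; they are not the actual p4c classes, and the sample hierarchy is only loosely PSA-shaped.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instantiated block (NOT a real p4c class).
struct Block {
    std::string kind;  // "package", "parser" or "control"
    std::string name;
    // Constructor parameter name -> instantiated value, loosely mirroring
    // the constantValue members mentioned above.
    std::map<std::string, std::shared_ptr<Block>> constantValue;
};

// Hypothetical stand-in for the evaluator result.
struct ToplevelBlock {
    std::shared_ptr<Block> main;
    const Block *getMain() const { return main.get(); }
};

// Walk a package and collect every parser and control block below it,
// descending into nested packages.
void collectBlocks(const Block *pkg, std::vector<const Block *> &out) {
    for (const auto &entry : pkg->constantValue) {
        const std::shared_ptr<Block> &value = entry.second;
        if (!value) continue;
        if (value->kind == "package") {
            collectBlocks(value.get(), out);
        } else if (value->kind == "parser" || value->kind == "control") {
            out.push_back(value.get());
        }
    }
}

// The per-backend boilerplate: find main, then iterate over its members.
std::vector<const Block *> programmableBlocks(const ToplevelBlock &top) {
    std::vector<const Block *> out;
    const Block *mainPkg = top.getMain();
    if (mainPkg == nullptr) return out;  // program has no main package
    collectBlocks(mainPkg, out);
    return out;
}

std::shared_ptr<Block> makeBlock(std::string kind, std::string name) {
    auto b = std::make_shared<Block>();
    b->kind = std::move(kind);
    b->name = std::move(name);
    return b;
}

// Builds a main package containing two nested pipeline packages, each with
// one parser and one control. All names here are made up for illustration.
ToplevelBlock sampleToplevel() {
    auto ingress = makeBlock("package", "IngressPipeline");
    ingress->constantValue["ip"] = makeBlock("parser", "IngressParserImpl");
    ingress->constantValue["ig"] = makeBlock("control", "IngressImpl");

    auto egress = makeBlock("package", "EgressPipeline");
    egress->constantValue["ep"] = makeBlock("parser", "EgressParserImpl");
    egress->constantValue["eg"] = makeBlock("control", "EgressImpl");

    auto mainPkg = makeBlock("package", "main");
    mainPkg->constantValue["ingress"] = ingress;
    mainPkg->constantValue["egress"] = egress;

    ToplevelBlock top;
    top.main = mainPkg;
    return top;
}
```

Calling `programmableBlocks(sampleToplevel())` yields the two parsers and two controls of the sample hierarchy. Every backend for a given architecture repeats some variant of this walk, along with checks that each member has the expected kind, which is the part a model compiler could presumably generate from the model file.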
Exactly. And also automate much error checking.
What would be the next steps for this? This is definitely useful for P4Testgen and all its back ends. But I wonder how we could adopt it.
Indeed, it is not obvious that P4Test will actually benefit from this, but writing a new backend for an architecture should be significantly easier. One thing I haven't figured out is versioning: how to write a backend that supports multiple versions of the same architecture. If there is someone with free cycles to work on this I will be happy to collaborate.
Versioning will complicate the model compiler, so one could consider it later or not support it at all.