
Consider use of ATD for the definition of Catala types representation in the different backends

Open AltGr opened this issue 1 year ago • 4 comments

This PR has some issues related to this.

In several contexts, we want clearly defined datatypes for interchange: the types of arguments and return values when compiled Catala code is used as a library, the exploration of Catala execution traces, etc. The need is already apparent in the explanation backends and in the way our OCaml runtime has built-in JSON output.

ATD is "a language for defining data types across multiple programming languages and multiple data formats." It is mature, used in the wild, and seems actively maintained. The syntax is simple and OCaml-like, and it allows annotations aimed at specific backends. These definitions can be compiled into type definitions for OCaml, Python, Java, TypeScript, etc. that we could reuse in our different backends.

In addition, ATD comes with generators that provide I/O functions for the defined types in the various backends, through both JSON and biniou (a custom, more efficient binary format). Annotations can be used to customise the representation (e.g. whether dicts or associative lists should be used).
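To make the "dicts or associative lists" choice concrete, here is a minimal Python sketch of the two JSON shapes involved. It does not use ATD itself; the data and function names are made up for illustration. As far as I understand, atdgen switches an association list from the first shape to the second with an annotation such as `<json repr="object">`, but check the exact spelling against the ATD docs.

```python
import json

# Hypothetical data: an association list of (key, value) pairs.
pairs = [("income", 1200), ("children", 2)]

def to_json_assoc_list(items):
    """Encode as a JSON array of [key, value] pairs.

    Preserves order and allows duplicate or non-string keys.
    """
    return json.dumps([[k, v] for k, v in items])

def to_json_object(items):
    """Encode as a JSON object.

    More idiomatic for JSON consumers, but keys must be unique strings.
    """
    return json.dumps(dict(items))

print(to_json_assoc_list(pairs))  # [["income", 1200], ["children", 2]]
print(to_json_object(pairs))      # {"income": 1200, "children": 2}
```

The point of doing this through annotations is that the choice stays in the type definition, so every backend would read and write the same shape.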

One point that may be of interest for our explanations web app is that JSON schemas can also be generated.


A way to leverage it could be to have Catala generate, alongside its output source code in the backend target language, an ATD file that would be compiled to the expected type definitions for the same language. We can then call the appropriate atdgen command and use the resulting type definitions (or embed the relevant part of atdgen and run it as a library from Catala).
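As a rough illustration of the generation step, the sketch below emits an ATD record definition from a struct description. Everything in it is hypothetical: the `ATD_TYPES` mapping, the `atd_record` helper, and the `household` struct are invented for the example. A real mapping would need more care (for instance, Catala integers are arbitrary-precision, while ATD's `int` is not).

```python
# Hypothetical mapping from Catala-like type names to ATD built-in types.
ATD_TYPES = {
    "integer": "int",
    "decimal": "float",
    "boolean": "bool",
    "text": "string",
}

def atd_record(name, fields):
    """Render an ATD record type from a list of (field_name, type_name) pairs."""
    lines = [f"type {name} = {{"]
    for field_name, catala_type in fields:
        lines.append(f"  {field_name} : {ATD_TYPES[catala_type]};")
    lines.append("}")
    return "\n".join(lines)

# Hypothetical struct, standing in for one declared in a Catala program.
household = atd_record(
    "household",
    [("nb_children", "integer"), ("is_eligible", "boolean")],
)
print(household)
# type household = {
#   nb_children : int;
#   is_eligible : bool;
# }
```

The resulting text would be written to a `.atd` file next to the generated backend code and passed to the generator for that target (for OCaml, `atdgen -t` for the types and `atdgen -j` for the JSON functions).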

It could be interesting for the user to be able to tweak these files with annotations, comments, or custom data validators, but that leaves open the question of how we can synchronise them.

AltGr avatar Feb 26 '24 10:02 AltGr