edalize create schema for EDAM format

It would be nice to have a schema that defines the constraints on the fields of the EDAM format.

An obvious language-agnostic candidate would be JSON Schema, but there are higher-level options like CUE which are easier to author, make it easier to mix constraints and data, and can also generate JSON Schema for better interop.

Dec 10 '21 11:12 proppy

Some pointers from @olofk:

https://edalize.readthedocs.io/en/latest/edam/api.html
https://github.com/olofk/fusesoc/blob/master/fusesoc/edalizer.py#L235

Dec 10 '21 11:12 proppy

Given the heavy usage of python in the opensource EDA tooling, using https://www.python.org/dev/peps/pep-0484/, https://www.python.org/dev/peps/pep-0526/ and https://www.python.org/dev/peps/pep-0557/ could be an easier way to describe the format while allowing direct integration with the edalize codebase.
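To give a rough idea, here is a minimal sketch of what a dataclass-based description could look like. The field names (`name`, `toplevel`, `files`, `file_type`, `is_include_file`, `tool_options`) are assumptions loosely modeled on the EDAM API docs, and the `blinky` values are made up; this is not a complete or authoritative model of the format.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class File:
    # Assumed subset of the per-file keys; the real format may have more.
    name: str
    file_type: str = ""
    is_include_file: bool = False


@dataclass
class Edam:
    # Assumed subset of the top-level keys.
    name: str
    toplevel: str = ""
    files: List[File] = field(default_factory=list)
    tool_options: Dict[str, Dict[str, object]] = field(default_factory=dict)


edam = Edam(
    name="blinky",
    toplevel="blinky_top",
    files=[File(name="blinky.v", file_type="verilogSource")],
)

# asdict() recursively converts nested dataclasses into plain dicts,
# which is the shape a dict-based EDAM consumer would expect.
edam_dict = asdict(edam)
```

Note that the annotations only document the intended types here; nothing is checked at runtime.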

Some wrappers like https://pydantic-docs.helpmanual.io/ can also generate JSON Schema from those annotations for better interoperability with other languages/tools: https://pydantic-docs.helpmanual.io/usage/schema/
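To illustrate the idea of deriving a schema from annotations without pulling in a dependency, here is a stdlib-only sketch. It is a hand-written stand-in, not pydantic's implementation or API: `to_schema` is a hypothetical helper that only handles a few primitive types, `List[...]` and nested dataclasses, and the `File`/`Edam` fields are assumed for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, get_args, get_origin, get_type_hints

_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_schema(tp):
    """Map a small subset of type annotations to a JSON-Schema-like dict."""
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    if get_origin(tp) is list:
        (item,) = get_args(tp)
        return {"type": "array", "items": to_schema(item)}
    if hasattr(tp, "__dataclass_fields__"):
        hints = get_type_hints(tp)
        return {
            "type": "object",
            "properties": {f.name: to_schema(hints[f.name]) for f in fields(tp)},
        }
    raise TypeError(f"unsupported annotation: {tp!r}")


@dataclass
class File:
    name: str
    is_include_file: bool


@dataclass
class Edam:
    name: str
    files: List[File]


schema = to_schema(Edam)
```

A real library would also cover optional fields, unions, defaults and descriptions, which is the main reason to prefer one over a helper like this.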

Dec 10 '21 12:12 proppy

https://github.com/protocolbuffers/protobuf or https://capnproto.org/ could also be an option if we're looking for x-language interop and possible over-the-wire invocation.

The latter in particular is also used for https://fpga-interchange-schema.readthedocs.io/, which could bring interesting synergy between the two projects.

Dec 10 '21 14:12 proppy

@olofk could you point me to examples of EDAM files that we could test schema validation against?

Dec 13 '21 04:12 proppy

@proppy Found some on my disk and put them in a gist here: https://gist.github.com/olofk/cbf0625ff697232b3519e32a6a85396a Let me know if there's something in particular you would like.

Dec 13 '21 10:12 olofk

@olofk, I used https://pypi.org/project/genson/ to generate a JSON schema from those: https://gist.github.com/proppy/113ba53a8a339baf7f906f05db0b0883#file-edam-schema-json

This should of course be refined with a proper description property for every field, but it gives an idea of what it could look like.
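For readers unfamiliar with schema inference, the general idea can be sketched in a few lines of plain Python. This is a simplified illustration and not genson's algorithm or API: `infer_schema` is a hypothetical helper, it looks at a single example only, and the sample document below is made up rather than taken from the gist.

```python
import json


def infer_schema(value):
    """Infer a JSON-Schema-like description from one example value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        # Simplification: describe the items using the first element only.
        return {"type": "array", "items": infer_schema(value[0]) if value else {}}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": sorted(value),
        }
    raise TypeError(f"unsupported value: {value!r}")


# Made-up EDAM-like document, for illustration only.
example = json.loads(
    '{"name": "blinky", "files": [{"name": "blinky.v", "is_include_file": false}]}'
)
inferred = infer_schema(example)
```

Because everything seen in the example ends up in `required`, an inferred schema tends to be stricter than intended, which is another reason it needs manual refinement.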

Dec 16 '21 14:12 proppy

I used https://pypi.org/project/datamodel-code-generator/ to generate a pydantic model from https://github.com/olofk/edalize/issues/288#issuecomment-995862085: https://gist.github.com/proppy/113ba53a8a339baf7f906f05db0b0883#file-edam_model-py to get an idea of what it would look like to define the schema using regular Python typing annotations.

Dec 16 '21 14:12 proppy

And https://gist.github.com/proppy/113ba53a8a339baf7f906f05db0b0883#file-edam-cue gives an idea of what it would look like in https://cuelang.org/ (I personally have a bias :cupid: for that one).

Dec 16 '21 14:12 proppy

> Given the heavy usage of python in the opensource EDA tooling, using https://www.python.org/dev/peps/pep-0484/, https://www.python.org/dev/peps/pep-0526/ and https://www.python.org/dev/peps/pep-0557/ could be an easier way to describe the format while allowing direct integration with the edalize codebase.

@proppy, the usage of dataclasses for providing an alternative structured Python implementation that could read .core files was prototyped in umarcor.github.io/osvb/apis/core. Unfortunately, the dataclass approach has issues with polymorphic fields/objects/values. Edalize has some polymorphic fields. See yukihiko-shinoda/yaml-dataclass-config#19. That's why "CAPI 3" is used in OSVB, meaning that the current CAPI format might need to be adapted in order to support a dataclass approach.

Dec 17 '21 18:12 umarcor

@umarcor that doesn't seem to be an issue with dataclasses themselves, but rather with the dataclasses-json wrapper library, see https://gist.github.com/proppy/073181576ff968ad8abf77f8b7b258ac

dataclasses are capable of modeling both types, but don't perform any validation
pydantic's integration with dataclasses does seem to perform accurate validation
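A stdlib-only sketch of the first point, separate from the gist: a plain dataclass can declare a polymorphic field with `Union`, but it accepts any value at runtime. The `toplevel` field typed as "string or list of strings" is an assumption for illustration, and the `__post_init__` check in `Checked` is a hand-written stand-in for the kind of validation a library such as pydantic would do, not its actual mechanism.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Plain:
    # The annotation documents both accepted shapes but is never enforced.
    toplevel: Union[str, List[str]]


@dataclass
class Checked:
    toplevel: Union[str, List[str]]

    def __post_init__(self):
        # Manual runtime check covering both variants of the union.
        ok = isinstance(self.toplevel, str) or (
            isinstance(self.toplevel, list)
            and all(isinstance(t, str) for t in self.toplevel)
        )
        if not ok:
            raise TypeError(f"toplevel must be str or list of str, got {self.toplevel!r}")


# Both shapes of the polymorphic field can be modeled.
single = Plain(toplevel="top")
several = Plain(toplevel=["top_a", "top_b"])

# A wrong type is accepted silently by the plain dataclass.
unchecked = Plain(toplevel=42)

# The same value is rejected once a check is added.
try:
    Checked(toplevel=42)
    rejected = False
except TypeError:
    rejected = True
```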

Dec 17 '21 19:12 proppy

@proppy, awesome! Thanks a lot for that example!

Dec 17 '21 19:12 umarcor

@proppy That looks great to me. The cue files are very readable. Happy to move forward with this.

Despite the heavy use of Python within this generation of FOSSi EDA tools I'm reluctant to build in any Python specifics into public interfaces, which is why I prefer a language-agnostic solution like this.

How about I create a new edam repo under the FuseSoC github org (to avoid creating yet another org, or perhaps some other org could be a better home?), give you write access and then you can just dump what you got so far in there to get things started?

Dec 18 '21 09:12 olofk

@olofk maybe we can have it live in a contrib directory in the main edalize repo first and move it out as a separate thing if it becomes too invasive.

I'd like to make sure we maintain the spec alongside the tool, and this seems easier to achieve if we can have atomic commits that update both in the same repo.

wdyt?

Dec 21 '21 14:12 proppy

re: something under the FuseSoC github org: now that I think more about it, would it also make sense to have CUE schema definitions for .core files? (Maybe we could even express the EDAM definitions as a transformation from the core ones using CUE expressions?)

wdyt @olofk?

Jan 05 '22 16:01 proppy

Yes @proppy . Let's keep it in Edalize for now. And cue files for CAPI2 core files would be fantastic. I want to kill myself every time I need to make a change to the CAPI2 format because the code is damn near impossible to understand, and I wrote it after all

Jan 12 '22 21:01 olofk
