create schema for EDAM format

Open proppy opened this issue 3 years ago • 15 comments

It would be nice to have a schema that defines the constraints on the fields of the EDAM format.

An obvious language-agnostic candidate would be JSON Schema, but there are higher-level options like CUE, which are easier to author, make it easier to mix constraints and data, and can also generate JSON Schema for better interop.

proppy avatar Dec 10 '21 11:12 proppy

Some pointers from @olofk:

  • https://edalize.readthedocs.io/en/latest/edam/api.html
  • https://github.com/olofk/fusesoc/blob/master/fusesoc/edalizer.py#L235

proppy avatar Dec 10 '21 11:12 proppy

Given the heavy usage of Python in open-source EDA tooling, using type hints (https://www.python.org/dev/peps/pep-0484/), variable annotations (https://www.python.org/dev/peps/pep-0526/) and dataclasses (https://www.python.org/dev/peps/pep-0557/) could be an easier way to describe the format while allowing direct integration with the edalize codebase.
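To make the idea concrete, here is a minimal sketch of what an annotated description could look like using only the standard library. The class names (`EdamFile`, `Edam`) and the helper `edam_from_dict` are hypothetical, the field names are my reading of the EDAM API docs linked above, and only a subset of fields is covered. Typing `toplevel` as an optional string is a simplification.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class EdamFile:
    """One entry of the EDAM `files` list (subset of fields, assumed names)."""
    name: str
    file_type: str = ""
    is_include_file: bool = False
    logical_name: str = ""


@dataclass
class Edam:
    """Top-level EDAM description (illustrative subset only)."""
    name: str
    files: List[EdamFile] = field(default_factory=list)
    toplevel: Optional[str] = None  # simplification for this sketch
    parameters: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    tool_options: Dict[str, Dict[str, Any]] = field(default_factory=dict)


def edam_from_dict(raw: Dict[str, Any]) -> Edam:
    """Build an Edam object from a plain dict, e.g. one loaded from YAML or JSON."""
    files = [EdamFile(**f) for f in raw.get("files", [])]
    rest = {k: v for k, v in raw.items() if k != "files"}
    # Unknown keys raise TypeError here, which acts as a crude structural check.
    return Edam(files=files, **rest)


example = edam_from_dict({
    "name": "blinky",
    "toplevel": "blinky_top",
    "files": [{"name": "blinky.v", "file_type": "verilogSource"}],
    "tool_options": {"icarus": {"iverilog_options": ["-g2012"]}},
})
print(example.files[0])
```

Note that plain dataclasses only describe the shape. They do not check that the values match the annotations.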

Wrappers like https://pydantic-docs.helpmanual.io/ can also generate JSON Schema from those annotations for better interoperability with other languages and tools: https://pydantic-docs.helpmanual.io/usage/schema/
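As a rough illustration of the mechanism (not pydantic's API or implementation), the sketch below walks dataclass annotations with the standard library and emits a JSON-Schema-like dict. The function `to_schema` and both classes are hypothetical, and only a handful of types are supported; a real tool such as pydantic handles far more cases.

```python
import dataclasses
import json
from dataclasses import dataclass, field
from typing import List, get_args, get_origin, get_type_hints

# Mapping from Python primitives to JSON Schema type names.
_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_schema(tp) -> dict:
    """Translate a (very small) subset of annotations into a schema dict."""
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    if get_origin(tp) is list:
        (item,) = get_args(tp)
        return {"type": "array", "items": to_schema(item)}
    if dataclasses.is_dataclass(tp):
        # A field is required when it has neither a default nor a default factory.
        required = [
            f.name for f in dataclasses.fields(tp)
            if f.default is dataclasses.MISSING
            and f.default_factory is dataclasses.MISSING
        ]
        return {
            "type": "object",
            "properties": {n: to_schema(t) for n, t in get_type_hints(tp).items()},
            "required": required,
        }
    raise TypeError(f"unsupported annotation: {tp!r}")


@dataclass
class EdamFile:  # hypothetical subset of an EDAM file entry
    name: str
    file_type: str = ""
    is_include_file: bool = False


@dataclass
class Edam:  # hypothetical subset of the top-level EDAM object
    name: str
    files: List[EdamFile] = field(default_factory=list)


schema = to_schema(Edam)
print(json.dumps(schema, indent=2))
```

The resulting dict can be serialized and handed to any JSON Schema validator in another language, which is the interoperability benefit mentioned above.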

proppy avatar Dec 10 '21 12:12 proppy

https://github.com/protocolbuffers/protobuf or https://capnproto.org/ could also be an option if we're looking for x-language interop and possible over-the-wire invocation.

The latter in particular is also used for https://fpga-interchange-schema.readthedocs.io/, which could bring interesting synergy between the two projects.

proppy avatar Dec 10 '21 14:12 proppy

@olofk could you point me to examples of EDAM files that we could test schema validation against?

proppy avatar Dec 13 '21 04:12 proppy

@proppy Found some on my disk and put them in a gist here: https://gist.github.com/olofk/cbf0625ff697232b3519e32a6a85396a. Let me know if there's something in particular you would like.

olofk avatar Dec 13 '21 10:12 olofk

@olofk, I used https://pypi.org/project/genson/ to generate a JSON Schema from those: https://gist.github.com/proppy/113ba53a8a339baf7f906f05db0b0883#file-edam-schema-json

This should of course be refined with a proper description property for every field, but it gives an idea of what it could look like.
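For readers unfamiliar with schema inference, here is a toy stand-in for what such a tool does. It is not genson's API or algorithm (genson merges many samples, this looks at one), and `infer_schema` and the sample dict are made up for illustration.

```python
from typing import Any, Dict


def infer_schema(value: Any) -> Dict[str, Any]:
    """Infer a JSON-Schema-like dict from a single example value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        schema: Dict[str, Any] = {"type": "array"}
        if value:
            # Simplification: only the first element decides the item schema.
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": sorted(value),
        }
    raise TypeError(f"unsupported value: {value!r}")


# Hypothetical EDAM-like sample, not taken from the gist.
sample = {
    "name": "blinky",
    "files": [{"name": "blinky.v", "file_type": "verilogSource"}],
}
inferred = infer_schema(sample)

# Manual refinement step: attach a description to a field.
inferred["properties"]["name"]["description"] = "Name of the design (assumed meaning)"
print(inferred)
```

An inferred schema only reflects what the samples happen to contain, which is why the hand-written descriptions and constraints still matter.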

proppy avatar Dec 16 '21 14:12 proppy

I used https://pypi.org/project/datamodel-code-generator/ to generate a pydantic model from https://github.com/olofk/edalize/issues/288#issuecomment-995862085: https://gist.github.com/proppy/113ba53a8a339baf7f906f05db0b0883#file-edam_model-py, to get an idea of what it would look like to define the schema using regular Python typing annotations.

proppy avatar Dec 16 '21 14:12 proppy

And https://gist.github.com/proppy/113ba53a8a339baf7f906f05db0b0883#file-edam-cue gives an idea of what it would look like in https://cuelang.org/ (I personally have a bias :cupid: for that one).

proppy avatar Dec 16 '21 14:12 proppy

Given the heavy usage of Python in open-source EDA tooling, using type hints (https://www.python.org/dev/peps/pep-0484/), variable annotations (https://www.python.org/dev/peps/pep-0526/) and dataclasses (https://www.python.org/dev/peps/pep-0557/) could be an easier way to describe the format while allowing direct integration with the edalize codebase.

@proppy, the usage of dataclasses for providing an alternative structured Python implementation that could read .core files was prototyped in umarcor.github.io/osvb/apis/core. Unfortunately, the dataclass approach has issues with polymorphic fields/objects/values. Edalize has some polymorphic fields. See yukihiko-shinoda/yaml-dataclass-config#19. That's why "CAPI 3" is used in OSVB, meaning that the current CAPI format might need to be adapted in order to support a dataclass approach.

umarcor avatar Dec 17 '21 18:12 umarcor

@umarcor that doesn't seem to be an issue with dataclasses themselves, but rather with the dataclasses-json wrapper library, see https://gist.github.com/proppy/073181576ff968ad8abf77f8b7b258ac

  • dataclasses are capable of modeling both types, but don't perform any validation
  • pydantic's integration with dataclasses does seem to perform accurate validation
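The two points above can be sketched with the standard library alone. This is not the code from the gist and does not use pydantic; the classes, the `matches` helper and the choice of `toplevel` as the polymorphic field are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Union, get_args, get_origin, get_type_hints


def matches(value, tp) -> bool:
    """Check a value against str, List[...] and Union[...] annotations only."""
    origin = get_origin(tp)
    if origin is Union:
        return any(matches(value, arm) for arm in get_args(tp))
    if origin is list:
        (item,) = get_args(tp)
        return isinstance(value, list) and all(matches(v, item) for v in value)
    return isinstance(value, tp)


@dataclass
class Target:
    # Polymorphic field: either a single name or a list of names.
    toplevel: Union[str, List[str]]


@dataclass
class CheckedTarget(Target):
    def __post_init__(self):
        # Validate every annotated field after the generated __init__ has run.
        for name, tp in get_type_hints(type(self)).items():
            value = getattr(self, name)
            if not matches(value, tp):
                raise TypeError(f"{name}: expected {tp}, got {value!r}")


# A plain dataclass models the union but accepts anything at runtime.
unchecked = Target(toplevel=42)

# With an explicit check, both valid shapes pass and the invalid one is rejected.
single = CheckedTarget(toplevel="top")
several = CheckedTarget(toplevel=["top", "tb"])
try:
    CheckedTarget(toplevel=42)
    rejected = False
except TypeError:
    rejected = True
print(unchecked, single, several, rejected)
```

A validation layer like pydantic plays the role of `__post_init__` here, with much broader type coverage.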

proppy avatar Dec 17 '21 19:12 proppy

@proppy, awesome! Thanks a lot for that example!

umarcor avatar Dec 17 '21 19:12 umarcor

@proppy That looks great to me. The cue files are very readable. Happy to move forward with this.

Despite the heavy use of Python within this generation of FOSSi EDA tools, I'm reluctant to build any Python specifics into public interfaces, which is why I prefer a language-agnostic solution like this.

How about I create a new edam repo under the FuseSoC github org (to avoid creating yet another org, or perhaps some other org could be a better home?), give you write access and then you can just dump what you got so far in there to get things started?

olofk avatar Dec 18 '21 09:12 olofk

@olofk maybe we can have it live in a contrib directory in the main edalize repo first and move it out as a separate thing if it becomes too invasive.

I'd like to make sure we maintain the spec along with the tool, and this seems easier to achieve if we can have atomic commits that update both in the same repo.

wdyt?

proppy avatar Dec 21 '21 14:12 proppy

re: something under the FuseSoC github org, now that I think more about it, would it also make sense to have CUE schema definitions for .core files? (Maybe we could even express the EDAM definitions as a transformation from the core ones using CUE expressions?)

wdyt @olofk?

proppy avatar Jan 05 '22 16:01 proppy

Yes @proppy. Let's keep it in Edalize for now. And cue files for CAPI2 core files would be fantastic. I want to kill myself every time I need to make a change to the CAPI2 format because the code is damn near impossible to understand, and I wrote it, after all.

olofk avatar Jan 12 '22 21:01 olofk