Croissant diagram
To help users understand Croissant dataset descriptions, it would be useful to generate diagrams that represent the contents of a dataset.
A Croissant diagram could consist of two layers:
- (Bottom) Resources: Graphical rendition of the FileObject and FileSet contents of a dataset, with links that represent their dependencies (e.g., a FileSet of images extracted from an archive FileObject).
- (Top) The RecordSets defined in the dataset, with the Field entries they contain, and links to the sources of their data (FileObject and FileSet in the resources layer).
Such diagrams can be included in the documentation of a dataset, or in the Croissant dataset viewer.
To generate them, we can rely on an existing package such as Mermaid or nomnoml. These packages take a textual representation of the diagram as input, which the validator can easily generate from the object representation it creates when it parses a Croissant dataset.
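
As a rough illustration of the idea, here is a minimal sketch that emits Mermaid flowchart text for the two layers. The `Resource`, `RecordField`, and `RecordSet` classes and the `to_mermaid` and `node_id` helpers are hypothetical stand-ins invented for this example; they are not the validator's actual object model, which would need to be mapped onto something similar.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical, simplified stand-ins for the parsed dataset objects.
@dataclass
class Resource:
    name: str
    kind: str  # "FileObject" or "FileSet"
    contained_in: List[str] = field(default_factory=list)


@dataclass
class RecordField:
    name: str
    source: Optional[str] = None  # name of the resource providing the data


@dataclass
class RecordSet:
    name: str
    fields: List[RecordField] = field(default_factory=list)


def node_id(name: str) -> str:
    """Turn an arbitrary name into a safe Mermaid node identifier."""
    return "n_" + re.sub(r"\W", "_", name)


def to_mermaid(resources: List[Resource], record_sets: List[RecordSet]) -> str:
    """Render the resources and RecordSets layers as Mermaid flowchart text."""
    lines = ["flowchart TB"]

    # Top layer: one nested subgraph per RecordSet, one node per field.
    lines.append('  subgraph recordsets["RecordSets"]')
    for rs in record_sets:
        lines.append(f'    subgraph {node_id(rs.name)}["{rs.name}"]')
        for f in rs.fields:
            fid = node_id(rs.name + "/" + f.name)
            lines.append(f'      {fid}["{f.name}"]')
        lines.append("    end")
    lines.append("  end")

    # Bottom layer: one node per FileObject / FileSet.
    lines.append('  subgraph resources["Resources"]')
    for r in resources:
        lines.append(f'    {node_id(r.name)}["{r.kind}: {r.name}"]')
    lines.append("  end")

    # Solid links: dependencies between resources.
    for r in resources:
        for parent in r.contained_in:
            lines.append(f"  {node_id(r.name)} -->|containedIn| {node_id(parent)}")

    # Dotted links: each field points at the resource its data comes from.
    for rs in record_sets:
        for f in rs.fields:
            if f.source:
                fid = node_id(rs.name + "/" + f.name)
                lines.append(f"  {fid} -.->|source| {node_id(f.source)}")

    return "\n".join(lines)


# Example: a FileSet of images extracted from an archive FileObject.
resources = [
    Resource("archive.zip", "FileObject"),
    Resource("images", "FileSet", contained_in=["archive.zip"]),
]
record_sets = [RecordSet("examples", [RecordField("image", source="images")])]

diagram = to_mermaid(resources, record_sets)
print(diagram)
```

The printed text can be pasted into any Mermaid renderer. The exact placement of the two layers is up to Mermaid's layout engine, so the top/bottom arrangement should be treated as approximate; a nomnoml backend would follow the same pattern with a different output syntax.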