Add support for examples to Croissant datasets

Open benjelloun opened this issue 2 years ago • 0 comments

Many dataset repositories show a few rows of data as examples, to help users understand the contents of the data. To support that, we propose adding the ability to inline examples in a Croissant dataset.

A key benefit of examples is that they are freely accessible as part of the dataset metadata, while data files might be subject to additional access restrictions or license requirements.

Examples should be specified at the level of a RecordSet. Syntactically, examples can be nearly identical to the data property, which allows inlining the entire contents of a RecordSet.
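To make the idea concrete, here is a minimal sketch of what a RecordSet with inlined examples might look like, built as a plain Python dict and serialized to JSON. The `examples` key is the property proposed in this issue, not an established one; the RecordSet name, the field names, the use of bare field names as record keys, and the `check_examples` helper are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical RecordSet. "examples" is the property proposed in this issue;
# it is assumed here to take the same shape as inlined "data": a list of
# records keyed by field name.
record_set = {
    "@type": "cr:RecordSet",
    "name": "ratings",
    "field": [
        {"@type": "cr:Field", "name": "user_id", "dataType": "sc:Integer"},
        {"@type": "cr:Field", "name": "rating", "dataType": "sc:Float"},
    ],
    "examples": [
        {"user_id": 1, "rating": 4.5},
        {"user_id": 2, "rating": 3.0},
    ],
}


def check_examples(rs: dict) -> bool:
    """Return True if every example record only uses declared field names."""
    declared = {field["name"] for field in rs["field"]}
    return all(set(example) <= declared for example in rs.get("examples", []))


print(json.dumps(record_set, indent=2))
```

Since the examples live in the metadata itself, a repository could render them as a preview table without fetching any of the underlying data files.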

One difficulty is providing data for binary fields (like images), but we have the same problem with inlined data as well. Maybe we can support URLs that link to the content, or base64-encoded binary content.
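The two options for binary fields could be sketched as follows. This is only an illustration of the idea, not a proposed syntax: the `contentUrl`, `encoding`, and `content` keys and all three helper functions are assumptions made for this example.

```python
import base64
from typing import Optional


def inline_binary_example(content: bytes) -> dict:
    """Option 1: embed the bytes directly as base64 text (hypothetical keys)."""
    return {
        "encoding": "base64",
        "content": base64.b64encode(content).decode("ascii"),
    }


def linked_binary_example(url: str) -> dict:
    """Option 2: point to the content by URL instead of embedding it."""
    return {"contentUrl": url}


def decode_binary_example(value: dict) -> Optional[bytes]:
    """Return the bytes of an inlined example, or None for a linked one.

    Linked examples are not fetched here; a consumer would do that separately.
    """
    if value.get("encoding") == "base64":
        return base64.b64decode(value["content"])
    return None


# The 8-byte PNG signature stands in for real image bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
inlined = inline_binary_example(png_signature)
linked = linked_binary_example("https://example.com/images/0001.png")
```

Base64 keeps the example self-contained but grows the payload by roughly a third, so links are probably the better fit for anything larger than a thumbnail.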

benjelloun, Nov 06 '23 16:11