Omar Benjelloun

Results: 16 issues by Omar Benjelloun

Many dataset repositories show a few rows of data as examples to help users understand the contents of the data. To support that, we propose adding the ability to inline examples...

Croissant datasets in our GitHub repository are represented as standalone JSON-LD files. To make them crawlable, they should be made accessible as web pages with the JSON-LD embedded...
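As a rough illustration of the idea above, the sketch below wraps a JSON-LD description in a minimal HTML page using a `<script type="application/ld+json">` element, which is the usual way to embed JSON-LD so that crawlers can find it. The function name `embed_jsonld`, the page layout, and the toy metadata are hypothetical and not taken from the Croissant repository.

```python
import html
import json


def embed_jsonld(metadata: dict, title: str) -> str:
    """Return a minimal HTML page with `metadata` embedded as JSON-LD.

    Hypothetical helper for illustration only; not part of mlcroissant.
    """
    # Escape "</" so a string value cannot close the <script> element early.
    # "\/" is a valid JSON escape for "/", so the payload parses unchanged.
    payload = json.dumps(metadata, indent=2).replace("</", "<\\/")
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{safe_title}</title>\n"
        '<script type="application/ld+json">\n'
        f"{payload}\n"
        "</script>\n"
        "</head>\n"
        f"<body><h1>{safe_title}</h1></body>\n"
        "</html>\n"
    )


# Toy metadata, assumed for the example; a real Croissant file has more fields.
toy = {
    "@context": {"@vocab": "https://schema.org/"},
    "@type": "Dataset",
    "name": "toy-dataset",
}
page = embed_jsonld(toy, "toy-dataset")
```

The resulting string can be written to an `.html` file and served statically, one page per dataset.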

In this [discussion](https://github.com/mlcommons/croissant/discussions/52), we reached a consensus on how to represent enumerations at the RecordSet and Field level, by introducing an "isEnumeration" boolean property. We should add that mechanism to...
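A minimal sketch of how a consumer might read such a flag, assuming the property appears as a plain `"isEnumeration"` key on RecordSet and Field nodes of the parsed JSON-LD. The key spelling without a namespace prefix, the helper name `find_enumerations`, and the toy dataset are assumptions made for illustration; the final vocabulary may differ.

```python
def find_enumerations(dataset: dict) -> list[str]:
    """Collect names of RecordSets and Fields flagged as enumerations.

    Assumes the flag is stored under the key "isEnumeration" (hypothetical
    spelling) and that RecordSets list their fields under "field".
    """
    names = []
    for record_set in dataset.get("recordSet", []):
        if record_set.get("isEnumeration") is True:
            names.append(record_set["name"])
        for field in record_set.get("field", []):
            if field.get("isEnumeration") is True:
                names.append(f'{record_set["name"]}/{field["name"]}')
    return names


# Toy dataset description, assumed for the example.
toy = {
    "recordSet": [
        {"name": "labels", "isEnumeration": True, "field": [{"name": "label"}]},
        {
            "name": "images",
            "field": [
                {"name": "split", "isEnumeration": True},
                {"name": "content"},
            ],
        },
    ]
}
enumerations = find_enumerations(toy)
```

Here `enumerations` holds the RecordSet-level entry `labels` and the Field-level entry `images/split`.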

To help users understand Croissant dataset descriptions, it would be useful to generate diagrams that represent the contents of a dataset. A Croissant diagram could consist of two layers: 1)...

The Croissant specification is inconsistent in its description of references (foreign keys). It still mentions a Reference class, which we removed prior to the 1.0 release in favor of the...

documentation

Define a mechanism to describe data lineage / provenance information. This mechanism should support multiple levels of granularity:
- Dataset level
- RecordSet level
- Field level
- Row...

1.1

The Croissant specification allows nesting RecordSets inside RecordSets by using a field with dataType="cr:RecordSet" (see https://docs.mlcommons.org/croissant/docs/croissant-spec.html#nested-records). This mechanism has not been used much, is not supported in the mlcroissant library, and...

1.1

Add a mechanism to Croissant to define data-level annotations. Annotations are a general mechanism to attach additional information to other pieces of data. We plan to use annotations for a...

1.1