Add "synthetic" entry to DatasetType
I propose adding a "synthetic" entry to the DatasetType vocabulary to represent synthetic data.
Synthetic data is artificial data generated by algorithms rather than collected from real-world sources. It can supplement or replace real-world data in applications such as machine learning, testing and validation, and privacy and security.
Being able to mark a dataset as synthetic will help prevent it from being treated as a dataset collected from the real world.
- synthetic: Data is artificially generated rather than produced by real-world events.
@rgopikrishnan91 @bennetkl
Currently, DatasetType describes the structure of the data in a dataset, not where the data comes from.
Arguably, "synthetic" is about the source of the data rather than its structure.
So we might consider making this a separate property instead, for example isSynthetic or containsSynthetic?
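
To make the trade-off concrete, here is a rough sketch of the two options in Python. Everything in it is an assumption for illustration only: the class names, the field names, and the three-entry enum (a small subset, not the full vocabulary) are hypothetical and not taken from the spec or any existing library.

```python
from dataclasses import dataclass, field
from enum import Enum


class DatasetType(Enum):
    # Illustrative subset only; not the full DatasetType vocabulary.
    IMAGE = "image"
    TEXT = "text"
    # Option A: the proposed new entry sits next to the structural types.
    SYNTHETIC = "synthetic"


@dataclass
class DatasetOptionA:
    """Option A (hypothetical): provenance is encoded as a DatasetType entry."""
    name: str
    dataset_types: list[DatasetType] = field(default_factory=list)

    def is_synthetic(self) -> bool:
        return DatasetType.SYNTHETIC in self.dataset_types


@dataclass
class DatasetOptionB:
    """Option B (hypothetical): provenance is a separate boolean property."""
    name: str
    dataset_types: list[DatasetType] = field(default_factory=list)
    # Hypothetical property name, mirroring the isSynthetic suggestion.
    is_synthetic: bool = False


# The same synthetic image dataset expressed both ways.
a = DatasetOptionA("faces", [DatasetType.IMAGE, DatasetType.SYNTHETIC])
b = DatasetOptionB("faces", [DatasetType.IMAGE], is_synthetic=True)
```

With option A, a consumer has to filter "synthetic" out of the list to recover the structural types. With option B, the type list stays purely structural and provenance is read from its own field.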