[NEURIPS] Hosted Editor hides files with errors (cannot be deleted)
Using the editor hosted on HuggingFace (https://huggingface.co/spaces/MLCommons/croissant-editor), I first accidentally added a FileObject instead of a FileSet. When I selected a FileSet as that FileObject's parent, the editor hid the file (I'm assuming because an error was raised?). I corrected my mistake by adding two FileSets instead. Now the editor shows 3 resources on the overview tab even though only two resources show up on the resources tab (see images). Furthermore, the overview tab highlights a number of errors about that initial FileObject having unfilled fields (see image), so I cannot export my Croissant metadata (the export button is not clickable).
Overview tab shows 3 resources:
Resource tab shows only 2 resources (which is what I want):
Overview tab shows errors relating to missing resource file:
Hi Francois,
If your data is in an archive, you should first add a FileObject for the archive file, and then a "child" FileSet with containedIn set to the FileObject.
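The layout described above can be sketched as plain JSON-LD built in Python. This is a minimal illustration rather than the editor's own output: the `@id` values, the example URL, the MIME types, and the placeholder hash are all hypothetical, and the property names assume the Croissant 1.0 vocabulary (`cr:FileObject`, `cr:FileSet`, `containedIn`, `includes`).

```python
import json


def build_distribution(archive_url: str, archive_sha256: str) -> list:
    """Return a FileObject for the archive plus a child FileSet inside it."""
    # The archive itself is a single downloadable file, hence a FileObject.
    archive = {
        "@type": "cr:FileObject",
        "@id": "archive",  # hypothetical identifier
        "name": "archive",
        "contentUrl": archive_url,
        "encodingFormat": "application/x-tar",  # assumed MIME type
        "sha256": archive_sha256,
    }
    # The files inside the archive are described by a FileSet whose
    # containedIn points back at the archive FileObject.
    csv_files = {
        "@type": "cr:FileSet",
        "@id": "csv-files",  # hypothetical identifier
        "name": "csv-files",
        "containedIn": {"@id": archive["@id"]},
        "encodingFormat": "text/csv",
        "includes": "*.csv",  # glob pattern selecting files in the archive
    }
    return [archive, csv_files]


# Placeholder URL and hash, for illustration only.
distribution = build_distribution("https://example.org/data.tar.xz", "0" * 64)
print(json.dumps(distribution, indent=2))
```

The point of the sketch is the direction of the reference: the FileSet names the FileObject as its container, not the other way around.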
I would recommend creating a new dataset from scratch in the editor... I'm not sure the one you currently have can be easily fixed.
Hope this helps, Omar
> If your data is in an archive, you should first add a FileObject for the archive file, and then a "child" FileSet with containedIn set to the FileObject.
The problem is that the editor doesn't support uploading an archive file:
> I would recommend creating a new dataset from scratch in the editor... I'm not sure the one you currently have can be easily fixed.
Fair enough. However, having to restart from scratch every time I make even a small mistake makes the editor far from user-friendly. I should be able to freely delete files added by mistake, rather than having the editor hide them from the UI while still complaining about their errors.
Same issue with tar.xz files: when I provide a link and all required fields, the error message persists.
I am hitting a similar issue, except that it occurs even when I upload unarchived .csv files. The editor still shows errors like:
At least one of these properties should be defined: ['md5', 'sha256'].
Property "https://schema.org/contentUrl" is mandatory, but does not exist.
Property "https://schema.org/encodingFormat" is mandatory, but does not exist.
Node "xxx" is a field and has no source. Please, use http://mlcommons.org/croissant/source to specify the source.
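For reference, the first three errors correspond to properties on the FileObject itself. The sketch below is not the editor's validator; it is a hypothetical helper (`missing_file_object_properties`) that mirrors those three checks on a plain dictionary and shows how a SHA-256 checksum could be computed with the standard library. The file name, URL, and CSV contents are made up for illustration.

```python
import hashlib


def sha256_of_bytes(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def missing_file_object_properties(node: dict) -> list:
    """Mirror the three FileObject checks quoted in the error messages."""
    problems = []
    for key in ("contentUrl", "encodingFormat"):
        if not node.get(key):
            problems.append(f"missing {key}")
    # At least one checksum is required: md5 or sha256.
    if not (node.get("md5") or node.get("sha256")):
        problems.append("missing md5 or sha256")
    return problems


# Hypothetical CSV contents and FileObject, for illustration only.
csv_bytes = b"id,label\n1,cat\n2,dog\n"
file_object = {
    "@type": "cr:FileObject",
    "@id": "labels.csv",
    "name": "labels.csv",
    "contentUrl": "https://example.org/labels.csv",
    "encodingFormat": "text/csv",
}

before = missing_file_object_properties(file_object)
file_object["sha256"] = sha256_of_bytes(csv_bytes)
after = missing_file_object_properties(file_object)
print(before)
print(after)
```

If the editor still reports these errors after all three properties are filled in, the complaint is likely coming from a different (possibly hidden) resource, as in the original report.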
The "no source" error is especially confusing, because the Croissant documentation page it points to, https://mlcommons.org/croissant/source, does not exist.
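Since the linked page is missing, here is a rough sketch of what a field with a source might look like, assuming the Croissant 1.0 vocabulary in which a field's `source` references a FileObject (or FileSet) by `@id` and extracts a column. All identifiers and column names below are hypothetical, and the exact shape should be checked against the Croissant specification.

```python
import json


def make_field(record_set_id: str, column: str, data_type: str, file_id: str) -> dict:
    """Build a field whose source is one column of a CSV FileObject."""
    return {
        "@type": "cr:Field",
        "@id": f"{record_set_id}/{column}",
        "name": column,
        "dataType": data_type,
        # The source ties the field to a resource and a column in it.
        "source": {
            "fileObject": {"@id": file_id},
            "extract": {"column": column},
        },
    }


# Hypothetical record set over a FileObject with @id "labels.csv".
record_set = {
    "@type": "cr:RecordSet",
    "@id": "labels",
    "name": "labels",
    "field": [
        make_field("labels", "id", "sc:Integer", "labels.csv"),
        make_field("labels", "label", "sc:Text", "labels.csv"),
    ],
}
print(json.dumps(record_set, indent=2))
```

A field without the `source` entry is the situation the error message describes.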