Implement Dropdown for Dataset Tagging in Metadata

Open Saixel opened this issue 1 year ago • 6 comments

Background

In the context of managing datasets in Dataverse, it is crucial to differentiate and filter datasets that contain data, code, or a combination of both. Currently, there is no clear mechanism for tagging these content types, which hampers effective organization and search within the platform.

Feature Request

Implement a dropdown menu for tagging in the metadata, possibly in the citation field. This new metadata field will allow users to indicate whether the dataset contains data, code, or a combination of both.

Justification

This change will significantly improve the organization and searchability of datasets within Dataverse. By allowing clear differentiation between datasets containing data, code, and combinations of both, users can more easily find the resources they need, enhancing the platform's efficiency and usability.

Implementation Considerations

  • Review the appropriate metadata block to integrate this new tagging field.
  • Implement a dropdown menu to select between "Data," "Code," and "Data + Code" options.
  • Ensure the new field is easily accessible and usable for searching and filtering datasets.
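The three options above amount to a small controlled vocabulary. A minimal sketch of that vocabulary in Python, assuming nothing about how Dataverse stores it: `ContentType`, `content_type_for`, and `parse_content_type` are hypothetical names used only for illustration, not part of Dataverse.

```python
from enum import Enum


class ContentType(Enum):
    """The three dropdown options proposed in this issue."""
    DATA = "Data"
    CODE = "Code"
    DATA_AND_CODE = "Data + Code"


def content_type_for(has_data: bool, has_code: bool) -> ContentType:
    """Pick the dropdown value for a dataset (hypothetical helper)."""
    if has_data and has_code:
        return ContentType.DATA_AND_CODE
    if has_code:
        return ContentType.CODE
    if has_data:
        return ContentType.DATA
    raise ValueError("a dataset must contain data, code, or both")


def parse_content_type(label: str) -> ContentType:
    """Validate a submitted label against the controlled vocabulary.

    Raises ValueError for any label outside the three options.
    """
    return ContentType(label.strip())
```

Restricting the field to a fixed set of values is what would make it usable as a search facet: free-text answers such as "code and data" or "Data/Code" would otherwise split one category into several.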

Additional Context

This request arises from the need to improve dataset management and organization in Dataverse, facilitating the differentiation and searchability of datasets based on their content. This change is particularly relevant for projects handling large volumes of data and code, such as the CAFE project.

Saixel avatar Aug 06 '24 12:08 Saixel

Is this resolved by #10694 ?

qqmyers avatar Sep 03 '24 15:09 qqmyers

Not really. The core dataset types added in #10694 are "Dataset", "Workflow", and "Software", though you can add your own. Also, the type can only be set when you create a dataset via the API.

sekmiller avatar Sep 03 '24 17:09 sekmiller

Fair enough - #10694 is only the first of several PRs, but we should make sure whether the underlying dataset type idea works for, or can work for, this use case and avoid creating a similar mechanism.

qqmyers avatar Sep 03 '24 18:09 qqmyers

We talked about the relationship between this issue and PR #10694 in tech hours today.

For my part, once #10694 becomes available on Harvard Dataverse, the CAFE team is welcome to use it. For now, you have to create datasets via API to set datasetType=software.
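As a rough illustration of what "create datasets via API to set datasetType=software" could look like, here is a sketch that only assembles the request and does not send it. The top-level placement of `datasetType` in the JSON, the endpoint path, and the `X-Dataverse-key` header are assumptions based on my reading of the Native API and should be checked against the API guide for your version. `build_create_request` is a hypothetical helper, and the metadata shown is deliberately incomplete (a real dataset needs more citation fields than a title).

```python
import json


def build_create_request(server_url, collection_alias, api_token, title):
    """Assemble (but do not send) a create-dataset request.

    Assumption: the dataset type is passed as a top-level "datasetType"
    key next to "datasetVersion" in the dataset JSON.
    """
    payload = {
        "datasetType": "software",  # assumed key name and placement
        "datasetVersion": {
            "metadataBlocks": {
                "citation": {
                    # Incomplete on purpose: only the title is shown here.
                    "fields": [
                        {
                            "typeName": "title",
                            "typeClass": "primitive",
                            "multiple": False,
                            "value": title,
                        }
                    ]
                }
            }
        },
    }
    base = server_url.rstrip("/")
    return {
        "method": "POST",
        # Assumed Native API path for creating a dataset in a collection.
        "url": f"{base}/api/dataverses/{collection_alias}/datasets",
        "headers": {
            "X-Dataverse-key": api_token,
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }
```

The returned dictionary can be handed to whatever HTTP client you already use; keeping the request construction separate from sending makes it easy to inspect the JSON before anything reaches the server.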

If only a facet is needed, a quick solution could be to add a dropdown to a custom metadata block to allow the user to choose between the three options explained above: "Data," "Code," and "Data + Code". Perhaps this has been the plan all along. I'm not sure. 😅

pdurbin avatar Sep 03 '24 20:09 pdurbin

As you mentioned @pdurbin, that is just what we did!

image

Saixel avatar Oct 07 '24 17:10 Saixel

Hi @Saixel, are you aware of the "tagging" feature that lets depositors tag the type of file they are depositing? Here is an image of what it looks like in demo and production. The tags are searchable facets, as in the second, very well tagged dataset in this image.

It looks very similar to what is proposed above, and you can name a tag anything you need. By default we have "code", "documentation", and "data" in the production settings. You can add any other file type tags you need.
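For anyone who wants to apply these file tags in bulk rather than through the UI, here is a minimal sketch that prepares the tag list for one file. It assumes the tags are sent as a `categories` list inside a `jsonData` form field to the file metadata endpoint, which is my recollection of the Native API and should be verified. `build_file_tag_form` and `DEFAULT_TAGS` are hypothetical names, and the capitalization of the default tags is also an assumption.

```python
import json

# Default tags mentioned above; exact casing in production is an assumption.
DEFAULT_TAGS = ("Code", "Documentation", "Data")


def build_file_tag_form(tags, allow_custom=True):
    """Return a form-field dict carrying the file tags as JSON.

    Tags are trimmed and de-duplicated in order. With allow_custom=False,
    any tag outside DEFAULT_TAGS is rejected.
    """
    cleaned = list(dict.fromkeys(t.strip() for t in tags))
    if any(not t for t in cleaned):
        raise ValueError("empty tag")
    if not allow_custom:
        unknown = [t for t in cleaned if t not in DEFAULT_TAGS]
        if unknown:
            raise ValueError(f"not a default tag: {unknown}")
    # Assumed payload shape: {"categories": [...]} in a "jsonData" field.
    return {"jsonData": json.dumps({"categories": cleaned})}
```

A file that holds both a script and its sample input could simply carry two tags, for example `build_file_tag_form(["Code", "Data"])`, which covers the "Data + Code" case without a new metadata field.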

Screen Shot 2024-10-08 at 3 45 57 PM

Here is a more extensive example, used a little differently in social science data: Screen Shot 2024-10-08 at 3 45 11 PM

sbarbosadataverse avatar Oct 08 '24 19:10 sbarbosadataverse

After reviewing the existing file tagging feature and confirming with the team, we determined that it meets the objectives of this request. A new metadata field is therefore unnecessary, and we will use the existing functionality. Marking this as resolved.

Saixel avatar Nov 21 '24 17:11 Saixel