Support choosing data dictionaries in dataset node form

Open dafeder opened this issue 1 year ago • 3 comments

Modify json_form_widget to allow selection of Data Dictionaries populated by the metastore.

Testing steps

  1. Build from branch
  2. Modify dataset.ui.schema distribution.describedBy to be:
"describedBy": {
  "ui:options": {
    "description": "URL to the data dictionary for the file found at the Download URL.",
    "widget": "list",
    "type": "select",
    "titleProperty": "title",
    "source": {
      "metastoreSchema": "data-dictionary",
      "returnValue": "url"
    }
  }
},
  3. Create a data dictionary. It can be empty:
POST https://dkan.ddev.site/api/1/metastore/schemas/data-dictionary/items
{
  "identifier": "c4a997e1-e1dc-5139-93b5-0a4e15b012d7",
  "data": {
    "title": "Data dictionary title",
    "fields": []
  }
}
  4. Try selecting the data dictionary on a new dataset form
  5. Check the metastore at https://dkan.ddev.site/api/1/metastore/schemas/dataset/items
  6. The describedBy field should display a working URL for the data dictionary.
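
The steps above can be scripted. The following is a minimal Python sketch, not part of DKAN: the helper names are hypothetical, the base URL is the local DDEV site from the steps, and authentication is left out (a real site will most likely require credentials for the POST). It builds the create request without sending it, and derives the URL the describedBy field is expected to hold, assuming an item URL is simply the collection URL plus the identifier.

```python
import json
import urllib.request

BASE_URL = "https://dkan.ddev.site"  # local DDEV site from the steps above
ITEMS_PATH = "/api/1/metastore/schemas/data-dictionary/items"


def build_create_request(identifier, title, fields=None):
    """Build (but do not send) the POST that creates a data dictionary."""
    payload = {
        "identifier": identifier,
        "data": {"title": title, "fields": fields or []},
    }
    return urllib.request.Request(
        BASE_URL + ITEMS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def expected_described_by(identifier):
    """URL that distribution.describedBy is assumed to hold after selection."""
    return f"{BASE_URL}{ITEMS_PATH}/{identifier}"


request = build_create_request(
    "c4a997e1-e1dc-5139-93b5-0a4e15b012d7", "Data dictionary title"
)
# Sending is deliberately omitted; urllib.request.urlopen(request) would do it
# once whatever authentication the site needs has been added to the headers.
```

Keeping the request construction separate from sending makes it easy to inspect the payload before touching a live site.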

Note: We really need to write a spec for the ui schema, and possibly make some things about this "source" feature a bit more consistent.

Upgrade

This PR changes the data dictionary schema. We did something a little weird with the data dictionary schema: we gave it "identifier" and "data" properties but also put the "title" field at the same level as them. This has caused problems because in several places the metastore expects everything that is not a dataset to have an identifier/data wrapper. (This whole pattern is problematic but outside the scope of this ticket to fix; we have been talking about refactoring this for a while).

If you have a /schema folder in your DKAN project outside the DKAN module directory, you'll need to update your data-dictionary.json file with the new schema. If you have existing data dictionary nodes, an update hook will migrate them to the new schema.
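
To make the schema change concrete, here is a small Python sketch of the transformation described above: a top-level "title" is moved under "data". This is only an illustration of the data shape before and after; it is not the actual update hook (which lives in DKAN's PHP code and is not shown in this thread), and the function name is hypothetical.

```python
import copy


def migrate_data_dictionary(item):
    """Return a copy of `item` with a top-level "title" moved under "data"."""
    migrated = copy.deepcopy(item)
    if "title" in migrated:
        title = migrated.pop("title")
        # Keep a title that already lives under "data"; only fill it in if absent.
        migrated.setdefault("data", {}).setdefault("title", title)
    return migrated


old_item = {
    "identifier": "c4a997e1-e1dc-5139-93b5-0a4e15b012d7",
    "title": "Data dictionary title",
    "data": {"fields": []},
}
new_item = migrate_data_dictionary(old_item)
```

Running the function on an item that already uses the new layout returns it unchanged, so the sketch is safe to apply more than once.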

To test:

  1. Build a site on 2.x branch.
  2. Create a data dictionary with the old schema:
POST https://dkan.ddev.site/api/1/metastore/schemas/data-dictionary/items
{
  "identifier": "c4a997e1-e1dc-5139-93b5-0a4e15b012d7",
  "title": "Data dictionary title",
  "data": {
    "fields": []
  }
}
  3. Check out the choose-dict branch
  4. Run drush updb
  5. Make sure you get no errors, and https://dkan.ddev.site/api/1/metastore/schemas/data-dictionary/items shows your data dictionary now with the title under "data".
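
The final check can also be done programmatically. The sketch below assumes the items endpoint returns a JSON array of objects that each carry "identifier" and "data" keys; that response shape is an assumption, and the function name is hypothetical. It works on already-parsed JSON, so fetching the response is left to the reader.

```python
def misplaced_titles(items):
    """Return identifiers of items whose title is not (only) under "data"."""
    stale = []
    for item in items:
        has_old_title = "title" in item
        has_new_title = "title" in item.get("data", {})
        if has_old_title or not has_new_title:
            stale.append(item.get("identifier"))
    return stale


# Sample data standing in for the parsed endpoint response.
sample_items = [
    {"identifier": "migrated", "data": {"title": "Data dictionary title", "fields": []}},
    {"identifier": "not-migrated", "title": "Old layout", "data": {"fields": []}},
]
stale_ids = misplaced_titles(sample_items)
```

An empty result after drush updb would indicate that every data dictionary has its title under "data".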

dafeder • Nov 21 '23 22:11

@paul-m we don't want to change the default dataset.ui.json because this only makes sense if you're using data dictionaries in "reference" mode; it is an option you would add in your site-specific implementation of the UI schema. This should obviously be reflected in the docs, but I think in general the JSON form widget doesn't have docs, so that seems like a follow-up ticket; for now we can consider this an undocumented/experimental feature.

I'll try to reproduce the errors you're getting.

dafeder • Nov 28 '23 17:11

OK after much conversation I understand now that we're concerning ourselves with the distribution data dictionary and not the one for the dataset.

The distribution describedBy value is not a dkan:// scheme URI, so yay!

[..]
        "distribution": [
            {
                "@type": "dcat:Distribution",
                "format": "csv",
                "downloadURL": "https://demo.getdkan.org/sites/default/files/distribution/cedcd327-4e5d-43f9-8eb1-c11850fa7c55/Bike_Lane.csv",
                "describedBy": "https://dkan-core.ddev.site/api/1/metastore/schemas/data-dictionary/items/c4a997e1-e1dc-5139-93b5-0a4e15b012d7"
            }
        ]
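
As a quick illustration of the distinction above, the following Python sketch tells a dkan:// URI apart from a plain HTTP URL and pulls the dictionary identifier out of a describedBy URL shaped like the one in the example. The function names are hypothetical, the dkan:// value used in the usage lines is a made-up placeholder, and the path prefix is taken from the URLs in this thread; this is not how DKAN itself resolves the reference.

```python
from typing import Optional
from urllib.parse import urlparse

ITEMS_PREFIX = "/api/1/metastore/schemas/data-dictionary/items/"


def is_dkan_uri(value: str) -> bool:
    """True when the value uses the dkan:// scheme rather than http(s)."""
    return urlparse(value).scheme == "dkan"


def dictionary_id_from_url(value: str) -> Optional[str]:
    """Extract the identifier from a data dictionary item URL, else None."""
    path = urlparse(value).path
    if not path.startswith(ITEMS_PREFIX):
        return None
    identifier = path[len(ITEMS_PREFIX):].strip("/")
    # Always return explicitly, including the "nothing found" case.
    return identifier or None


described_by = (
    "https://dkan-core.ddev.site/api/1/metastore/schemas/"
    "data-dictionary/items/c4a997e1-e1dc-5139-93b5-0a4e15b012d7"
)
dictionary_id = dictionary_id_from_url(described_by)
```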

Item 1: Following the instructions, I note that after altering the UI schema, but before adding a data dictionary, there's no way to add a URL instead of selecting one that's in metadata. That is, we can't just add a URL any more, and there are none available to select. This might be by design, but it might also be something for a follow-up.

Item 2: Also, after adding the data dictionary, while adding the dataset, there's no way to select NONE for the data dictionary. This seems a little more severe than the previous one, since we're allowing per-distribution dictionaries.
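
To make Items 1 and 2 concrete, here is a hypothetical Python sketch of how a select list could be populated from metastore items with an explicit empty choice. It is only an illustration of the desired behavior, not how json_form_widget currently works; the function name, the "- None -" label, and the URL construction are all assumptions.

```python
BASE_URL = "https://dkan.ddev.site"  # local DDEV site, as in the testing steps
ITEMS_PATH = "/api/1/metastore/schemas/data-dictionary/items"


def build_select_options(items, title_property="title", allow_none=True):
    """Map option value (item URL) to label, optionally with an empty choice."""
    options = {}
    if allow_none:
        # The empty value lets a distribution have no data dictionary at all.
        options[""] = "- None -"
    for item in items:
        url = f"{BASE_URL}{ITEMS_PATH}/{item['identifier']}"
        # Fall back to the identifier when the title property is missing.
        options[url] = item["data"].get(title_property, item["identifier"])
    return options


options = build_select_options(
    [{"identifier": "c4a997e1-e1dc-5139-93b5-0a4e15b012d7",
      "data": {"title": "Data dictionary title", "fields": []}}]
)
```

With no data dictionaries in the metastore, the list would still contain the empty choice, which also covers the situation in Item 1 where nothing is available to select.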

Item 3: The instructions say to use an empty data dictionary, which I realize is an edge case, but if you change the site's data dictionary settings to 'Distribution reference,' then you end up with a big error message when the DD is applied to the dataset:

% ddev drush cron
 [error]  TypeError: Drupal\metastore\DataDictionary\DataDictionaryDiscovery::getReferenceDictionaryId(): Return value must be of type ?string, none returned in Drupal\metastore\DataDictionary\DataDictionaryDiscovery->getReferenceDictionaryId() (line 104 of /var/www/html/dkan/modules/metastore/src/DataDictionary/DataDictionaryDiscovery.php) #0 /var/www/html/dkan/modules/metastore/src/DataDictionary/DataDictionaryDiscovery.php(74): Drupal\metastore\DataDictionary\DataDictionaryDiscovery->getReferenceDictionaryId('96e43b6b71d7e21...', 1703274569)

Whether these issues fall into the scope of the original ticket is up for discussion...

paul-m • Dec 22 '23 20:12

Maybe we can add a form_alter that alters the dataset form only when the data dictionary setting is Distribution reference. If it's set to sitewide, then the dataset form can say so, so the user knows.

kaise-lafrai • Feb 23 '24 21:02