Dataset `currentVersion` not in Dataset Versions Listing
When retrieving the details of a Dataset, an attribute is included to specify the current version of that Dataset.
curl -s {baseUrl}/api/v1/namespaces/food_delivery/datasets/public.delivery_7_days | jq '.currentVersion'
"54e17e0b-b9da-43f2-8a7a-33d139625a18"
However, if you try to use this version identifier to retrieve a specific Dataset version, no records can be located.
curl -s {baseUrl}/api/v1/namespaces/food_delivery/datasets/public.delivery_7_days/versions/54e17e0b-b9da-43f2-8a7a-33d139625a18
{"code":404,"message":"Dataset version '54e17e0b-b9da-43f2-8a7a-33d139625a18' not found."}
The API call that lists all versions of this Dataset does not include the `currentVersion` value obtained in the first step.
curl -s {baseUrl}/api/v1/namespaces/food_delivery/datasets/public.delivery_7_days/versions/ | jq '.versions | map(.version)'
[
"337f23ac-e1b5-3bac-ade4-5bc5dd5312fb",
"52afbdc4-d397-3208-8a22-e6cb0aaaaddc",
"b1e5c54b-ed0b-37d9-b3fc-258d82d3fd1b",
"246ba3cd-9920-3681-a488-9ae0c46fed4c",
"48eb39b2-07b8-3ae0-a275-c56c4d460334",
"0529933a-6a85-3090-b91f-db9f4fc19a07",
"5623b1b7-02d3-3aed-81b0-ced013aa0d76",
"bdeca942-a66e-3956-aea5-096cdcbe1705",
"00ae6d5f-bdb9-3691-9849-15fdc9079622",
"3e84ed80-ed8a-3a31-aa77-bd027fb72ac3",
"8e46b91c-7cf6-3341-9842-830ed3c39918",
"1181b17f-222c-3f97-9baf-1a6dce5f840a",
"574ce8d4-6f26-34bd-a9ad-c0ea0abeea0a",
"e9e55522-5501-3bc8-823d-72d839091850",
"a30772dc-ed05-3f5e-baa7-66112b2caf96",
"6a1c9760-3fba-3ea9-9379-0b5928af5302",
"59cecc28-f97a-3123-8db1-79ee9f0f20f0",
"2341448d-b186-34b0-abbd-7b78018096e8",
"44372ad7-3216-3541-8422-9ac303ea89e6",
"4b152261-1f35-3ae8-8ae0-e67c4f294abf",
"de2fb8f9-7358-3a20-8656-f3e88ccca78f",
"df242c88-7bf9-3b17-bfb3-f6fbc8aa6c59",
"1947fe61-45e4-3866-90d0-1ca09d4b5339",
"2f07307b-1b72-36d1-88ef-34764e9852e2",
"7f4da3d1-1e63-3c35-b241-ca8dcce33a9c"
]
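The three steps above can be folded into a single check. The following is a minimal sketch, assuming only the response shapes implied by the `jq` filters above (a `currentVersion` attribute on the Dataset, and a `versions` array whose entries carry a `version` field). The helper name `current_version_is_listed` is hypothetical and not part of Marquez.

```python
from typing import Any, Mapping


def current_version_is_listed(
    dataset: Mapping[str, Any], versions_listing: Mapping[str, Any]
) -> bool:
    """Return True if the Dataset's currentVersion appears in its versions listing."""
    current = dataset.get("currentVersion")
    # Collect the `version` field of every entry, as `jq 'map(.version)'` does above.
    listed = {entry.get("version") for entry in versions_listing.get("versions", [])}
    return current is not None and current in listed


# Values copied from the responses above (listing shortened to two entries).
dataset = {"currentVersion": "54e17e0b-b9da-43f2-8a7a-33d139625a18"}
versions_listing = {
    "versions": [
        {"version": "337f23ac-e1b5-3bac-ade4-5bc5dd5312fb"},
        {"version": "52afbdc4-d397-3208-8a22-e6cb0aaaaddc"},
    ]
}

print(current_version_is_listed(dataset, versions_listing))
```

With the data from this report the check prints `False`, which is the mismatch described above. Once the bug is fixed, the same check against live responses would be expected to return `True`.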
This can be reproduced by running a fresh Marquez install and attempting to look up any of the seed data. On a related note, the recently added Gitpod functionality made reproducing this issue extremely easy!
I'm able to reproduce. I'd be happy to take this one.
Related: https://github.com/MarquezProject/marquez/issues/1977
There are a couple things going on that are causing some confusion:
1. Currently the API for `GET`ting a `DatasetVersion` gets version data from `DatasetVersionDao`. The queries in `DatasetVersionDao` use the field `version`. As discussed in the commentary on #1977, this field is supposed to be an internal identifier to allow any logic around the DatasetVersion UUID to change as needed without affecting the client. This is the source of the bug itself.
2. Due to the confusion with `version`, the nomenclature `currentVersion` is misleading. The API for `GET`ting a `Dataset` correctly gets the intended-to-be-publicly-facing UUID, but `currentVersion` remains confusing, and the bug from (1) initially makes this API appear to be the culprit.
Closing #1977 according to my comment there and updating `Dataset.java` to use a clearer name (e.g. `currentVersionUuid`) would resolve this issue.
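One detail in the output quoted in the issue is consistent with there being two distinct identifiers: the `currentVersion` value parses as a version 4 (random) UUID, whereas the entries in the versions listing parse as version 3 (name-based) UUIDs. The sketch below uses Python's standard `uuid` module to show this for a few of the quoted values. It is an observation about the sample data only, not a statement about how Marquez generates either identifier, and the helper name `uuid_version` is hypothetical.

```python
import uuid
from typing import Optional


def uuid_version(value: str) -> Optional[int]:
    """Return the RFC 4122 version number encoded in a UUID string."""
    return uuid.UUID(value).version


# Value returned as `currentVersion` by the Dataset endpoint in the issue.
current_version = "54e17e0b-b9da-43f2-8a7a-33d139625a18"

# A few of the values returned by the versions listing in the issue.
listed_versions = [
    "337f23ac-e1b5-3bac-ade4-5bc5dd5312fb",
    "52afbdc4-d397-3208-8a22-e6cb0aaaaddc",
    "7f4da3d1-1e63-3c35-b241-ca8dcce33a9c",
]

print(uuid_version(current_version))                 # 4
print([uuid_version(v) for v in listed_versions])    # [3, 3, 3]
```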
If this makes sense, I'll try and close this by the end of the week.