UI exception after adding a data contract
After adding the data contract below via: datahub datacontract upsert -f mq.yaml, when navigating to the dataset in the UI I get: No enum constant com.linkedin.datahub.graphql.generated.AssertionType.FRESHNESS (code 400)
mq.yaml
entity: urn:li:dataset:(urn:li:dataPlatform:postgres,ml_db.public.mq,PROD)
version: 1
data_quality:
- type: unique
I would have expected the data contract to show up in the UI under the validation tab. At the moment, the dataset is not accessible anymore because of said error.
Desktop (please complete the following information):
- OS: macOS Sonoma (DataHub deployed via the Docker quickstart)
- Browser: chrome
- Version: v0.13.1
When deleting the datacontract via CLI, the issue seems to be persisting and now I have a broken entity I can't access anymore.
The same happens when you specify a schema in the contract. This time the exception is: No enum constant com.linkedin.datahub.graphql.generated.AssertionType.DATA_SCHEMA (code 400). Even after deleting the data contract, the issue persists. This makes these datasets unusable. Even deleting records from the metadata table in MariaDB has no effect.
Fetching the entity via Rest.li doesn't return any information about a connected data contract:
curl --header 'X-RestLi-Protocol-Version: 2.0.0' 'http://localhost:8080/entitiesV2/urn%3Ali%3Adataset%3A(urn%3Ali%3AdataPlatform%3Apostgres%2Cml_db.public.mq%2CPROD)' --header 'Authorization: Bearer xxx'
{"urn":"urn:li:dataset:(urn:li:dataPlatform:postgres,ml_db.public.mq,PROD)","aspects":{"container":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606005904},"name":"container","type":"VERSIONED","systemMetadata":{"lastRunId":"postgres-2024_05_13-16_05_49","lastObserved":1715609175530,"runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"container":"urn:li:container:3259e02b11cc42389f51a53ea87bf335"}},"browsePathsV2":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006906},"name":"browsePathsV2","type":"VERSIONED","systemMetadata":{"lastRunId":"postgres-2024_05_13-16_05_49","lastObserved":1715609175533,"runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"path":[{"urn":"urn:li:container:de966145fb7cd0778c1e3d8c019b6c8c","id":"urn:li:container:de966145fb7cd0778c1e3d8c019b6c8c"},{"urn":"urn:li:container:3259e02b11cc42389f51a53ea87bf335","id":"urn:li:container:3259e02b11cc42389f51a53ea87bf335"}]}},"datasetKey":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606005904},"name":"datasetKey","type":"VERSIONED","systemMetadata":{"lastObserved":1715606005851,"lastRunId":"postgres-2024_05_13-15_12_52","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"name":"ml_db.public.mq","platform":"urn:li:dataPlatform:postgres","origin":"PROD"}},"dataPlatformInstance":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606005904},"name":"dataPlatformInstance","type":"VERSIONED","systemMetadata":{"lastObserved":1715606005851,"lastRunId":"postgres-2024_05_13-15_12_52","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"platform":"urn:li:dataPlatform:postgres"}},"subTypes":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006685},"name":"subTypes","type":"VERSIONED","systemMetadata":{"lastRunId":"postgres-2024_05_13-16_05_49","lastObserved":1715609175533,"runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"typeNames":["Table"]}},"datasetProperties":{"created":{"actor":"urn:li:corpuser:datahub","time":1
715606006300},"name":"datasetProperties","type":"VERSIONED","systemMetadata":{"lastObserved":1715609175531,"lastRunId":"postgres-2024_05_13-16_05_49","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"name":"mq","customProperties":{},"tags":[]}},"schemaMetadata":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006300},"name":"schemaMetadata","type":"VERSIONED","systemMetadata":{"lastObserved":1715609175531,"lastRunId":"postgres-2024_05_13-16_05_49","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"created":{"actor":"urn:li:corpuser:unknown","time":0},"platformSchema":{"com.linkedin.schema.MySqlDDL":{"tableSchema":""}},"lastModified":{"actor":"urn:li:corpuser:unknown","time":0},"schemaName":"ml_db.public.mq","fields":[{"fieldPath":"documentatlasid","isPartOfKey":true,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"title","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"article_text","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"publication_date","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.DateType":{}}},"nativeDataType":"DATE()","recursive":false},{"fieldPath":"creation_date","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.DateType":{}}},"nativeDataType":"DATE()","recursive":false},{"fieldPath":"uri","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"language_code","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"author","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.StringTy
pe":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"score","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"DOUBLE_PRECISION(precision=53)","recursive":false},{"fieldPath":"linked_issues","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_companies","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_projects","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_ngos","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_campaigns","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_governmentals","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_sources","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_locations","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_regions","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_divisions","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"
linked_cities","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"new_ents","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"issues_number","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"companies_number","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"platform_source_reach","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"linked_tags","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"predicted_severity","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"DOUBLE_PRECISION(precision=53)","recursive":false}],"version":0,"hash":"","platform":"urn:li:dataPlatform:postgres"}},"status":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006300},"name":"status","type":"VERSIONED","systemMetadata":{"lastObserved":1715609175531,"lastRunId":"postgres-2024_05_13-16_05_49","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"removed":false}}},"entityName":"dataset"}%
This makes me think that something is cached during the GraphQL call from the UI:
2024-05-13 14:57:35,348 [ForkJoinPool.commonPool-worker-74] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:31 - Failed to execute
java.lang.IllegalArgumentException: No enum constant com.linkedin.datahub.graphql.generated.AssertionType.DATA_SCHEMA
at java.base/java.lang.Enum.valueOf(Enum.java:273)
at com.linkedin.datahub.graphql.generated.AssertionType.valueOf(AssertionType.java:6)
at com.linkedin.datahub.graphql.types.assertion.AssertionMapper.mapAssertionInfo(AssertionMapper.java:79)
at com.linkedin.datahub.graphql.types.assertion.AssertionMapper.map(AssertionMapper.java:45)
at com.linkedin.datahub.graphql.resolvers.assertion.EntityAssertionsResolver.lambda$get$0(EntityAssertionsResolver.java:91)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
at com.linkedin.datahub.graphql.resolvers.assertion.EntityAssertionsResolver.lambda$get$2(EntityAssertionsResolver.java:93)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
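For anyone reading the trace above: the top frames show Enum.valueOf being called with a name (DATA_SCHEMA) that the generated GraphQL enum does not define. The following is only a minimal, self-contained Java sketch of that failure mode, not DataHub code and not necessarily what the eventual fix does. StoredAssertionType, UiAssertionType, and EnumMappingSketch are hypothetical names invented for illustration, and the constants listed in each enum are assumptions.

```java
import java.util.Optional;

// Hypothetical stand-in for the assertion type as stored with the metadata.
enum StoredAssertionType { DATASET, FRESHNESS, DATA_SCHEMA }

// Hypothetical stand-in for a generated UI-facing enum that is missing
// the newer constants (assumption made for this illustration only).
enum UiAssertionType { DATASET }

class EnumMappingSketch {
    // Strict mapping by name: throws IllegalArgumentException when the
    // target enum has no constant with the same name.
    static UiAssertionType mapStrict(StoredAssertionType stored) {
        return UiAssertionType.valueOf(stored.name());
    }

    // Lenient mapping: returns an empty Optional for unknown names so one
    // unmapped value does not fail the whole request.
    static Optional<UiAssertionType> mapLenient(StoredAssertionType stored) {
        try {
            return Optional.of(UiAssertionType.valueOf(stored.name()));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Lenient mapping never throws, whatever the stored value is.
        for (StoredAssertionType t : StoredAssertionType.values()) {
            System.out.println(t + " -> " + mapLenient(t));
        }
        // Strict mapping reproduces the kind of exception seen in the log.
        try {
            mapStrict(StoredAssertionType.FRESHNESS);
        } catch (IllegalArgumentException e) {
            System.out.println("strict mapping failed: " + e.getMessage());
        }
    }
}
```

If the real mapper behaves like mapStrict here, a single assertion with an unmapped type would be enough to fail the assertions query for the dataset page, which would match the code 400 seen in the UI.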
Same issue here. Is anybody looking into this?
The issue persists even after deleting the document related to the datacontract in elasticsearch.
Same issue here. It made the asset that I attached the data contract to unusable, even after removing the contract.
Hey everyone, I've opened a PR with a possible fix here: https://github.com/datahub-project/datahub/pull/10534. I will attempt to repro soon, but if you're able to check out the branch and verify that it fixes the issue for you, let me know!
Thanks @jayacryl for addressing this!