UI exception after adding a data contract

Open darthale opened this issue 9 months ago • 6 comments

After adding the data contract below via datahub datacontract upsert -f mq.yaml, navigating to the dataset in the UI fails with: No enum constant com.linkedin.datahub.graphql.generated.AssertionType.FRESHNESS (code 400)

mq.yaml

entity: urn:li:dataset:(urn:li:dataPlatform:postgres,ml_db.public.mq,PROD)
version: 1
data_quality:
  - type: unique

I expected the data contract to show up in the UI under the Validation tab. Instead, the dataset page is no longer accessible because of this error.

Desktop (please complete the following information):

  • OS: macOS Sonoma (DataHub deployed via the Docker quickstart)
  • Browser: Chrome
  • Version: v0.13.1

darthale avatar May 08 '24 10:05 darthale

After deleting the data contract via the CLI, the issue persists, and I now have a broken entity that I can't access anymore.

darthale avatar May 08 '24 11:05 darthale

The same happens when you specify a schema in the contract. This time the exception is: No enum constant com.linkedin.datahub.graphql.generated.AssertionType.DATA_SCHEMA (code 400). Even after deleting the data contract, the issue persists, which makes these datasets unusable. Even deleting the records in the metadata table in MariaDB has no effect.

Fetching the entity via Rest.li doesn't return any information about a connected data contract:

curl --header 'X-RestLi-Protocol-Version: 2.0.0' 'http://localhost:8080/entitiesV2/urn%3Ali%3Adataset%3A(urn%3Ali%3AdataPlatform%3Apostgres%2Cml_db.public.mq%2CPROD)' --header 'Authorization: Bearer xxx'

{"urn":"urn:li:dataset:(urn:li:dataPlatform:postgres,ml_db.public.mq,PROD)","aspects":{"container":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606005904},"name":"container","type":"VERSIONED","systemMetadata":{"lastRunId":"postgres-2024_05_13-16_05_49","lastObserved":1715609175530,"runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"container":"urn:li:container:3259e02b11cc42389f51a53ea87bf335"}},"browsePathsV2":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006906},"name":"browsePathsV2","type":"VERSIONED","systemMetadata":{"lastRunId":"postgres-2024_05_13-16_05_49","lastObserved":1715609175533,"runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"path":[{"urn":"urn:li:container:de966145fb7cd0778c1e3d8c019b6c8c","id":"urn:li:container:de966145fb7cd0778c1e3d8c019b6c8c"},{"urn":"urn:li:container:3259e02b11cc42389f51a53ea87bf335","id":"urn:li:container:3259e02b11cc42389f51a53ea87bf335"}]}},"datasetKey":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606005904},"name":"datasetKey","type":"VERSIONED","systemMetadata":{"lastObserved":1715606005851,"lastRunId":"postgres-2024_05_13-15_12_52","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"name":"ml_db.public.mq","platform":"urn:li:dataPlatform:postgres","origin":"PROD"}},"dataPlatformInstance":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606005904},"name":"dataPlatformInstance","type":"VERSIONED","systemMetadata":{"lastObserved":1715606005851,"lastRunId":"postgres-2024_05_13-15_12_52","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"platform":"urn:li:dataPlatform:postgres"}},"subTypes":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006685},"name":"subTypes","type":"VERSIONED","systemMetadata":{"lastRunId":"postgres-2024_05_13-16_05_49","lastObserved":1715609175533,"runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"typeNames":["Table"]}},"datasetProperties":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006300},"name":"datasetProperties","type":"VERSIONED","systemMetadata":{"lastObserved":1715609175531,"lastRunId":"postgres-2024_05_13-16_05_49","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"name":"mq","customProperties":{},"tags":[]}},"schemaMetadata":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006300},"name":"schemaMetadata","type":"VERSIONED","systemMetadata":{"lastObserved":1715609175531,"lastRunId":"postgres-2024_05_13-16_05_49","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"created":{"actor":"urn:li:corpuser:unknown","time":0},"platformSchema":{"com.linkedin.schema.MySqlDDL":{"tableSchema":""}},"lastModified":{"actor":"urn:li:corpuser:unknown","time":0},"schemaName":"ml_db.public.mq","fields":[{"fieldPath":"documentatlasid","isPartOfKey":true,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"title","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"article_text","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"publication_date","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.DateType":{}}},"nativeDataType":"DATE()","recursive":false},{"fieldPath":"creation_date","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.DateType":{}}},"nativeDataType":"DATE()","recursive":false},{"fieldPath":"uri","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"language_code","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"author","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.StringType":{}}},"nativeDataType":"TEXT()","recursive":false},{"fieldPath":"score","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"DOUBLE_PRECISION(precision=53)","recursive":false},{"fieldPath":"linked_issues","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_companies","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_projects","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_ngos","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_campaigns","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_governmentals","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_sources","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_locations","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_regions","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_divisions","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"linked_cities","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"new_ents","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"issues_number","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"companies_number","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"platform_source_reach","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"BIGINT()","recursive":false},{"fieldPath":"linked_tags","isPartOfKey":false,"nullable":false,"type":{"type":{"com.linkedin.schema.ArrayType":{}}},"nativeDataType":"ARRAY(BIGINT())","recursive":false},{"fieldPath":"predicted_severity","isPartOfKey":false,"nullable":true,"type":{"type":{"com.linkedin.schema.NumberType":{}}},"nativeDataType":"DOUBLE_PRECISION(precision=53)","recursive":false}],"version":0,"hash":"","platform":"urn:li:dataPlatform:postgres"}},"status":{"created":{"actor":"urn:li:corpuser:datahub","time":1715606006300},"name":"status","type":"VERSIONED","systemMetadata":{"lastObserved":1715609175531,"lastRunId":"postgres-2024_05_13-16_05_49","runId":"postgres-2024_05_13-15_12_52"},"version":0,"value":{"removed":false}}},"entityName":"dataset"}

This makes me think that some element is cached during the GraphQL call from the UI:

2024-05-13 14:57:35,348 [ForkJoinPool.commonPool-worker-74] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:31 - Failed to execute
java.lang.IllegalArgumentException: No enum constant com.linkedin.datahub.graphql.generated.AssertionType.DATA_SCHEMA
	at java.base/java.lang.Enum.valueOf(Enum.java:273)
	at com.linkedin.datahub.graphql.generated.AssertionType.valueOf(AssertionType.java:6)
	at com.linkedin.datahub.graphql.types.assertion.AssertionMapper.mapAssertionInfo(AssertionMapper.java:79)
	at com.linkedin.datahub.graphql.types.assertion.AssertionMapper.map(AssertionMapper.java:45)
	at com.linkedin.datahub.graphql.resolvers.assertion.EntityAssertionsResolver.lambda$get$0(EntityAssertionsResolver.java:91)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
	at com.linkedin.datahub.graphql.resolvers.assertion.EntityAssertionsResolver.lambda$get$2(EntityAssertionsResolver.java:93)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
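
The trace above bottoms out in Enum.valueOf, which throws IllegalArgumentException whenever it is given a name that is not a declared constant of the enum. Below is a minimal Java sketch of that failure mode, next to a tolerant lookup that returns an empty result instead of failing the whole request. UiAssertionType, strict, and tolerant are hypothetical names made up for illustration; this is not DataHub's code and not necessarily how the problem is or should be fixed there.

```java
import java.util.Optional;

public class EnumLookupSketch {
    // Hypothetical stand-in for a generated enum that knows fewer values
    // than the stored metadata can contain. NOT the real DataHub AssertionType.
    enum UiAssertionType { DATASET, CUSTOM }

    // Strict lookup: Enum.valueOf throws IllegalArgumentException
    // when storedName is not one of the declared constants.
    static UiAssertionType strict(String storedName) {
        return UiAssertionType.valueOf(storedName);
    }

    // Tolerant lookup: unknown names become Optional.empty(), so a caller
    // can skip or specially render the entry instead of aborting.
    static Optional<UiAssertionType> tolerant(String storedName) {
        try {
            return Optional.of(UiAssertionType.valueOf(storedName));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tolerant("DATASET"));   // known constant
        System.out.println(tolerant("FRESHNESS")); // unknown constant, no exception
        try {
            strict("FRESHNESS");
        } catch (IllegalArgumentException e) {
            // Same kind of "No enum constant ..." message as in the trace
            System.out.println(e.getMessage());
        }
    }
}
```

The strict variant matches what the trace shows: one unmappable value makes the whole mapping call fail, which would explain why the entire dataset page returns a 400 rather than just hiding one assertion.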

darthale avatar May 13 '24 14:05 darthale

Same issue here. Is anybody looking into this?

adriano-sportsbet avatar May 13 '24 23:05 adriano-sportsbet

The issue persists even after deleting the document related to the data contract in Elasticsearch.

darthale avatar May 14 '24 15:05 darthale

Same issue here. It made the asset that I attached the data contract to unusable, even after removing the contract.

adriano-sportsbet avatar May 15 '24 01:05 adriano-sportsbet

Hey everyone, I've opened a PR with a possible fix here: https://github.com/datahub-project/datahub/pull/10534. I will attempt to repro soon, but if you're able to check out the branch and verify that this fixes it for you, let me know!

jayacryl avatar May 17 '24 21:05 jayacryl

Thanks @jayacryl for addressing this!

darthale avatar May 23 '24 07:05 darthale