
Ingestion causes Custom Property values to disappear

Open cdoron opened this issue 2 years ago • 5 comments

Description: We use the OpenMetadata API to create a table pointing to an S3 CSV object. This Table entry is not created through discovery, but through manual creation (we manually create the databaseService, database, databaseSchema, and then the table). After creation, we add custom properties to the table.

Only later do we run an ingestion pipeline that discovers the table.

Once the ingestion process is complete, the custom properties are erased. This is in contrast to both the table and the column tags, which remain intact.

Expected behavior: We expect the custom properties to remain intact after the ingestion pipeline run completes. That is the case for table and column tags.

Thanks!

Version:

  • OS: Linux Ubuntu
  • OpenMetadata version: 0.11.4

cdoron avatar Aug 22 '22 11:08 cdoron

cc @sureshms

harshach avatar Aug 22 '22 15:08 harshach

Before ingestion, the latest table version is 0.2. When I run:

curl -X GET 'http://localhost:8585/api/v1/tables/3dcac90d-608c-4214-9671-ec23889652b5/versions/0.2'

I get the following extensions:

  "extension": {
    "connectionType": "s3",
    "credentials": "/v1/kubernetes-secrets/paysim-csv?namespace=fybrik-notebook-sample",
    "dataFormat": "csv",
    "geography": "theshire ",
    "name": "Synthetic Financial Datasets For Fraud Detection"
  }

After ingestion, the latest version becomes 1.2. When I run:

curl -X GET 'http://localhost:8585/api/v1/tables/3dcac90d-608c-4214-9671-ec23889652b5/versions/1.2'

the response contains no extension field at all.

cdoron avatar Sep 09 '22 14:09 cdoron

Fixed in 0.12. Thanks for filing @cdoron

harshach avatar Sep 11 '22 22:09 harshach

Hi @harshach, sorry for reopening it, but it still happens when security is not enabled. In the code, we have:

    private void updateExtension() throws JsonProcessingException {
      if (updatedByBot()) {
        // Revert changes to extension field, if being updated by a bot
        updated.setExtension(original.getExtension());
        return;
      }

      removeExtension(original);
      storeExtension(updated);
    }
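To illustrate why this guard may not help when security is disabled, here is a minimal, self-contained sketch. It is not the OpenMetadata code: `ExtensionUpdateSketch` and `resolveExtension` are hypothetical names, and the sketch assumes two things that are not confirmed here: that the ingestion request carries no extension, and that without security the request is not recognized as coming from a bot.

```java
import java.util.Map;

public class ExtensionUpdateSketch {

    // Hypothetical model of the guard above: returns the extension that ends up stored.
    // "original" is the stored extension, "incoming" is what the update request carries.
    static Map<String, Object> resolveExtension(
            Map<String, Object> original, Map<String, Object> incoming, boolean updatedByBot) {
        if (updatedByBot) {
            // Bot update: changes to the extension are reverted, so the original survives.
            return original;
        }
        // Non-bot update: the stored extension is replaced by whatever the request carries.
        return incoming;
    }

    public static void main(String[] args) {
        Map<String, Object> original = Map.of("dataFormat", "csv");

        // Assumption: the ingestion payload has no extension (modelled as null).
        // Recognized as a bot: the custom property is kept.
        System.out.println(resolveExtension(original, null, true));

        // Not recognized as a bot (assumed to be the case when security is disabled):
        // the custom property is lost.
        System.out.println(resolveExtension(original, null, false));
    }
}
```

If those assumptions hold, the second call models what we observe: the update is treated as a regular user update, and the missing extension overwrites the stored one.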

Apart from that, I noticed that we cannot create a Table entity with a custom property that has not been defined; however, we can add such a property when updating the entity a second time. These properties are not shown in the UI unless they have been defined beforehand.

image

After creating testproperty in http://localhost:8585/settings/customAttributes/tables:

image

nahuelverdugo avatar Sep 12 '22 11:09 nahuelverdugo

I now realize that the fact that we created the table through the API, and not through the UI, is not relevant. The problem appears in both cases. Verified on version 0.12.0.

Recreate through the UI:

  1. Create a databaseService.
  2. Create, deploy and run an ingestion pipeline.
  3. Once the table is discovered, add a custom property.
  4. Run the ingestion pipeline once again.
  5. Once the run is over, the custom property disappears.

cdoron avatar Sep 15 '22 05:09 cdoron

@harshach should we close this one?

nahuelverdugo avatar Oct 18 '22 12:10 nahuelverdugo