Badges having same key but being applied to different node types (table vs column) overriding each other when publishing

Open mikaalanwar opened this issue 3 years ago • 5 comments

Current Behaviour

If two badges have the same "key" but a different "category", they override each other at publishing time, and whichever one is published last wins. For example, I have a table badge with key "PII" and a column badge with key "PII". If I do something like the following, the second call updates the category of the same badge instead of recognising that these are two different badges (meant for different entities).

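from databuilder.models.badge import Badge, BadgeMetadata
from databuilder.models.table_metadata import ColumnMetadata, TableMetadata

# Table-level badge: name 'PII', category 'compliance'
BadgeMetadata(
    start_label=TableMetadata.TABLE_NODE_LABEL,
    start_key=TableMetadata.TABLE_KEY_FORMAT.format(
        db=table.ref.database,
        cluster=table.ref.cluster,
        schema=table.ref.schema,
        tbl=table.ref.name
    ),
    badges=[Badge(
        category='compliance',
        name='PII'
    )]
)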

. . .

# Column-level badge: same name 'PII', but category 'column'
BadgeMetadata(
    start_label=ColumnMetadata.COLUMN_NODE_LABEL,
    start_key=ColumnMetadata.COLUMN_KEY_FORMAT.format(
        db=column.table_ref.database,
        cluster=column.table_ref.cluster,
        schema=column.table_ref.schema,
        tbl=column.table_ref.name,
        col=column.name
    ),
    badges=[Badge(
        category='column',
        name='PII'
    )]
)

Expected Behaviour

Ideally, the uniqueness check for badges should also consider the category, or the type of node / entity (column vs table) the badge is applied to. Otherwise, column badges can override table badges and vice versa.
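To make the expectation concrete, here is a minimal in-memory sketch of de-duplication on a composite (category, name) key. It is only an illustration of the idea, not databuilder's or the Neo4j publisher's code; the `dedupe_badges` function and the dictionary store are hypothetical.

```python
from typing import Dict, List, Tuple

BadgeKey = Tuple[str, str]  # (category, name)


def dedupe_badges(badges: List[BadgeKey]) -> Dict[BadgeKey, Dict[str, str]]:
    """Hypothetical upsert keyed by (category, name) instead of name alone."""
    store: Dict[BadgeKey, Dict[str, str]] = {}
    for category, name in badges:
        # Badges collapse into one entry only if both category and name match.
        store[(category, name)] = {"name": name, "category": category}
    return store


result = dedupe_badges([
    ("compliance", "PII"),  # table badge
    ("column", "PII"),      # column badge, same name
    ("column", "PII"),      # true duplicate of the column badge
])

print(len(result))  # 2
```

With this key, the table and column "PII" badges coexist regardless of publishing order, and only genuine duplicates are merged.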

Possible Solution

The workaround I am currently considering is to give the table badge and the column badge different, unique names, but that doesn't seem ideal.
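A rough sketch of that workaround, assuming the entity type is simply prefixed to the badge name. The `scoped_badge_name` helper and the prefix convention are hypothetical and not part of Amundsen.

```python
def scoped_badge_name(entity: str, name: str) -> str:
    """Hypothetical helper: prefix the badge name with the entity type
    so that table and column badges never share a name."""
    return f"{entity}_{name}"


table_badge_name = scoped_badge_name("table", "PII")
column_badge_name = scoped_badge_name("column", "PII")

print(table_badge_name)   # table_PII
print(column_badge_name)  # column_PII
```

The downside is that every consumer of the badge (search filters, UI labels) now sees the prefixed name unless it is mapped back to a shared label.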

Steps to Reproduce

  1. Try creating badges using the pseudo-code above.
  2. The order of creation determines which category gets applied.
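The order dependence can be modelled without Amundsen at all. The sketch below is a simplified in-memory stand-in for an upsert keyed by badge name only, which is an assumption based on the behaviour described above; it is not the actual publisher code, and `publish` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def publish(badges: List[Tuple[str, str]]) -> Dict[str, str]:
    """Upsert (name, category) pairs into a store keyed by name only."""
    store: Dict[str, str] = {}
    for name, category in badges:
        # Same key: the later write replaces the earlier category.
        store[name] = category
    return store


table_badge = ("PII", "compliance")
column_badge = ("PII", "column")

table_first = publish([table_badge, column_badge])
column_first = publish([column_badge, table_badge])

print(table_first)   # {'PII': 'column'}
print(column_first)  # {'PII': 'compliance'}
```

Only one "PII" entry survives in either case, and its category depends solely on which badge was published last.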

Screenshots (if appropriate)

image

Context

Yes, we have a scenario where a table badge and a column badge have the same name. A change in the order of badge creation (during some refactoring) caused the relevant "table" badge to disappear from the home page, because the category of that badge was changed from "compliance" to "column" (and we don't display column badges on our home screen).

Your Environment

  • Amundsen version used: b86a06a15f5acccecc6d4c1dc512603a5daaa1cf
  • Data warehouse stores: neo4j
  • Deployment (k8s or native): native
  • Link to your fork or repository: https://github.com/deliveryhero/amundsen

mikaalanwar • Apr 04 '23 09:04

This was the intended behavior of badges, similar to how two tables with exactly the same name would be considered the same table. You can try naming the table and column badges slightly differently in the database; for display purposes, you can configure the badges to show any string, even if it's the same one.

allisonsuarez • Apr 05 '23 21:04

@allisonsuarez In the badge case they're not the "same", as a column badge and a table badge are categorically different. Our initial thought was that the "category" should be included in the de-dup and should be enough to differentiate the badges. The same reasoning applies to tables that have the same name but different storage layers (e.g. Redshift or BigQuery).

MrwanBaghdad • Apr 06 '23 15:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] • May 03 '23 23:05

This issue has been automatically closed for inactivity. If you still wish to make these changes, please open a new pull request or reopen this one.

stale[bot] • Jun 10 '23 00:06

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] • Aug 07 '23 05:08