Neo4j getLineage with multi-hop support
Describe the bug
DataHub is not fully integrated with Neo4j because it does not support multi-hop queries.
We need an implementation of getLineage for Neo4jGraphService, and then we can also set supportsMultiHop() to return true.
To Reproduce
- Set up an instance of DataHub with Neo4j as the Graph Index implementation
- Ingest some Datasets with lineage to each other.
- Open the DataHub UI and navigate to a dataset with some lineage.
- Go to the Impact Analysis tab (when Neo4j is the graph backend, this tab is disabled).
Expected behavior
The Lineage tab and the "Impact Analysis" tab should both be functional once the Neo4j multi-hop integration is complete.
The following code needs to be implemented or amended in https://github.com/datahub-project/datahub/blob/master/metadata-io/src/main/java/com/linkedin/metadata/graph/neo4j/Neo4jGraphService.java:
- The getLineage method needs to be implemented for multiple hops in Neo4jGraphService: https://github.com/datahub-project/datahub/blob/2dfc166bbde4f5ae6de04482fd1fcca9dcc3cabc/metadata-io/src/main/java/com/linkedin/metadata/graph/GraphService.java#L94
- The supportsMultiHop method should return true for Neo4j: https://github.com/datahub-project/datahub/blob/2dfc166bbde4f5ae6de04482fd1fcca9dcc3cabc/metadata-io/src/main/java/com/linkedin/metadata/graph/GraphService.java#L188
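As a rough illustration of what a multi-hop getLineage could run against Neo4j, the sketch below builds a Cypher query with a variable-length relationship pattern. It is not the DataHub implementation: the class name, the `Direction` enum, the `buildQuery` helper, the `urn` node property, and the `DownstreamOf` relationship type used in the usage note are all assumptions made for illustration. Cypher does not allow relationship types or hop bounds to be passed as query parameters, so the sketch validates them and inlines them, while the start URN stays a `$urn` parameter.

```java
import java.util.List;

/** Hypothetical helper; names and graph schema are assumptions, not DataHub code. */
final class MultiHopLineageQuerySketch {

    /** Hypothetical direction flag; not the DataHub relationship direction type. */
    enum Direction { OUTGOING, INCOMING }

    /**
     * Builds a Cypher query that returns every node reachable from the node
     * whose (assumed) `urn` property equals $urn within 1..maxHops hops,
     * together with the shortest hop count at which it was reached.
     */
    static String buildQuery(List<String> relationshipTypes, Direction direction, int maxHops) {
        if (maxHops < 1) {
            throw new IllegalArgumentException("maxHops must be >= 1");
        }
        if (relationshipTypes.isEmpty()) {
            throw new IllegalArgumentException("at least one relationship type is required");
        }
        // Types are inlined into the query text, so restrict them to plain identifiers.
        for (String type : relationshipTypes) {
            if (!type.matches("[A-Za-z_][A-Za-z0-9_]*")) {
                throw new IllegalArgumentException("invalid relationship type: " + type);
            }
        }

        // Variable-length pattern, e.g. [:DownstreamOf*1..3]
        String rel = "[:" + String.join("|", relationshipTypes) + "*1.." + maxHops + "]";
        String pattern = direction == Direction.OUTGOING
                ? "(src {urn: $urn})-" + rel + "->(dst)"
                : "(src {urn: $urn})<-" + rel + "-(dst)";

        // min(length(p)) collapses multiple paths to the same node into one row.
        return "MATCH p = " + pattern
                + " WHERE dst.urn <> $urn"
                + " RETURN dst.urn AS urn, min(length(p)) AS degree"
                + " ORDER BY degree, urn";
    }
}
```

For example, `MultiHopLineageQuerySketch.buildQuery(List.of("DownstreamOf"), MultiHopLineageQuerySketch.Direction.OUTGOING, 3)` produces a query containing the pattern `-[:DownstreamOf*1..3]->`. With a query of this shape in place, supportsMultiHop() in Neo4jGraphService could return true; unbounded traversals are deliberately not supported here, since an explicit upper bound keeps the query cost predictable on dense lineage graphs.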
Screenshots
Example Lineage tab: https://demo.datahubproject.io/dataset/urn:li:dataset:(urn:li:dataPlatform:looker,long_tail_companions.view.instruction_set,PROD)/Lineage?filter_degree=1&page=1