[Bug] Lineage is empty when logical plan has no child
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
Describe the bug
Lineage is returned empty because, for a logical plan with no children, we return an empty ListMap[Attribute, AttributeSet]() instead of parentColumnsLineage:
case p if p.children.isEmpty => ListMap[Attribute, AttributeSet]()
whereas it should be:
case p if p.children.isEmpty => parentColumnsLineage
Affects Version(s)
https://github.com/apache/kyuubi/releases/tag/v1.9.2
Kyuubi Server Log Output
There are no errors in the server logs. You can reproduce the problem by running a test case such as:
test("columns lineage extract - AppendData/OverwriteByExpression") {
  val ddls =
    """
      |create table v2_catalog.db.tb0(col1 int, col2 string) partitioned by(col2)
      |""".stripMargin
  ddls.split("\n").filter(_.nonEmpty).foreach(spark.sql(_).collect())
  withTable("v2_catalog.db.tb0") { _ =>
    val ret0 =
      extractLineage(
        s"insert into table v2_catalog.db.tb0 " +
          s"select key as col1, value as col2 from test_db0.test_table0")
    assert(ret0 == Lineage(
      List(s"$DEFAULT_CATALOG.test_db0.test_table0"),
      List("v2_catalog.db.tb0"),
      List(
        ("v2_catalog.db.tb0.col1", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.key")),
        ("v2_catalog.db.tb0.col2", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.value")))))
    val ret1 =
      extractLineage(
        s"insert overwrite table v2_catalog.db.tb0 partition(col2) " +
          s"select key as col1, value as col2 from test_db0.test_table0")
    assert(ret1 == Lineage(
      List(s"$DEFAULT_CATALOG.test_db0.test_table0"),
      List("v2_catalog.db.tb0"),
      List(
        ("v2_catalog.db.tb0.col1", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.key")),
        ("v2_catalog.db.tb0.col2", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.value")))))
    val ret2 =
      extractLineage(
        s"insert overwrite table v2_catalog.db.tb0 partition(col2 = 'bb') " +
          s"select key as col1 from test_db0.test_table0")
    assert(ret2 == Lineage(
      List(s"$DEFAULT_CATALOG.test_db0.test_table0"),
      List("v2_catalog.db.tb0"),
      List(
        ("v2_catalog.db.tb0.col1", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.key")),
        ("v2_catalog.db.tb0.col2", Set()))))
  }
}
Kyuubi Engine Log Output
An empty lineage is returned:
inputTables(List())
outputTables(List())
columnLineage(List())
### Whereas it should return lineage like
Lineage(
  List(s"$DEFAULT_CATALOG.test_db0.test_table0"),
  List("v2_catalog.db.tb0"),
  List(
    ("v2_catalog.db.tb0.col1", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.key")),
    ("v2_catalog.db.tb0.col2", Set(s"$DEFAULT_CATALOG.test_db0.test_table0.value"))))
This test fails:
### Test name: test("columns lineage extract - AppendData/OverwriteByExpression")
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
No response
Additional context
The fix is to update the logic for the case
case p if p.children.isEmpty
so that it returns parentColumnsLineage instead of an empty ListMap[Attribute, AttributeSet]().
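To illustrate why the leaf case matters, here is a minimal, self-contained sketch. It is not Kyuubi's actual code: `Plan`, `Project`, `UnknownLeaf`, column names as plain strings, and the `keepParentAtLeaf` switch are all hypothetical stand-ins for Spark's `LogicalPlan` and `Attribute` types. It only shows that when the traversal reaches a childless node, returning an empty map discards the lineage accumulated so far, while returning `parentColumnsLineage` preserves it.

```scala
import scala.collection.immutable.ListMap

object LeafLineageSketch {
  // Hypothetical, simplified stand-ins for Spark's LogicalPlan and Attribute types.
  trait Plan { def children: Seq[Plan] }

  // A projection: each output column name maps to the child columns it reads.
  final case class Project(outputs: ListMap[String, Set[String]], child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
  }

  // A leaf node for which the extractor has no dedicated case.
  final case class UnknownLeaf(name: String) extends Plan {
    def children: Seq[Plan] = Nil
  }

  type ColumnsLineage = ListMap[String, Set[String]]

  def extract(
      plan: Plan,
      parentColumnsLineage: ColumnsLineage,
      keepParentAtLeaf: Boolean): ColumnsLineage =
    plan match {
      case Project(outputs, child) =>
        // Rewrite the sources gathered so far through this projection.
        val rewritten: ColumnsLineage =
          if (parentColumnsLineage.isEmpty) outputs
          else ListMap(parentColumnsLineage.toSeq.map { case (col, sources) =>
            col -> sources.flatMap(s => outputs.getOrElse(s, Set(s)))
          }: _*)
        extract(child, rewritten, keepParentAtLeaf)

      case p if p.children.isEmpty =>
        // false mimics the reported behaviour (everything is dropped);
        // true mimics the proposed fix (accumulated lineage is kept).
        if (keepParentAtLeaf) parentColumnsLineage
        else ListMap.empty[String, Set[String]]

      case p =>
        // Any other node: merge whatever each child reports.
        p.children.foldLeft(ListMap.empty[String, Set[String]]) { (acc, c) =>
          acc ++ extract(c, parentColumnsLineage, keepParentAtLeaf)
        }
    }
}
```

With a plan such as `Project(ListMap("col1" -> Set("key"), "col2" -> Set("value")), UnknownLeaf("test_table0"))`, calling `extract` with `keepParentAtLeaf = false` yields an empty map, and with `keepParentAtLeaf = true` it yields the two column mappings. Whether returning the parent lineage is appropriate for every childless node in the real extractor is what the proposed PR would need to confirm against the existing test suite.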
Are you willing to submit PR?
- [X] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- [ ] No. I cannot submit a PR at this time.
Hello @vinayakmalik95, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.
This has been present in all releases since lineage was introduced in Kyuubi.
@iodone @zwangsheng @bowenliang123 @cfmcgrady Please let me know so I can raise the PR.
Can you run the unit test with your specific error case?
Yes, it fails with the current unit test.
@iodone
@Vinayakmalik95 You can try to fix it if the problem has been located.
I have already provided the solution in the comments above. Let me open the PR for it.