
CometSparkToColumnar should have different names for row vs. columnar input

Open andygrove opened this issue 1 year ago • 3 comments

What is the problem the feature request solves?

I would like to change the nodeName for CometSparkToColumnar as follows:

    override def nodeName: String = if (child.supportsColumnar) {
      "CometSparkColumnarToColumnar"
    } else {
      "CometSparkRowToColumnar"
    }

This makes it easier to comprehend which version is being used in a plan:

+- == Final Plan ==
   *(1) ColumnarToRow
   +- !CometHashAggregate [key#6L, count#88L], Final, [key#6L], [count(1)]
      +- !CometHashAggregate [key#6L], Partial, [key#6L], [partial_count(1)]
         +- CometSparkColumnarToColumnar
            +- Scan In-memory table abc [key#6L]
                  +- InMemoryRelation [key#6L, value#7L, (key + 1)#10L], StorageLevel(disk, memory, deserialized, 1 replicas)
                        +- *(2) ColumnarToRow
                           +- CometProject [key#6L, value#7L, (key + 1)#10L], [id#0L AS key#6L, (id#0L % 8) AS value#7L, (id#0L + 1) AS (key + 1)#10L]
                              +- CometSparkRowToColumnar
                                 +- *(1) Range (0, 1000, step=1, splits=5)
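
To illustrate how the proposed naming behaves, here is a minimal, self-contained sketch. It does not use Spark or Comet: `PlanNode`, `RowLeaf`, `ColumnarLeaf`, and `SparkToColumnar` are hypothetical stand-ins invented for this example, and only the two node-name strings and the `child.supportsColumnar` check come from the proposal above.

```scala
// Toy stand-ins for plan nodes; these are NOT the real Spark or Comet classes.
trait PlanNode {
  def children: Seq[PlanNode]
  def nodeName: String
  // Whether this node produces columnar batches (false means rows).
  def supportsColumnar: Boolean = false

  // Render the tree in a simplified version of the "+- " plan layout.
  def treeString(depth: Int = 0): String = {
    val prefix = if (depth == 0) "" else ("   " * (depth - 1)) + "+- "
    prefix + nodeName + "\n" + children.map(_.treeString(depth + 1)).mkString
  }
}

// Hypothetical row-based leaf, e.g. something like a Range scan.
case class RowLeaf(nodeName: String) extends PlanNode {
  def children: Seq[PlanNode] = Nil
}

// Hypothetical columnar leaf, e.g. something like an in-memory table scan.
case class ColumnarLeaf(nodeName: String) extends PlanNode {
  def children: Seq[PlanNode] = Nil
  override def supportsColumnar: Boolean = true
}

// Stand-in for the conversion node: the name depends on the child's output format.
case class SparkToColumnar(child: PlanNode) extends PlanNode {
  def children: Seq[PlanNode] = Seq(child)
  override def supportsColumnar: Boolean = true
  def nodeName: String =
    if (child.supportsColumnar) "CometSparkColumnarToColumnar"
    else "CometSparkRowToColumnar"
}

val overRows = SparkToColumnar(RowLeaf("Range"))
val overColumnar = SparkToColumnar(ColumnarLeaf("Scan In-memory table"))
print(overRows.treeString())
print(overColumnar.treeString())
```

The same node class reports a different name depending only on what feeds it, which is what makes the two cases distinguishable in the plan shown above.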

Describe the potential solution

No response

Additional context

No response

andygrove, Sep 11 '24 14:09

Maybe just override the simpleString and verboseString methods?

parthchandra, Sep 16 '24 16:09
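
One possible reading of the suggestion above, sketched with toy classes rather than the real Spark API: keep `nodeName` fixed and change only the strings used when the plan is printed. `ToyPlan`, `ToyLeaf`, and `ToySparkToColumnar` are hypothetical names for this example. The `simpleString(maxFields: Int)` and `verboseString(maxFields: Int)` signatures are modeled on Spark 3.x's `TreeNode` as I understand it, so check them against the Spark version in use.

```scala
// Toy model only: ToyPlan imitates the shape of Spark's string methods,
// but it is not the real TreeNode or SparkPlan class.
trait ToyPlan {
  def nodeName: String
  def supportsColumnar: Boolean
  // By default both display strings fall back to the node name.
  def simpleString(maxFields: Int): String = nodeName
  def verboseString(maxFields: Int): String = simpleString(maxFields)
}

// Hypothetical leaf whose output format is set by the caller.
case class ToyLeaf(nodeName: String, supportsColumnar: Boolean) extends ToyPlan

case class ToySparkToColumnar(child: ToyPlan) extends ToyPlan {
  // The node name stays stable, so anything keyed on it is unaffected.
  val nodeName: String = "CometSparkToColumnar"
  val supportsColumnar: Boolean = true

  // Only the display strings reveal which input format is being converted.
  private def variant: String =
    if (child.supportsColumnar) "CometSparkColumnarToColumnar"
    else "CometSparkRowToColumnar"

  override def simpleString(maxFields: Int): String = variant
  override def verboseString(maxFields: Int): String = variant
}

val rowInput = ToySparkToColumnar(ToyLeaf("Range", supportsColumnar = false))
val columnarInput = ToySparkToColumnar(ToyLeaf("InMemoryTableScan", supportsColumnar = true))
println(rowInput.simpleString(25))
println(columnarInput.simpleString(25))
```

The trade-off in this sketch is that code matching on `nodeName` keeps seeing a single name, while printed plans still show the row or columnar variant.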

Hey, I'd love to pick this up if possible. Thanks.

JensonChoi, Sep 21 '24 06:09

@JensonChoi Thanks. I assigned this to you. Please feel free to open a PR.

viirya, Sep 21 '24 18:09