Wenchen Fan

245 comments by Wenchen Fan

> For incorrect metric cases, the minimum value is always zero while the median value is always less than the actual median value. As I explained earlier, this should not...

I confirmed that the bug exists. I was wrong about executor side accumulator updates filtering. We only filter out zero values for task metrics, but not SQL metrics. But I...

> `binary_search(array(1.0, 2.0, 3.0), 1.1)` -> return -2 Is this expected behavior? In general it's more efficient to use expressions than UDF, but we should also care about the coherence...
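For context on the `-2` in the quoted example: it matches the convention of Java's `java.util.Arrays.binarySearch`, which returns `-(insertion point) - 1` when the key is absent. Whether the expression under review delegates to that method is an assumption here, not something the thread confirms. The class and helper names below (`BinarySearchDemo`, `search`, `insertionPoint`) are hypothetical and only illustrate the convention.

```java
import java.util.Arrays;

public class BinarySearchDemo {
    // Hypothetical wrapper around the JDK call; the input must already be sorted.
    static int search(double[] sorted, double key) {
        return Arrays.binarySearch(sorted, key);
    }

    // Recovers the insertion point from a negative result;
    // a non-negative result is already the index of the match.
    static int insertionPoint(int result) {
        return result >= 0 ? result : -(result + 1);
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0};

        int missing = search(values, 1.1);
        System.out.println(missing);                 // -2: key absent, would be inserted at index 1
        System.out.println(insertionPoint(missing)); // 1
        System.out.println(search(values, 2.0));     // 1: key found at index 1
    }
}
```

Under this convention a negative result is not an error code: it encodes where the missing key would be inserted to keep the array sorted.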

Is it possible to implement this expression with `StaticInvoke`? We can add a bunch of overloads that are specialized for variant primitive types.

If we want more compile-time safety, we can also specify the where condition in `execute(...)`, as there should be at most one where condition for an UPDATE command. I don't...

Can we update the PR title? It's not related to `DataFrameWriterV2` anymore.

The idea LGTM. I think it's cleaner if the implementation is an optimizer rule that rewrites `InsertIntoHadoopFsRelation` command. We can look at the constraints of the input query, if we...

Sorry I missed this. https://github.com/apache/spark/pull/48573 did the same thing, and I think it's simpler to have separate APIs for Scan and Write, instead of adding a common interface for it.