Etienne Chauchot comments

Results: 15 comments by Etienne Chauchot

Improved pipeline translation in SparkStructuredStreamingRunner

> I took a glance at this change and it LGTM. Taking into account that this PR really improves the performance of some transforms while running it on Spark...

Improved pipeline translation in SparkStructuredStreamingRunner

> I agree, that leaves room for potential new confusion. Giving this a 2nd thought I suppose you're right and `SparkDatasetRunner` is the better name with less ambiguity ... nevertheless...

Improved pipeline translation in SparkStructuredStreamingRunner

@mosche reviewing ... cc: @aromanenko-dev

Improved pipeline translation in SparkStructuredStreamingRunner

@mosche: did you rebase this PR on top of the previously merged code about the Encoders? I have the impression it contains the same changes?

Improved pipeline translation in SparkStructuredStreamingRunner

> oh, I remember ... you mean this one #22157? Yes, that's rebased ... but obviously this one here contains lots of changes to encoders to use encoders that are...

Improved pipeline translation in SparkStructuredStreamingRunner

> ![results](https://user-images.githubusercontent.com/1401430/184098877-4972debd-4eba-4ade-a613-ace1d464a4fe.png) @aromanenko-dev I think you should also run the TPCDS suite on this PR (ask @aromanenko-dev) because when we compared the two Spark runners in the past we've...