Bobby Wang
> Or, we can use what `RDD.coalesce` does currently.

Good suggestion. Done. Thx. @mridulm
> > @wbo4958
> >
> > Issue: The xgboost code uses RDD barrier mode, but barrier mode does not work with the `coalesce` operator. @mridulm just suggested using another random way borrowing...
@cloud-fan Could you help review it?
@cloud-fan @mridulm Could you help review it again? Thx
> @wbo4958 Can you add comments as I asked in https://github.com/apache/spark/pull/37855/files#r975993118?

I added some comments from https://issues.apache.org/jira/browse/SPARK-21782.
> Good catch! It seems we can also simply switch to `XORShiftRandom`, which always [hashes the seeds](https://github.com/apache/spark/blob/e1ea806b3075d279b5f08a29fe4c1ad6d3c4191a/core/src/main/scala/org/apache/spark/util/random/XORShiftRandom.scala#L58-L67)
>
> ```
> scala> (1 to 200).map(partitionId => new Random(partitionId).nextInt(4))
> val res3:...
> ```
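For anyone following along, here is a minimal standalone sketch of the skew being discussed. It assumes only the JDK and the Scala standard library. `hashSeed` is a hypothetical stand-in for the seed hashing linked above, not Spark's actual `XORShiftRandom` code, and `SeedSkewDemo`, `rawStarts` and `hashedStarts` are names made up for illustration.

```scala
import java.nio.ByteBuffer
import java.util.Random
import scala.util.hashing.MurmurHash3

object SeedSkewDemo {
  // Hypothetical helper: mixes a small seed into 64 well-spread bits.
  // Loosely modeled on the idea of hashing seeds; not Spark's implementation.
  def hashSeed(seed: Long): Long = {
    val bytes = ByteBuffer.allocate(java.lang.Long.BYTES).putLong(seed).array()
    val low = MurmurHash3.bytesHash(bytes)
    val high = MurmurHash3.bytesHash(bytes, low)
    (high.toLong << 32) | (low.toLong & 0xFFFFFFFFL)
  }

  val numPartitions = 4 // power-of-two bound, as in the snippet quoted above

  // First draw when the partition id is used directly as the seed.
  val rawStarts: Seq[Int] =
    (1 to 200).map(id => new Random(id).nextInt(numPartitions))

  // First draw when the partition id is hashed before seeding.
  val hashedStarts: Seq[Int] =
    (1 to 200).map(id => new Random(hashSeed(id)).nextInt(numPartitions))

  // Counts how many partition ids map to each starting position.
  def histogram(xs: Seq[Int]): Map[Int, Int] =
    xs.groupBy(identity).map { case (k, v) => k -> v.size }

  def main(args: Array[String]): Unit = {
    println(s"raw seeds:    ${histogram(rawStarts)}")
    println(s"hashed seeds: ${histogram(hashedStarts)}")
  }
}
```

With raw seeds, small consecutive ids differ only in the low bits of the generator state, so the first `nextInt` with a power-of-two bound collapses onto a single value. Hashing the seed first spreads the draws across the positions.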
No, we can file a follow-up PR for it. This PR looks good to me.