Zhi Lin

84 comments by Zhi Lin

better to use spark shim in 3.3

ok, but we'll need a shim layer to add support for 3.3. I've told @KepingYan to look into it. I'll drop Python 3.6 [here](https://github.com/oap-project/raydp/pull/214), and it seems like raydp...

This example just serves as a demonstration of how to train pytorch models on data loaded/processed by raydp. The data is randomly generated, so it is not expected that the...

Hi, glad you tried raydp. Ray's object store is shared among nodes. By calling our `create_ml_dataset_from_spark`, you create an `MLDataset`, which is partitioned. That means your data is probably distributed...

Is it a GNN application? Like each node needs a full graph, but node/edge features can be partitioned? Anyway I guess you can save the graph to parquet first, and...

Hi @YeahNew, I can't find your reply, but I saw it in my mailbox. Have you solved the problem? I think you don't need to use MLDataset for...

I'm not sure about this, but it seems to have something to do with OpenMP. Can you run some xgboost_ray examples to verify if it's raydp's problem?

Hi @LAITRUNGMINHDUC, RayDP does not have its own SQL implementation for now, so I think the performance would be very similar to vanilla Spark. If you want to have better...

This error indicates that the RayDP jar is not included in the pyspark driver's classpath for some reason. Can you check driver_cp in ray_cluster.py and see if it is a...

Are you using Java 9? In our tests, we use Java 8 and Spark 3.2.1, so you can try this configuration. Can you use pyspark to start a session without raydp?
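
If it helps to confirm which Java version is actually being picked up, here is a minimal sketch that reads the major version from the output of `java -version`. The helper names `parse_java_major` and `installed_java_major` are made up for illustration and are not part of raydp or pyspark; it assumes `java` is on the `PATH` and prints the usual `version "..."` line.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from `java -version` output.

    Hypothetical helper for illustration, not a raydp or pyspark API.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report "1.x" (e.g. "1.8.0_292");
    # Java 9 and later report the major version first (e.g. "9.0.4").
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major() -> int:
    """Run `java -version` and return the major version it reports."""
    # `java -version` writes its output to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True
    )
    return parse_java_major(result.stderr)
```

Calling `installed_java_major()` in the same environment that launches the driver shows whether it resolves to 8 or to something newer.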