Jun Shi
If you are using YARN as the resource manager, here are the instructions: https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_yarn
@apli It sounds like an incompatibility between Spark, Java, and the Python wrapper. We don't have the bandwidth to test all the combinations. It may be easier to use the supported versions specified...
@apli you can bypass the Python test. @mriduljain Can you take a look at the Python test cases?
I am not aware of this in the source code, as you also observed. Which scheduler (resource manager) are you using? YARN?
clusterSize actually gets set in the source code. See my answer in https://github.com/yahoo/CaffeOnSpark/issues/125
You can only use memory_data_param or [cos_data_param](https://github.com/yahoo/CaffeOnSpark/blob/master/data/lenet_cos_train_test.prototxt#L11-L32) to specify the input layers. The data layers have a simple resize function, but I don't think they handle encoded images (say, JPEG) with...
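For reference, a minimal sketch of an input layer using memory_data_param, following standard Caffe conventions (the MNIST-shaped dimensions here are only an illustration; for cos_data_param, the linked lenet_cos_train_test.prototxt is the authoritative example):

```
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 64   # number of images per batch
    channels: 1      # grayscale input (illustrative)
    height: 28
    width: 28
  }
}
```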
MNIST is too small. Do the same comparison on a larger network and a larger dataset, for example, Inception-v3 (or VGG16) on the ImageNet dataset.
This looks like a driver log, which contains less information. Can you share your executor logs?
From the error message, it is likely that some protobuf Java API changed from 2.4.1 to 2.5.0. I have not tried 2.4.1 myself, but if BVLC Caffe works with protobuf...
Did you change the pom file before compiling? mvn could not find spark-core_2.10 in the Maven repo.
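For context, the dependency declaration in the pom would look like the sketch below; the version shown is only an illustration, and it must match a spark-core_2.10 release actually published to the Maven repository you are resolving against:

```
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <!-- illustrative version; use the one matching your Spark installation -->
  <version>1.6.0</version>
</dependency>
```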