Jun Shi
If you are using YARN as the resource manager, here are the instructions: https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_yarn
@apli It sounds like an incompatibility between Spark, Java, and the Python wrapper. We don't have the bandwidth to test all the combinations. It may be easier to use the supported versions specified...
@apli you can bypass the Python test. @mriduljain Can you take a look at the Python test cases?
I am not aware of this in the source code, as you also observed. Which scheduler (resource manager) are you using? YARN?
clusterSize actually gets set in the source code. See my answer in https://github.com/yahoo/CaffeOnSpark/issues/125
You can only use memory_data_param or [cos_data_param](https://github.com/yahoo/CaffeOnSpark/blob/master/data/lenet_cos_train_test.prototxt#L11-L32) to specify the input layers. The data layers have a simple resize function, but I don't think they handle encoded images (say, JPEG) with...
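For reference, a minimal sketch of an input layer using memory_data_param, following standard Caffe conventions (the MNIST-shaped dimensions here are only an illustration; for cos_data_param, the linked lenet_cos_train_test.prototxt is the authoritative example):

```
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 64   # number of images per batch
    channels: 1      # grayscale input (illustrative)
    height: 28
    width: 28
  }
}
```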
MNIST is too small. Do the same comparison on a larger network and a larger dataset, for example, Inception-v3 (or VGG16) on the ImageNet dataset.
This looks like a driver log, which contains less information. Can you share your executor logs?
From the error message, it is likely that some protobuf Java API changed from 2.4.1 to 2.5.0. I have not tried 2.4.1 myself, but if BVLC Caffe works with protobuf...
Did you change the pom file before compiling? mvn could not find spark-core_2.10 in the Maven repo.
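For context, the dependency declaration in the pom would look like the sketch below; the version shown is only an illustration, and it must match a spark-core_2.10 release actually published to the Maven repository you are resolving against:

```
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <!-- illustrative version; use the one matching your Spark installation -->
  <version>1.6.0</version>
</dependency>
```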