
Do all the nodes of hadoop yarn cluster need to install CaffeOnSpark ?

Open guyang88 opened this issue 7 years ago • 5 comments

@mriduljain @davglass @gyehuda @javadba @pcnudde I want to run CaffeOnSpark on a Spark-on-YARN cluster. Do I need to install CaffeOnSpark on every node, or just on one of them?

guyang88 avatar Mar 31 '17 03:03 guyang88

You can either install it on every node, or install it only on the node where you launch your job and use spark-submit to ship the whole package to the executors. In either case, the library paths need to be set up properly.
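A minimal sketch of the second approach, assuming CaffeOnSpark was built under ${CAFFE_ON_SPARK} on the launch node and the native libraries were packed into cos.tgz (the paths, jar name, and solver file here are illustrative, following the GetStarted_yarn layout; adjust them to your build):

```shell
# Ship cos.tgz to every executor; YARN unpacks it into each container's
# working directory under the name "cos.tgz", so relative paths work.
spark-submit --master yarn --deploy-mode cluster \
    --archives ${HOME}/tmp/cos.tgz \
    --conf spark.driver.extraLibraryPath="./cos.tgz/lib64" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="./cos.tgz/lib64" \
    --class com.yahoo.ml.caffe.CaffeOnSpark \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
    -train -conf lenet_memory_solver.prototxt
```

With --deploy-mode cluster the driver also runs in a YARN container, which is why the relative "./cos.tgz/lib64" path works for the driver as well.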

junshi15 avatar Apr 01 '17 23:04 junshi15

@junshi15 I installed CaffeOnSpark on one node and set the paths per GetStarted_yarn. When I used spark-submit to launch CaffeOnSpark, I got the error "no lmdbjni in java.library.path". But when I installed CaffeOnSpark on all nodes, it succeeded. So what is the problem? (I set LD_LIBRARY_PATH before spark-submit.)

guyang88 avatar Apr 05 '17 02:04 guyang88

You should create a tgz file, say cos.tgz, containing lib64/liblmdbjni.so etc., specify that tgz file via --archives, and extend the executors' LD_LIBRARY_PATH to include ":./cos.tgz/lib64".

tar -cpzf ${HOME}/tmp/cos.tgz lib64

spark-submit .... --archives ${HOME}/tmp/cos.tgz \
--conf spark.driver.extraLibraryPath="WHATEVER_YOU_HAVE:./cos.tgz/lib64" \
--conf spark.executorEnv.LD_LIBRARY_PATH="WHATEVER_YOU_HAVE:./cos.tgz/lib64"

Andy


anfeng avatar Apr 05 '17 03:04 anfeng

@anfeng Thanks for your answer. I created a tgz file including CaffeOnSpark and all the required lib*.so files, then I used spark-submit --archives <path to my tgz file>. But I got the same error: "no lmdbjni in java.library.path". How can I make the tgz file shared by the other nodes? Do I need to create a fake tgz file?

guyang88 avatar Apr 06 '17 09:04 guyang88

The .tgz file will be shipped to all executors. Make sure

  1. your .tgz file contains liblmdbjni.so, and
  2. the library path is set as shown by @anfeng.
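Point 1 can be checked before submitting. A quick sketch (the demo/ directory and the stand-in .so file are hypothetical; in a real setup, point tar at your actual CaffeOnSpark build output):

```shell
# Sketch: build a demo archive and verify the native library is inside it.
mkdir -p demo/lib64
touch demo/lib64/liblmdbjni.so                 # stand-in for the real .so
tar -cpzf demo/cos.tgz -C demo lib64

# The JVM's System.loadLibrary("lmdbjni") needs liblmdbjni.so on the
# library path, so the archive must actually contain it:
tar -tzf demo/cos.tgz | grep -q 'lib64/liblmdbjni\.so' \
  && echo "liblmdbjni.so present" \
  || echo "liblmdbjni.so MISSING"
```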

junshi15 avatar Apr 10 '17 16:04 junshi15