CaffeOnSpark
How can I convert my own data into a DataFrame?
I have built CaffeOnSpark successfully. My own data is CIFAR-10 (256*256) stored as a 9.8 GB LMDB. How can I convert it into a DataFrame? Thank you!
Hello @kceil,
The conversion is described here: https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_EC2
pushd ${CAFFE_ON_SPARK}/data
hadoop fs -rm -r -f ${CAFFE_ON_SPARK}/data/mnist_train_dataframe
spark-submit --master ${MASTER_URL} \
--conf spark.cores.max=${TOTAL_CORES} \
--conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
--conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
--class com.yahoo.ml.caffe.tools.LMDB2DataFrame \
${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
-imageRoot file:${CAFFE_ON_SPARK}/data/mnist_train_lmdb \
-lmdb_partitions ${TOTAL_CORES} \
-outputFormat parquet \
-output ${CAFFE_ON_SPARK}/data/mnist_train_dataframe
hadoop fs -rm -r -f ${CAFFE_ON_SPARK}/data/mnist_test_dataframe
spark-submit --master ${MASTER_URL} \
--conf spark.cores.max=${TOTAL_CORES} \
--conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
--conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
--class com.yahoo.ml.caffe.tools.LMDB2DataFrame \
${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
-imageRoot file:${CAFFE_ON_SPARK}/data/mnist_test_lmdb \
-lmdb_partitions ${TOTAL_CORES} \
-outputFormat parquet \
-output ${CAFFE_ON_SPARK}/data/mnist_test_dataframe
You can adapt these commands to CIFAR-10 by replacing the MNIST LMDB and DataFrame paths with your own.
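As a rough sketch of that substitution, the two MNIST commands could be wrapped in one shell function that takes the LMDB directory name and the output name. The function name `lmdb_to_dataframe` and the directory names `cifar10_train_lmdb` / `cifar10_train_dataframe` are hypothetical. The sketch assumes that your LMDB directories sit under `${CAFFE_ON_SPARK}/data` and that `CAFFE_ON_SPARK`, `MASTER_URL`, `TOTAL_CORES` and `LD_LIBRARY_PATH` are set as in the wiki.

```shell
# Sketch only: wraps the LMDB2DataFrame invocation from the wiki in a function.
# Assumes CAFFE_ON_SPARK, MASTER_URL, TOTAL_CORES and LD_LIBRARY_PATH are
# already exported, and that the LMDB lives under ${CAFFE_ON_SPARK}/data.
lmdb_to_dataframe() {
  lmdb_name="$1"   # hypothetical, e.g. cifar10_train_lmdb
  df_name="$2"     # hypothetical, e.g. cifar10_train_dataframe

  # Remove any previous output, as the wiki commands do.
  hadoop fs -rm -r -f "${CAFFE_ON_SPARK}/data/${df_name}"

  # Same flags as the MNIST example; only the two paths change.
  spark-submit --master "${MASTER_URL}" \
    --conf spark.cores.max="${TOTAL_CORES}" \
    --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
    --class com.yahoo.ml.caffe.tools.LMDB2DataFrame \
    "${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar" \
    -imageRoot "file:${CAFFE_ON_SPARK}/data/${lmdb_name}" \
    -lmdb_partitions "${TOTAL_CORES}" \
    -outputFormat parquet \
    -output "${CAFFE_ON_SPARK}/data/${df_name}"
}
```

With the function defined, `lmdb_to_dataframe cifar10_train_lmdb cifar10_train_dataframe` followed by `lmdb_to_dataframe cifar10_test_lmdb cifar10_test_dataframe` would mirror the train and test steps above, provided those are the names of your LMDB directories.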
Thanks, Arun