Missing lopq library in sample `spark-submit` call
The lopq library is not currently provided to the example `spark-submit` call in the documentation. I found the easiest solution was to run `python setup.py bdist_egg` in the `python` subdirectory and then pass the generated egg to `spark-submit` via the `--py-files` parameter. It would probably be helpful if this were added to the documentation.
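For anyone who hits the same problem, here is a rough sketch of that workaround. It assumes you are in the repository root, that `train_model.py` is reachable from the current directory, and that any cluster-specific Spark configuration is added as needed. The `LOPQ_EGG` variable is just an illustrative name, and the script arguments are the placeholder values used in the documentation's example.

```shell
# Build an egg for the lopq package from the repository's python subdirectory.
(cd python && python setup.py bdist_egg)

# bdist_egg writes the egg into dist/; its exact filename depends on the
# package and Python versions, so glob for it rather than hard-coding a name.
LOPQ_EGG="$(ls python/dist/*.egg | head -n 1)"

# Pass the egg via --py-files so the lopq package is importable by the job.
# Paths and parameter values below are placeholders, not real locations.
spark-submit \
    --py-files "$LOPQ_EGG" \
    train_model.py \
    --data /hdfs/path/to/data \
    --V 16 \
    --M 8 \
    --model_pkl /hdfs/output/path/model.pkl \
    --model_proto /hdfs/output/path/model.lopq
```

This is only one of several ways to make the package available at runtime; an egg is just what worked most easily for me.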
There is a disclaimer of sorts about that at the top of the README. One concern is that there are a variety of ways that the package could be provided to the runtime and none of them impact the usage of the LOPQ scripts, so illustrating a single one in all of the example commands seems distracting.
Perhaps something like this is better though?
spark-submit \
... # spark configuration
train_model.py \
--data /hdfs/path/to/data \
--V 16 \
--M 8 \
--model_pkl /hdfs/output/path/model.pkl \
--model_proto /hdfs/output/path/model.lopq
Whoops. I guess I missed that section in the docs. That extra line might be helpful (perhaps also include "see above").