is HDFS a hard requirement to setup/run exelixi framework?

Open dbsiegel opened this issue 11 years ago • 9 comments

Is HDFS a hard requirement to setup/run exelixi?

Hadoop is not currently part of the playa-mesos box image, so the hadoop fs commands in install.sh fail.

The playa-mesos team is thinking of adding support for a single-node Hadoop instance configured for pseudo-distributed operation. Would that work?

dbsiegel, Jan 16 '14 22:01

Great to see this issue on GH!

Yes, HDFS is required when running on Mesos. Py code gets distributed via HDFS onto the slave nodes. That's pretty standard for how we use Spark and other frameworks.

However, you could run in standalone mode w/o Mesos -- that's mentioned in the "Getting Started" section of the wiki.

Huh... playa-mesos may need to rethink, since HDFS is needed by many popular Mesos use cases. Also, pseudo-distributed mode Hadoop is generally a bad idea. I should have a discussion with Jeremy about that...

Meanwhile, awesome gravatar there :)

ceteri, Jan 17 '14 19:01

Thanks :) I will run in standalone mode for now w/o Mesos, on Mac OS X. Perhaps I've missed something: I see an import error when launching the framework, ImportError: cannot import name shutdown (raised by the gevent import in service.py).

dbsiegel, Jan 17 '14 22:01

Gevent should have shutdown as a standard part of the package.

Try running just the Python prompt from the command line, then type:

from gevent import shutdown

Does it give the same error? If you've got a GitHub gist of the full error trace, that'd probably help too.
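A slightly fuller version of that check, as a minimal sketch: it reports whether a name is available as an attribute of a module and which version of the module is installed, so the result can be compared against the versions listed elsewhere in this thread. The helper name `check_import` is made up for illustration and is not part of exelixi or gevent.

```python
import importlib


def check_import(module_name, attr_name):
    """Return (ok, version) for a module attribute lookup.

    ok is True when the module imports and exposes attr_name.
    version is the module's __version__, "unknown" if it has none,
    or None when the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False, None  # the package itself is not installed
    version = getattr(module, "__version__", "unknown")
    return hasattr(module, attr_name), version


if __name__ == "__main__":
    # Mirrors `from gevent import shutdown` from the comment above.
    ok, version = check_import("gevent", "shutdown")
    print("gevent version: %s" % version)
    print("shutdown available: %s" % ok)
```

If the version printed differs from the one exelixi was developed against, that is a likely place to start looking.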

Thanks,

ceteri, Jan 17 '14 22:01

Same error. I would embed this gist but don't know how at the moment. https://gist.github.com/d3borah/d83eefec307076371e8d

dbsiegel, Jan 17 '14 23:01

Dang. This may require customer support on-site.

One thing that may help is to try running under Anaconda, instead of the default Py 2.7.x that comes installed on Mac OS X:

https://store.continuum.io/cshop/anaconda/

It should be quick to install, and is easily reversed.

ceteri, Jan 17 '14 23:01

Also, just checked my local set up:

pacos-mbp-3:c3nom ceteri$ pip freeze | grep gevent
gevent==0.13.8
gevent-websocket==0.3.6
gevent-zeromq==0.2.2

So that's a very different version of gevent. Will check next time when running in AWS, but it was the same previously.

ceteri, Jan 17 '14 23:01

Thanks, will look into Anaconda for this. I am just getting into Python, so I'm not invested in the default.

Your version of gevent is close to what's available on the playa-mesos box. However, the playa-mesos box apparently does not have hat_trie at the moment.

vagrant@mesos:~/exelixi$ pip freeze
Cython==0.19.2
Pillow==2.0.0
apt-xapian-index==0.45
argparse==1.2.1
chardet==2.0.1
distribute==0.6.34
gevent==0.13.7
greenlet==0.4.0
mesos-0.14.0==rc4-amd64
mesos-0.15.0==rc4-amd64
numpy==1.7.1
pandas==0.13.0
protobuf==2.4.1
psutil==0.6.1
python-apt==0.8.8ubuntu6
python-dateutil==2.2
python-debian==0.1.21-nmu2ubuntu1
pytz==2013.9
requests==1.1.0
scikit-learn==0.14.1
scipy==0.11.0
six==1.2.0
ssh-import-id==3.14
urllib3==1.5
wsgiref==0.1.2
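For spotting gaps like this in a long listing, a rough sketch that parses `pip freeze` output and reports which packages from a given list are absent. The function names, the `REQUIRED` list, and the exact spelling `hat-trie` are assumptions for illustration (the installed name may be reported differently), and `SAMPLE` is only a three-line excerpt of the listing above.

```python
def parse_freeze(text):
    """Map lowercased package name -> version from `pip freeze` output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and entries not in name==version form
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def missing_packages(text, required):
    """Return the names in `required` that do not appear in the freeze output."""
    installed = parse_freeze(text)
    return [name for name in required if name.lower() not in installed]


# Excerpt of the listing above; not the full output.
SAMPLE = """\
gevent==0.13.7
greenlet==0.4.0
numpy==1.7.1
"""

# Hypothetical requirement list, for illustration only.
REQUIRED = ["gevent", "hat-trie"]

print(missing_packages(SAMPLE, REQUIRED))  # prints ['hat-trie']
```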

dbsiegel, Jan 17 '14 23:01

That makes sense; playa-mesos will have introduced some other versions/dependencies, then.

Yes, the hat_trie package needs to be installed from GitHub, since there's no PyPI support for it yet. You can use the command from bin/local_install.sh:

sudo pip install git+https://github.com/kmike/hat-trie.git#egg=hat-trie

That requires Git to be installed first, too.

ceteri, Jan 17 '14 23:01

:+1: :) :)

dbsiegel, Jan 17 '14 23:01