magpie
Support some sort of Monitoring Software
It would be great if we added something like Ambari so that we could check the status of our nodes. In one case, all of my HBase region servers went down and my Spark job just hung, waiting for them to come back up (which they never did, due to an unrelated issue). I had no idea this had happened. It would be nice to be able to see everything that is going on in one place.
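As a rough illustration of the kind of check a monitoring tool would automate, here is a minimal Python sketch that probes a TCP port on each node and reports the ones that do not answer. The function names, the hostnames, and the port are assumptions for illustration only: 16020 is the default region server RPC port in recent HBase releases (older releases used 60020), and a given cluster may be configured differently. An open port only shows the process is listening, not that it is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def unreachable_nodes(nodes, port, timeout=2.0):
    """Return the subset of nodes that do not accept connections on port."""
    return [host for host in nodes if not is_port_open(host, port, timeout)]


# Example usage (hypothetical hostnames; port is an assumption, see above):
#   down = unreachable_nodes(["node1", "node2", "node3"], 16020)
#   if down:
#       print("Region servers not responding:", ", ".join(down))
```

Run periodically from the job's master node, something like this would have flagged the dead region servers, though it is no substitute for a real dashboard.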
I'm not sure what else is out there other than Ambari at this point.
Some others I have found: Ganglia, White Elephant, and Starfish. I will have to look into them all more closely.
I actually worked on adding Ganglia support a long time ago b/c there is Hadoop support for it natively. However, I gave up on it b/c it would require additional software installations on my clusters that most users don't have (minimally httpd, maybe some more I don't recall). It's something to reconsider of course.
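For reference, the native Hadoop support mentioned above is the metrics2 Ganglia sink, which is enabled through `hadoop-metrics2.properties`. The fragment below is a minimal sketch, not what was tried in this project: `gmond-host` is a placeholder hostname, 8649 is Ganglia's usual default port, and the 10-second period is an arbitrary choice. It still assumes a Ganglia collector (and a web frontend, hence httpd) is installed somewhere reachable.

```properties
# Send Hadoop metrics to Ganglia 3.1+ (assumed collector: gmond-host:8649)
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10

namenode.sink.ganglia.servers=gmond-host:8649
datanode.sink.ganglia.servers=gmond-host:8649
```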