edamame
edamame copied to clipboard
Beanstalk + Tokyo Tyrant = Edamame, queryable persistent priority queuing
h1. Beanstalk + Tokyo Tyrant = Edamame, a fast persistent distributed priority job queue
"Edamame":http://bit.ly/edamame combines the "Beanstalk priority queue":http://bit.ly/beanstalkd with a "Tokyo Tyrant database"::http://bit.ly/ttyrant and "God monitoring":http://bit.ly/godmonitor to produce a persistent distributed priority job queue system.
- fast, scalable, lightweight and distributed
- persistent and recoverable
- scalable up to your memory limits
- queryable and enumerable jobs
- named jobs
- reasonably-good availability.
Like beanstalk, it is a job queue, not just a message queue:
- priority job scheduling, not just FIFO
- Supports multiple queues ('tubes')
- reliable scheduling: jobs that time out are re-assigned
It includes a few nifty toys:
- Scripts for "God":http://bit.ly/godmonitor to monitor and restart the daemons
- Command-line management scripts to load. enumerate, empty, and show stats for the db+queue
- The start of a lightweight web frontend in Sinatra.
h2. Documentation
The bulk of the documentation is at "http://mrflip.github.com/edamame":http://mrflip.github.com/edamame Go there instead.
h2. Help!
Send Edamame questions to the "Infinite Monkeywrench mailing list":http://groups.google.com/group/infochimps-code
h2. Requirements and Installation
h2. Install
** "Main Install and Setup Documentation":http://mrflip.github.com/edamame/INSTALL.html **
h3. Get the code
We're still actively developing edamame. The newest version is available via "Git":http://git-scm.com on "github:":http://github.com/mrflip/edamame
pre. $ git clone git://github.com/mrflip/edamame
A gem is available from "gemcutter:":http://gemcutter.org/gems/edamame
pre. $ sudo gem install edamame --source=http://gemcutter.org
(don't use the gems.github.com version -- it's way out of date.)
You can instead download this project in either "zip":http://github.com/mrflip/edamame/zipball/master or "tar":http://github.com/mrflip/edamame/tarball/master formats.
h3. Get the Dependencies
To finish setting up, see the "detailed setup instructions":http://mrflip.github.com/edamame/INSTALL.html and then read the "usage notes":http://mrflip.github.com/edamame/usage.html
- "beanstalkd 1.3,":http://xph.us/dist/beanstalkd/ "libevent 1.4,":http://monkey.org/~provos/libevent/ and "beanstalk-client":http://github.com/dustin/beanstalk-client
- "Tokyo Tyrant,":http://tokyocabinet.sourceforge.net/tyrantdoc/ "Tokyo Tyrant Ruby libs,":http://tokyocabinet.sourceforge.net/tyrantrubydoc/ "Tokyo Cabinet,":http://tokyocabinet.sourceforge.net and "Tokyo Cabinet Ruby libs":http://tokyocabinet.sourceforge.net/tyrantdoc/
- Gems: "wukong":http://mrflip.github.com/wukong and "monkeyshines":http://mrflip.github.com/monkeyshines
See the "Detailed install instructions":http://mrflip.github.com/edamame/INSTALL.html (it also has hints about installing Tokyo*, Beanstalkd and friends.
h2. Endnotes
h3. Caveats
Weaknesses? Mainly that it will make an Erlang'er cry for its lack of concurrency correctness. Its goal is to work pretty well and to recover gracefully, but its design limits .
- We store jobs in two places: the central DB and the distributed queue.
- As always, your jobs must either be idempotent, or harmless if re-run: a job could start and do some or all of its job -- but lose contact with the queue, causing the job to be re-run. This is inherent in beanstalkd (and most comparable solutions), not just edamame.
- Although God will watch the daemons, it won't repopulate the queue or restart a worker that fails.
h3. TODOs
- Restarting is still manual: you have to run @bin/edamame-sync@ to reload the queue from the database
- The sinatra queue viewer doesn't work at the moment.
h3. Links
There's a fuller set of docs at "http://mrflip.github.com/edamame":http://mrflip.github.com/edamame
- Origin of the name "edamame":http://en.wikipedia.org/wiki/Edamame
- This library was written to support the "Monkeyshines":http://bit.ly/shines distributed API scraper.
- Beanstalk: ** "Beanstalk, a fast, distributed, in-memory workqueue service":http://xph.us/software/beanstalkd/ ** "Beanstalkd code":http://github.com/kr/beanstalkd/tree/master ** "FAQ":http://wiki.github.com/kr/beanstalkd/faq ** "Beanstalk Ruby Client":http://github.com/dustin/beanstalk-client-ruby/tree/master ** "Tutorial from nubyonrails":http://nubyonrails.com/articles/about-this-blog-beanstalk-messaging-queue ** "Mailing list":http://www.mail-archive.com/[email protected]/ ** Some "beanstalk utilities":http://github.com/dustin/beanstalk-tools/tree/master -- edamame has its own take on some of these.
- Tokyo Tyrant: ** "Tokyo Tyrant":http://tokyocabinet.sourceforge.net/tyrantdoc/ ** "Tokyo Tyrant Ruby libs":http://tokyocabinet.sourceforge.net/tyrantrubydoc/ ** You'll need the "Tokyo Cabinet":http://tokyocabinet.sourceforge.net libs and the "Tokyo Cabinet Ruby libs":http://tokyocabinet.sourceforge.net/tyrantdoc/
- "God process monitoring framework":http://god.rubyforge.org/ ** http://railscasts.com/episodes/130-monitoring-with-god ** Some code for the god conf is inspired by that railscast, "this pastie,":http://pastie.textmate.org/private/ovgxu2ihoicli2ktrwtbew the "one from the god docs":http://god.rubyforge.org/, and "Configuring GMail notifiers in God":http://millarian.com/programming/ruby-on-rails/monitoring-thin-using-god-with-google-apps-notifications/ ** Alternatives to God include (in order of complexity): "Monit,":http://mmonit.com/monit/ perhaps "with Munin;":http://www.howtoforge.com/server_monitoring_monit_munin "Cacti":http://www.cacti.net/ and "Hyperic":http://www.hyperic.com/
h2. More info h3. Credits Edamame was written by "Philip (flip) Kromer":http://mrflip.com ([email protected] / "@mrflip":http://twitter.com/mrflip) for the "infochimps project":http://infochimps.org h3. Help! Send wuclan questions to the "Infinite Monkeywrench mailing list":http://groups.google.com/group/infochimps-code