cascalog
Data processing on Hadoop without the hassle.
Instaparse (https://github.com/Engelberg/instaparse) is clearly the right library to use for parsing. Right now the parsing is all manual and pretty hacky. I'd love a formal grammar for Cascalog's datalog implementation.
Check out https://groups.google.com/forum/?fromgroups=#!topic/cascalog-user/_SDWj9GtD0Y for more info (especially post no. 3).
I am currently testing Clojure/Cascalog with Hortonworks HDP, but I get the following errors: //lein repl - compile with no error test2.core=> (? ?line)) 14/08/10 16:51:20 INFO util.HadoopUtil: using default application...
(From the mailing list: https://groups.google.com/forum/#!topic/cascalog-user/Rq_O33VsDyc ) I've come across similar issues where the child JVM options specified in with-job-conf do not "stick". I experienced GC issues in a reducer...
I've been trying to install Cascalog from behind a proxy using `leiningen`, and both 1.x and 2.x-SNAPSHOT fail with variations of the following output: ``` $ lein deps Could not...
I'd love to see someone take on integration of Prismatic's Schema library with Cascalog. The ability to write schemafied operations, and have the Cascalog compiler validate schemas before submitting jobs,...
Rather than on an executed query basis.
From @cwensel: just added a commit that grants access to the co-group iterators from a Buffer: https://github.com/ConcurrentCore/cascading/commit/2e3a1a7c93a012ef15e690a6728e191d1194b127 This test shows how it works: https://github.com/ConcurrentCore/cascading/commit/2e3a1a7c93a012ef15e690a6728e191d1194b127#L31R182