cascalog

Data processing on Hadoop without the hassle.

Results 40 cascalog issues

`with-job-conf`'s bindings won't work with threads, for example inside the checkpointing macro: dynamic bindings are thread-local, so a newly started thread doesn't see them. Fix inside checkpoint by using futures instead of threads.
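A minimal sketch of the underlying Clojure behavior, not of Cascalog's internals: `*conf*` is a hypothetical dynamic var standing in for whatever `with-job-conf` binds, and the config key is only an illustrative value. A raw `Thread` started inside `binding` sees the root value, while `future` conveys the caller's bindings to the worker thread.

```clojure
;; Hypothetical dynamic var standing in for a job-conf binding;
;; this is an assumption for illustration, not Cascalog's actual var.
(def ^:dynamic *conf* {})

(defn conf-seen-by-thread
  "Reads *conf* from a raw java.lang.Thread started inside `binding`."
  []
  (let [seen (promise)]
    (binding [*conf* {"mapred.reduce.tasks" "4"}]
      ;; The new thread does not inherit the binding frame,
      ;; so it reads the root value of *conf*.
      (doto (Thread. (fn [] (deliver seen *conf*)))
        (.start)
        (.join)))
    @seen))

(defn conf-seen-by-future
  "Reads *conf* from a future created inside `binding`."
  []
  (binding [*conf* {"mapred.reduce.tasks" "4"}]
    ;; `future` captures the current bindings and re-establishes
    ;; them on the thread that runs the body.
    @(future *conf*)))
```

Under these assumptions, `(conf-seen-by-thread)` returns the root value `{}` and `(conf-seen-by-future)` returns the bound map. Wrapping the thread's function in `bound-fn` would be another way to carry the bindings across, if plain threads have to stay.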

Bug

e.g., if c/count is done twice in one query (which can happen especially when predicate macros are involved)

Improvement

e.g., Bixo's fetch pipe. Need to figure out how to parameterize things so it's clear how Cascalog connects in and connects out.

Feature

I am using JCascalog to create a query to read and process data from an HBase table. On execution, two mappers are getting created for all types of queries and...

Probably against the matrix of Clojure and Hadoop version combinations, given how easy this is with Travis. Here's some more information: http://about.travis-ci.org/docs/user/build-configuration/ I believe that Cascading's going to start testing...

Improvement
newbies

The new serfn for Cascalog 2.0 works great when the same var definitions exist on the tasks as the machine submitting the job. This is the case for normal ETL...

Feature

e.g. ``` (> ?pivot)) ``` `:>>` into a var will capture the output into a nested tuple (just a seq of fields) Unclear how to handle nested serialization. Perhaps Cascading...

Feature

The logs have gotten a bit muddled of late.

Improvement

Cascalog's errors are pretty bad and confuse new users. A non-exhaustive list of bad spots includes: 1. Trying to output to a local tap under Hadoop 2. Trying to write...

Improvement

My notes from another ticket: I wanted to ask what you thought of allowing larger lazy sequences to be used with union and combine. I've got some code that you...

Improvement