scalding
A Scala API for Cascading
Hi, I tried the following steps from the docs but ran into some errors: ``` ./sbt update (takes 2 minutes or more) ./sbt test ``` Any idea what is...
Hi, I've been trying to install & run Scalding, but I've been running into some issues... it looks as though some of the installation directions may be outdated. I...
`forceToDisk` and `forceToDiskExecution` should be basically equivalent. However, there is a significant difference in terms of the ability to combine splits. Within a single flow, Twitter configures cascading.flowconnector.intermediateschemeclass to https://github.com/twitter/elephant-bird/blob/master/cascading2/src/main/java/com/twitter/elephantbird/cascading2/scheme/CombinedSequenceFile.java...
I want to groupBy a key and then, hopefully, get an iterator over all the values with the same key (please correct me if I'm wrong here). Then, for...
At my workplace, we work with data where we expect each element (`TypedPipe[A]`) to be unique across the whole dataset or for each key (`TypedPipe[(K, V)]`) to be unique across...
Currently, the output cannot be compressed in Scalding. The Hadoop configuration property `mapred.output.compress` is ignored, so setting it to `true`, either through the command line with `-D mapred.output.compress=true` or by overriding...
https://github.com/twitter/scalding/blob/develop/scalding-repl/src/main/scala/com/twitter/scalding/ReplImplicits.scala#L139 This refers to the ScaldingShell object; if a user makes their own shell inheriting from ScaldingShell, I believe this will break how the code is built. (I think @ Twitter...
https://github.com/twitter/scalding/blob/398545991c63019714d80ac22de6494adfd798f2/scalding-core/src/main/scala/com/twitter/scalding/macros/impl/TypeDescriptorProviderImpl.scala If users override an implicit TypeDescriptor, use that if possible.
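The behaviour asked for here can be illustrated without macros. The sketch below is not Scalding code: `Descriptor`, `LowPriorityDescriptors`, `User`, and `Demo` are hypothetical names standing in for `TypeDescriptor` and its derived instances. It only shows the general Scala pattern in which an implicit supplied by the user in local scope is picked ahead of a generic fallback in the companion object.

```scala
// Hypothetical stand-in for a type class such as TypeDescriptor; not Scalding's API.
trait Descriptor[T] { def describe: String }

trait LowPriorityDescriptors {
  // Generic fallback, playing the role of a macro-derived instance.
  implicit def derived[T]: Descriptor[T] =
    new Descriptor[T] { def describe: String = "derived" }
}

// The companion's implicit scope is consulted only when nothing in local scope applies.
object Descriptor extends LowPriorityDescriptors

final case class User(id: Long)

object Demo {
  // An instance the user supplies explicitly for their own type.
  implicit val userDescriptor: Descriptor[User] =
    new Descriptor[User] { def describe: String = "user-supplied" }

  def describe[T](implicit d: Descriptor[T]): String = d.describe

  def main(args: Array[String]): Unit = {
    println(describe[User]) // resolves to userDescriptor
    println(describe[Int])  // no user instance for Int, so the fallback is used
  }
}
```

A macro-based provider would need to perform an equivalent lookup itself before generating an instance; how that is best done inside `TypeDescriptorProviderImpl` is left open here.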
Note, inside of Job there is an implicit from `TraversableOnce` to `Fields`: https://github.com/twitter/scalding/blob/develop/scalding-core/src/main/scala/com/twitter/scalding/FieldConversions.scala#L185 Note, `.isDefined` is a method on `Option` in Scala, but not on `TraversableOnce/List/Seq`. It is an easy...