
A Scala API for Cascading

102 scalding issues

When I run a job, I get a job name in the tracker like `[26B3D8E9AA5C4F86BECCE8387FC9EBED/F6BF86BE5BC645D6AAE865BE1851E91E] com.houzz.log.PartitionMahout/(1/2)` which just clutters the view and makes it harder to see the useful part...

This is a vague idea, but we could make a scalding macro annotation, such as:

```scala
@scalding class MyJob(args: Args) extends Job(args) { }
```

which could do some standard...

To discuss the right implementation of #1652. Original problem: 1. the original implementation was based on `tap.getPath`, but some sources return the wrong path of source files/dirs because the internal implementation of...

When I groupBy a custom object it throws an exception.

```
2017-02-16 20:02:43,650 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : cascading.CascadingException: unable to compare stream elements in position: 0
    at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:164)
    at...
```
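This `DeserializerComparator` failure typically arises when the grouping key has no usable comparator in scope. A minimal sketch, assuming a hypothetical key type (the names below are illustrative, not from the issue): putting an implicit `Ordering` on the key's companion lets sort/group steps compare keys directly.

```scala
// Hypothetical key type; names are illustrative, not from the issue.
case class UserKey(id: Long, region: String)

object UserKey {
  // An explicit total ordering over the key fields. With an implicit
  // Ordering[UserKey] in scope, key comparison uses these fields instead
  // of falling back to a serialization-based comparator.
  implicit val ordering: Ordering[UserKey] =
    Ordering.by(k => (k.id, k.region))
}

object OrderingDemo {
  def main(args: Array[String]): Unit = {
    val keys = List(UserKey(2, "us"), UserKey(1, "eu"), UserKey(1, "ap"))
    // sorted picks up UserKey.ordering implicitly
    println(keys.sorted)
  }
}
```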

All toIterator impls should consistently validate taps in order to be able to rely on the iterator completeness. Compare com.twitter.scalding.Mappable#toIterator and com.twitter.scalding.commons.source.LzoCodec#toIterator, etc.

Using batched on the map-side dramatically improves the performance of sketch-join.

I want the Hadoop shuffle sort to use my `compareTo` method. How can I do that?
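One way to make key comparison go through your own `compareTo` is to mix `Ordered[T]` into the key class; Scala derives an `Ordering` from it, which sorting and grouping steps then pick up. A sketch with a hypothetical key type:

```scala
// Hypothetical key; illustrative only. Mixing in Ordered[T] supplies a
// compare/compareTo, and Scala derives an implicit Ordering from it.
case class EventKey(ts: Long, name: String) extends Ordered[EventKey] {
  def compare(that: EventKey): Int = {
    val c = java.lang.Long.compare(ts, that.ts)
    if (c != 0) c else name.compareTo(that.name)
  }
}

object CompareDemo {
  def main(args: Array[String]): Unit = {
    val xs = List(EventKey(3, "b"), EventKey(1, "z"), EventKey(1, "a"))
    println(xs.sorted) // ordered by ts, then name
  }
}
```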

Hi to the scalding team, I am trying to produce a Parquet file with scalding, following the example available at: https://github.com/twitter/scalding/tree/develop/scalding-parquet I succeeded in compiling the example using Maven with both scala...

HadoopMode.newFlowConnector ignores (intentionally) the jobConf because we don't want creepy mutation confusing what the Config is. The problem is that Execution gives no programmatic way to change the Config inside the...

bug

A user has a case class wrapping a `Long`. The tsv had a string that could not be parsed as long, and the user got `0L` for the Long. I...
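The silently defaulted `0L` can be avoided by parsing the field through `Try` and surfacing the failure. A minimal sketch, with hypothetical names (this is not scalding's actual TSV code path):

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical wrapper like the one described in the issue.
case class UserId(value: Long)

object SafeParse {
  // Parse a TSV field into the wrapper, reporting failure instead of
  // silently defaulting to 0L.
  def parseUserId(field: String): Either[String, UserId] =
    Try(field.trim.toLong) match {
      case Success(n) => Right(UserId(n))
      case Failure(_) => Left(s"not a Long: '$field'")
    }

  def main(args: Array[String]): Unit = {
    println(parseUserId("42"))   // Right(UserId(42))
    println(parseUserId("oops")) // Left(not a Long: 'oops')
  }
}
```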