scalding

A Scala API for Cascading

Results 102 scalding issues

This moves all the parquet schemes into sub-projects separate from the parquet sources. This is mainly for an easier upgrade to cascading3, and perhaps to future versions. (As things stand now, we...

Needs some more work before merging: cleaning it up, more benchmarks, and seeing what does/doesn't need to be in lui. TBD are real-world tests against jobs with manual...

This is a single merge consisting of the piecemeal changes listed in https://github.com/twitter/scalding/issues/1465. For testing, I've published `0.16.1-cascading3-RC2` to Sonatype. @johnynek @ianoc it would be great if this can be integrated/tested...

(as being discussed on scalding-dev) In the things I've left out for now (in addition to the list in the e-mail): scalding-hadoop-test (platform-specific tests on a real minicluster). Perhaps it'd...

First extraction. Right now I don't think we need the macros, and we can probably drop them again. Naming/code organization and how this should all get imported are open to opinions. We...

Generators of TypedPipe/Execution, inspired by @erik-stripe's talk: http://plastic-idolatry.com/erik/oslo2019.pdf

- Problem: the projection schema is created by [ThriftSchemaConvertVisitor](https://github.com/apache/parquet-mr/blob/master/parquet-thrift/src/main/java/org/apache/parquet/thrift/ThriftSchemaConvertVisitor.java#L180), which suffixes fields with `_tuple`: https://github.com/twitter/scalding/blob/59d932731f7396eaaf6024624d1ce6660534ca77/scalding-parquet-scrooge/src/main/java/com/twitter/scalding/parquet/scrooge/ScroogeReadSupport.java#L98-L101
- When reading a parquet file with the standard 3-level format, or even the legacy 2-level `array` format, projection...

build.sbt includes com.hadoop.gplcompression, which is GPLv3; therefore this project should be too, to comply with the terms. What is this PR for? Change the license to be compatible with the license of...

I found some issues when I was trying to use the macro to generate ParquetReadSupport; here is the REPL session. Could there be some bugs in the macro? The ParquetWriteSupport works through....