Liran

Results: 46 comments by Liran

This is just the way Spark works. I don't like the proposed solution, since it involves reading the DF non-lazily in the driver (using collect). We could add this as...

I don't think it halts; it just takes a very long time, depending on your computer. Try these memory settings: ``` -Xms2048M -Xmx2048M -Xss6M -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=256M ```
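A minimal sketch of one way to apply the flags above, assuming the slow step is an sbt task (the comment does not say which tool is involved). It relies on the `SBT_OPTS` environment variable, which the standard sbt launcher script reads for JVM options. Note that `-XX:MaxPermSize` is ignored on Java 8 and later.

```shell
# Assumption: the build is launched through the standard sbt launcher script,
# which picks up JVM options from SBT_OPTS.
export SBT_OPTS="-Xms2048M -Xmx2048M -Xss6M -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=256M"

# Show what will be passed to the JVM, then run your sbt task as usual.
echo "SBT_OPTS=$SBT_OPTS"
```

The same flags could instead be placed one per line in a `.sbtopts` file at the project root, each prefixed with `-J`, if you prefer not to set them per shell session.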

You need to run ```sbt assembly``` before running the Docker build.

Hi! You could work around this. One way to write via JDBC is the [JDBC Query](https://github.com/YotpoLtd/metorikku#jdbc-query) output, so you could, for example, after writing to...

I've opened PR #414 for this.

@cyrillay would you mind testing #414 and seeing whether it fits what you need?

@RonBarabash can I close this?

If you read a single CSV and then perform partitionBy, the first part will be a single partition. You can either repartition using the hints in the link or create...

It's a work in progress, but I wanted to test Delta vs. Hudi in our environment and figured I'd add a Delta writer along the way. I'll share the results internally...

Hi! Thanks for opening the issue. I started dabbling with this here: https://github.com/YotpoLtd/metorikku/pull/310 but encountered some issues with the Avro deserialization lib we're using... Maybe I'll take another go at...