
CDC for multiple Tables

Open rubenssoto opened this issue 4 years ago • 1 comments

Hello 👍

I read your article about CDC and metorikku, great article. I have a case where 200 tables arrive in Parquet format in my data lake. Could metorikku process more than one table in the same Spark context in parallel, for example with threads?

thank you

rubenssoto avatar Nov 07 '20 02:11 rubenssoto

Hi! Thanks for opening the issue. I started dabbling with this here: https://github.com/YotpoLtd/metorikku/pull/310

But I encountered some issues with the Avro deserialization lib we're using... Maybe I'll take another go at this soon. Just FYI, we recently started moving away from Hudi for CDC, as it became too expensive and complex. We now use https://www.upsolver.com/ and couldn't be happier :)

lyogev avatar Nov 09 '20 09:11 lyogev
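
For anyone landing here with the same question: Spark itself accepts jobs submitted from several threads within one SparkContext, so the general pattern asked about can be sketched outside of metorikku. The snippet below is only a rough illustration of the threading side, not metorikku's behavior or the approach in the linked PR. `process_table` and the `table_N` names are hypothetical stand-ins; in a real job that function would read one table's parquet files and write the result using the shared Spark session.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_table(table_name: str) -> str:
    # Hypothetical stand-in for the per-table work. In a real job this
    # would use the shared Spark session to read and write one table.
    return f"processed {table_name}"


def process_all(tables, max_workers=8):
    """Run process_table for every table on a bounded thread pool."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its table so failures can be attributed.
        futures = {pool.submit(process_table, t): t for t in tables}
        for fut in as_completed(futures):
            table = futures[fut]
            try:
                results[table] = fut.result()
            except Exception as exc:
                # One failing table should not abort the other tables.
                results[table] = f"failed: {exc}"
    return results


# Assumed table names, only for illustration.
tables = [f"table_{i}" for i in range(200)]
results = process_all(tables)
```

Keeping `max_workers` well below the number of tables bounds how many jobs compete for the cluster at once; the right value depends on the cluster and is an assumption here.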