ksrihari93

10 comments of ksrihari93

Hi @nsivabalan ,

```properties
# base properties
hoodie.insert.shuffle.parallelism=50
hoodie.bulkinsert.shuffle.parallelism=200
hoodie.embed.timeline.server=true
hoodie.filesystem.view.type=EMBEDDED_KV_STORE
hoodie.compact.inline=false
hoodie.bulkinsert.sort.mode=none

# cleaner properties
hoodie.cleaner.policy=KEEP_LATEST_FILE_VERSIONS
hoodie.cleaner.fileversions.retained=60
hoodie.clean.async=true

# archival
hoodie.keep.min.commits=12
hoodie.keep.max.commits=15

# datasource properties
hoodie.deltastreamer.schemaprovider.registry.url=
hoodie.datasource.write.recordkey.field=
hoodie.deltastreamer.source.kafka.topic=
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator
hoodie.datasource.write.partitionpath.field=timestamp:TIMESTAMP
hoodie.deltastreamer.kafka.source.maxEvents=600000000
hoodie.deltastreamer.keygen.timebased.timestamp.type=EPOCHMILLISECONDS
hoodie.deltastreamer.keygen.timebased.input.timezone=UTC
hoodie.deltastreamer.keygen.timebased.output.timezone=UTC
...
```
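A properties file like this is typically handed to HoodieDeltaStreamer through spark-submit. Below is a minimal sketch of one common invocation, assuming a Kafka/Avro source with a schema registry; the jar name, paths, table name, and source class are placeholders, not values from the original report:

```sh
# Sketch: running HoodieDeltaStreamer with the base properties file above.
# Jar name, paths, table name, and source class are hypothetical placeholders.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  hudi-utilities-bundle.jar \
  --props /path/to/base.properties \
  --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
  --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
  --source-ordering-field timestamp \
  --target-base-path s3://bucket/path/to/table \
  --target-table my_table \
  --table-type COPY_ON_WRITE \
  --continuous
```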

We have recovered the job by skipping a few offsets. This issue can be closed now.
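For anyone hitting the same expired-offset situation: one way to "skip a few offsets" (a sketch, not necessarily the exact steps used here) is to restart DeltaStreamer with an explicit `--checkpoint` pointing at offsets that still exist in the topic. For Kafka sources the checkpoint string takes the form `topic,partition:offset,...`; the topic name and offset values below are hypothetical:

```sh
# Sketch: overriding the committed checkpoint to jump past expired offsets.
# Topic name and per-partition offsets are hypothetical.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  hudi-utilities-bundle.jar \
  --props /path/to/base.properties \
  --target-base-path s3://bucket/path/to/table \
  --target-table my_table \
  --checkpoint "my_topic,0:1200000,1:1185000,2:1210000"
```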

[logs_latest_commits.txt](https://github.com/apache/hudi/files/8909385/logs_latest_commits.txt) [logs_latest.txt](https://github.com/apache/hudi/files/8909386/logs_latest.txt)

Hi Team, sorry for the late reply. I used only this option: `auto.offset.reset=LATEST`, but could not recover. So I wrote a few records to a temporary path and copied...

What happened was that the offsets had expired within one partition. Resetting the offset based on a timestamp (which is not supported, as noted above) also did not work out. So...
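To confirm which offsets a partition still retains before picking a restart point, Kafka's GetOffsetShell can help (a sketch; the broker address and topic are placeholders, and newer Kafka versions take `--bootstrap-server` instead of `--broker-list`):

```sh
# Print the earliest retained offset per partition (--time -2);
# use --time -1 for the latest offset instead.
kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list broker1:9092 \
  --topic my_topic \
  --time -2
```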

> may be this could be the issue. can you try adding this to spark-submit command
>
> ```
> --hoodie-conf hoodie.clustering.async.enabled=true
> ```

Hi, I have passed this...

> may be this could be the issue. can you try adding this to spark-submit command
>
> ```
> --hoodie-conf hoodie.clustering.async.enabled=true
> ```

I have tried this option, still no...

Hi @codope , I first set up clustering with the default configuration only. Since that was not working, I used these options:

```properties
hoodie.clustering.async.enabled=true
hoodie.clustering.plan.strategy.target.file.max.bytes=3000000000
hoodie.clustering.plan.strategy.small.file.limit=200000001
hoodie.clustering.async.max.commits=1
hoodie.clustering.plan.strategy.max.num.groups=10
hoodie.clustering.plan.strategy.max.bytes.per.group=9000000000
```
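As the quoted replies suggest, these properties can also be passed inline on spark-submit via repeated `--hoodie-conf` flags rather than through a properties file (a sketch; the jar name and table details are placeholders):

```sh
# Sketch: the same clustering settings supplied inline with --hoodie-conf.
# Jar name, props path, and table details are hypothetical placeholders.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  hudi-utilities-bundle.jar \
  --props /path/to/base.properties \
  --target-base-path s3://bucket/path/to/table \
  --target-table my_table \
  --hoodie-conf hoodie.clustering.async.enabled=true \
  --hoodie-conf hoodie.clustering.async.max.commits=1 \
  --hoodie-conf hoodie.clustering.plan.strategy.target.file.max.bytes=3000000000 \
  --hoodie-conf hoodie.clustering.plan.strategy.small.file.limit=200000001 \
  --hoodie-conf hoodie.clustering.plan.strategy.max.num.groups=10 \
  --hoodie-conf hoodie.clustering.plan.strategy.max.bytes.per.group=9000000000
```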

Hi Team, sorry for the late reply. Now it's working fine. We can close this issue.