hudi issues

[SUPPORT] - Deltastreamer not shutting down properly

5

Deltastreamer running with CloudWatch metrics isn't shutting down properly. This is in non-continuous mode. DeltaSync and the Spark context say they are closing, but the JVM is not exiting, everything...

sstimmel

priority:critical

deltastreamer

[SUPPORT] Hudi creates duplicate, redundant file during clustering

4

**Summary** During clustering, Hudi creates a duplicate Parquet file with the same file group ID and identical content. One of the two files is later marked as a duplicate and deleted....

namuny

priority:critical

table-service

[SUPPORT] Hudi Delete Not working with EMR, AWS Glue & S3

5

### Describe the problem I'm using a Spark job running on EMR to insert data using hudi (0.9.0). The inserts are working as expected and it stores parquet files in...

navbalaraman

meta-sync

priority:critical

spark

[SUPPORT] Incremental and snapshot reads show different results

7

Hudi version: 0.11.1 Spark version: 3.1.1 Storage: S3 AWS Glue: 3 Function ```scala import org.apache.spark.sql.{functions => fn} def readAndShow(path: String) { val df = spark.read.format("hudi").load(path) df.select(fn.min(fn.col("updated_at")), fn.min(fn.col("_hoodie_commit_time"))) show false val...

eshu

priority:major

spark

reader-core

incremental-query

[SUPPORT] Hudi cli got empty result for command show fsview all

2

**Describe the problem you faced** The Hudi CLI returned an empty result after running the command show fsview all. ![image](https://user-images.githubusercontent.com/7007327/180346750-6a55f472-45ac-46cf-8185-3c4fc4c76434.png) The type of table t1 is COW and I am sure that the...

paul8263

priority:minor

cli

[SUPPORT] Building workload profile failing after upgrade to 0.11.0 when doing upsert operations

7

**Describe the problem you faced** After upgrading to 0.11.1, the deltastreamer is failing to write to a 6GB bucket. It is failing...

rohitmittapalli

performance

priority:major

writer-core

on-call-triaged

[SUPPORT] ALL PARQUET FILES FROM BASE PATH GOT DELETED BY CLEANER

4

**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...

Nazerra

priority:critical

writer-core

table-service

[SUPPORT] S3 throttling while loading a table written with "hoodie.metadata.enable" = true

5

**Describe the problem you faced** Our Hudi data lake is heavily partitioned by datasource, year, and month. We have 1000 datasources currently loaded into the lake, and are looking to...

noahtaite

priority:major

metadata

Hoodie Deltastreamer Job unable to fetch data from Kafka topic from starting offset available

5

**Describe the problem you faced** I'm using Hudi Deltastreamer in continuous mode with a Kafka source. Whenever a Kafka offset expires, the job fails with offset out of range...

ksrihari93

priority:critical

deltastreamer

pre-0.10.0

[SUPPORT] Hoodie Deltastreamer Job with Kafka Source fetching the same offset again and again, committing the same offset again and again

5

**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...

ksrihari93

priority:critical

deltastreamer

pre-0.10.0
