hudi issues

Hudi Write Performance

5

Concerned about performance. How long should the following mocked-up sample take to write to S3? There are 1,369,765 records and 308 columns. It is taking ~10.5 min running in a Docker container...

p-powell

performance

priority:critical

writer-core

pre-0.10.0

[SUPPORT] Hope to maintain a stable version

4

The current development route is that the next release will add some new functions and fix bugs in the old branch. However, the newly added functions will introduce new bugs....

todd5167

priority:minor

feature-enquiry

[SUPPORT] hudi-examples-dbt not running with spark thrift server

4

**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...

sambhav13

priority:minor

engine-interoperability

Exception file does not exist .hoodie/.aux/view_storage_conf.properties

2

```
2022-07-18 16:49:53
org.apache.hudi.exception.HoodieIOException: Could not load filesystem view storage properties from hdfs://XXXXXX/user/tdw/warehouse/csig_billing_rt_ods.db/ods_dev_flow_t_operation_flow_ri/.hoodie/.aux/view_storage_conf.properties
    at org.apache.hudi.util.ViewStorageProperties.loadFromProperties(ViewStorageProperties.java:78)
    at org.apache.hudi.util.StreamerUtil.getHoodieClientConfig(StreamerUtil.java:213)
    at org.apache.hudi.util.StreamerUtil.getHoodieClientConfig(StreamerUtil.java:152)
    at org.apache.hudi.util.StreamerUtil.createWriteClient(StreamerUtil.java:376)
    at org.apache.hudi.util.StreamerUtil.createWriteClient(StreamerUtil.java:360)
    at org.apache.hudi.sink.compact.CompactFunction.open(CompactFunction.java:81)
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
    at...
```

XuQianJin-Stars

priority:minor

flink

table-service

Exception org.apache.hudi.exception.HoodieIOException: Could not read commit details

2

```
2022-07-19 05:44:23
org.apache.hudi.exception.HoodieIOException: Could not read commit details from hdfs://XXXXXX/.hoodie/20220719053423274.deltacommit
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:763)
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.getInstantDetails(HoodieActiveTimeline.java:264)
    at org.apache.hudi.common.table.timeline.HoodieDefaultTimeline.getInstantDetails(HoodieDefaultTimeline.java:372)
    at org.apache.hudi.hadoop.utils.HoodieInputFormatUtils.getCommitMetadata(HoodieInputFormatUtils.java:511)
    at org.apache.hudi.sink.partitioner.profile.WriteProfiles.getCommitMetadata(WriteProfiles.java:194)
    at org.apache.hudi.source.IncrementalInputSplits.lambda$inputSplits$71(IncrementalInputSplits.java:183)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)...
```

XuQianJin-Stars

priority:major

flink

table-service

[SUPPORT] DISTRIBUTE BY is not supported(line 59:undefined, pos 0) when using hudi-0.11.1 & spark-3.2.1

1

**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...

jiezi2026

priority:minor

feature-enquiry

spark-sql

[SUPPORT] facing an issue on querying Data in Hudi version 0.10.1 using AWS glue

4

Hello guys. I am facing an issue querying data in Hudi version 0.10.1 using AWS Glue. It works fine with 100 partitions in Dev but it got memory issues...

svaddoriya

aws-support

priority:critical

spark

[SUPPORT] - Hudi Read on a MOR table is failing with ArrayIndexOutOfBound exception

2

**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...

soma1712

priority:critical

spark

[SUPPORT] HiveSyncTool: missing partitions

5

**Describe the problem you faced** We have some IoT data tables with a few thousand partitions, typically `deviceId/year/month/day`. We do not sync to Hive every commit, but at regular...

matthiasdg

meta-sync

priority:critical

[SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

5

We upgraded our Hudi spark-submit jobs from EMR 5.33 to EMR 6.5, which has Spark 3.x, and then started running into the errors below with date and timestamp. Please...

lavakerreddy

schema-and-data-types

priority:minor

spark
