hudi issues

[MINOR] Update PR template with documentation update

1

### Change Logs As above. The author of PRs should put up docs updates as part of their contribution. The author should create a Jira for the docs and link...

yihua

How to read existing Hudi-written data from `S3` using the `AWS Glue DynamicFrame` class. Fails with the error below: An error occurred while calling o84.getDynamicFrame. s3://xxxx/.hoodie/202212312312.commit is not a Parquet file. expected magic number at tail

5

**Describe the problem you faced** Error found while reading data written using Hudi in an S3 prefix. We are writing data to...

gtwuser

aws-support

priority:minor

[SUPPORT] SqlQueryBasedTransformer causes memory issues

**Describe the problem you faced** In a DeltaStreamer job that previously ran fine, adding a SqlQueryBasedTransformer that only SELECTs 1 column causes memory issues. `"--transformer-class", "org.apache.hudi.utilities.transform.SqlQueryBasedTransformer", "--hoodie-conf", "hoodie.deltastreamer.transformer.sql=SELECT a.ATTRIBUTES...

tzhang-fetch

[SUPPORT] AWSDmsAvroPayload not found querying _rt table MoR

2

**Describe the problem you faced** When I try to query the _rt table using `select count(*) from table_rt` through Hive or Spark SQL, an exception is thrown saying AWSDmsAvroPayload not found....

Xiaohan-Shen

aws-support

priority:critical

reader-core

[SUPPORT] ClassNotFoundException org.apache.hudi.org.apache.avro.LogicalTypes$LocalTimestampMillis

2

**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...

eric9204

dependencies

priority:critical

flink-sql

[HUDI-4885] Adding org.apache.avro to hudi-hive-sync bundle

1

### Change Logs After we landed https://github.com/apache/hudi/pull/6472, hive-sync in the docker demo is broken. It fails with the stack trace below. `2022-09-20 14:24:39,758 INFO [main] table.TableSchemaResolver (TableSchemaResolver.java:readSchemaFromParquetBaseFile(439)) - Reading schema from /user/hive/warehouse/stock_ticks_cow/2018/08/31/b4a7076c-30e6-4320-bb04-be47246b6646-0_0-29-29_20220920142351042.parquet`...

nsivabalan

priority:blocker

[HUDI-4848] Fixing repair deprecated partition tool

1

### Change Logs The existing CLI tool to repair deprecated partitions assumed the partition type was a string. Also, it did not delete the physical old partition. Fixing those in...

nsivabalan

priority:blocker

[HUDI-4237] should not sync partition parameters when create non-partition table in spark

13

### issue description Creating a non-partitioned Hudi table in Spark stores spark.sql.sources.schema.partCol.0 with an empty value in the Hive Metastore. This is unexpected behavior; it should not store spark.sql.sources.schema.partCol.0 in HiveMetastore...

dujl

priority:blocker

spark

catalog

[MINOR]: Optimize the judgment logic of `SparkDataSourceOptions Key`

1

## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpose of the pull request...

gnailJC

priority:blocker

spark-sql

[HUDI-4526] Improve spillableMapBasePath disk directory is full

4

## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpose of the pull request...

XuQianJin-Stars

priority:blocker

writer-core

reader-core
