hudi
Upserts, Deletes And Incremental Processing on Big Data.
### Change Logs As above. The author of PRs should put up docs updates as part of their contribution. The author should create a Jira for the docs and link...
**Describe the problem you faced** Error found while reading data written using Hudi in an S3 prefix. We are writing data to...
**Describe the problem you faced** A DeltaStreamer job that previously ran fine runs into memory issues after adding a SqlQueryBasedTransformer that only SELECTs one column. `"--transformer-class", "org.apache.hudi.utilities.transform.SqlQueryBasedTransformer", "--hoodie-conf", "hoodie.deltastreamer.transformer.sql=SELECT a.ATTRIBUTES...
**Describe the problem you faced** When I tried to query the _rt table using `select count(*) from table_rt` through Hive or Spark SQL, an exception was thrown saying AWSDmsAvroPayload not found....
[SUPPORT] ClassNotFoundException org.apache.hudi.org.apache.avro.LogicalTypes$LocalTimestampMillis
**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at [email protected]. - If you...
### Change Logs After we landed https://github.com/apache/hudi/pull/6472, hive-sync in docker demo is broken. It fails w/ below stacktrace. ``` 2022-09-20 14:24:39,758 INFO [main] table.TableSchemaResolver (TableSchemaResolver.java:readSchemaFromParquetBaseFile(439)) - Reading schema from /user/hive/warehouse/stock_ticks_cow/2018/08/31/b4a7076c-30e6-4320-bb04-be47246b6646-0_0-29-29_20220920142351042.parquet...
### Change Logs The existing CLI tool to repair deprecated partitions assumed the partition type was a string. It also did not delete the old physical partition. Fixing those in...
### issue description When a non-partitioned Hudi table is created in Spark, it stores spark.sql.sources.schema.partCol.0 with an empty value in the Hive Metastore. This is unexpected behavior; it should not store spark.sql.sources.schema.partCol.0 in the Hive Metastore...
## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpose of the pull request...