hudi
Upserts, Deletes And Incremental Processing on Big Data.
### Change Logs `SQLConf` by default isn't propagated to Executors from the Driver. This is the reason why configuration specified with `--conf` is not being respected by some components being...
### Change Logs Cleaning up Spark utilities, removing duplication (this is a preparatory change for the one stacked on top). ### Impact None ### Contributor's checklist - [ ] Read...
### Change Logs This PR updates the Spark Guide website page with Spark 3.3 support. ### Impact Only docs update. **Risk level: none** The website can be properly built and...
### Change Logs Currently, `HoodieParquetReader` does not specify the projected schema properly when reading Parquet files, which ends up failing in cases when the provided schema is not equal to the schema...
**Describe the problem you faced** I tried to use Hudi `hudi-defaults.conf` with Glue and tried to set the path of the file using Spark Config and Python Environment config and...
I use the insert operation to publish data to Hive. I insert into a table with 541 partitions, and the input data size is 153 GB. But there are 2.6 TB in the...
My job is just a wrapper around `HoodieDeltaStreamer` (yes, there are probably better ways to do this). ``` public class SparkHudiPoc { public static void main(String[] args) throws Exception {...
## What is the purpose of the pull request The repo recently upgraded to start using log4j2 instead of log4j. There are a few configuration and dependency changes that needed...
**_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - YES A clear and concise description of the problem. With missing nested fields in a struct, it...