Rohit Mittapalli
Unable to reproduce on a brand-new table: the same script in the same environment causes no issues against a new table. Not sure if this is related to upgrading a previously...
Also wanted to note that in the logs I am seeing `22/08/09 05:13:37 INFO DeltaSync: Seeing new schema. Source :{ `. This has the correct schema for both the source and...
```
stat_data_frame = (
    session.read.format("hudi")
    .option("hoodie.datasource.write.reconcile.schema", "true")
    .load("s3a://example-prod-output/stats/querying/0e6a3669-1f94-4ec4-93e8-6b5b25053b7e-0_0-70-1046_20220809052311671.parquet")
)
len(stat_data_frame.columns)  # returns 616
```
however
```
stat_data_frame = (
    session.read.format("hudi")
    .option("hoodie.datasource.write.reconcile.schema", "true")
    .load("s3a://example-prod-output/stats/querying/*")
)
len(stat_data_frame.columns)  # returns 551
```
This issue was resolved by removing the asterisk and loading the table path instead:
```
stat_data_frame = (
    session.read.format("hudi")
    .option("hoodie.datasource.write.reconcile.schema", "true")
    .load("s3a://example-prod-output/stats/querying")
)
```
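To pin down exactly which columns a given read drops, one minimal diagnostic (plain Python; the column names below are hypothetical stand-ins for the real 616- and 551-column lists) is a set difference over the two `columns` lists:

```python
# Hypothetical column lists; in practice each comes from
# stat_data_frame.columns under the corresponding load path.
cols_single_file = ["uuid", "stat_a", "stat_b", "stat_c"]  # single-file read
cols_glob_read = ["uuid", "stat_a", "stat_b"]              # glob ("/*") read

# Columns visible in the one Parquet file but missing from the glob read.
missing = sorted(set(cols_single_file) - set(cols_glob_read))
print(missing)  # -> ['stat_c']
```

Printing the difference makes it easy to spot whether the dropped columns are, say, all recent schema additions.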
Running into an issue now with `.option("hoodie.metadata.enable", "true")`. When doing so I receive the following error alongside a FileNotFoundException: `It is possible the underlying files have been updated. You can...
Option 2 worked for me! Set hoodie.metadata.enable to false in Deltastreamer and wait for a few commits so that the metadata table is deleted completely (no .hoodie/metadata folder), and then re-enable...
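The toggle above, sketched as Deltastreamer `--hoodie-conf` flags (the property name `hoodie.metadata.enable` is Hudi's metadata-table switch; verify defaults against your version):

```
# Step 1: turn the metadata table off and let several commits run,
# until the .hoodie/metadata folder is gone.
--hoodie-conf hoodie.metadata.enable=false

# Step 2: once it is fully deleted, re-enable so the metadata table
# is rebuilt from scratch on the next commits.
--hoodie-conf hoodie.metadata.enable=true
```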
When trying to upgrade to 0.11.1 I receive the following error:
```
Caused by: java.io.InvalidClassException: org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$Config; local class incompatible: stream classdesc serialVersionUID = 8242394756111306873, local class serialVersionUID = -7585117557652348753
```
I use the Spark and Hudi jars provided here:
```
curl https://repo1.maven.org/maven2/org/apache/hudi/hudi-spark3.2-bundle_2.12/0.11.0/hudi-spark3.2-bundle_2.12-0.11.0.jar --output hudi-spark3-bundle.jar && \
curl https://repo1.maven.org/maven2/org/apache/hudi/hudi-utilities-bundle_2.12/0.11.0/hudi-utilities-bundle_2.12-0.11.0.jar --output hudi-utilities-bundle.jar && \
```
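An InvalidClassException with mismatched serialVersionUIDs typically means two JVMs (e.g. driver and executors) are loading different compiled versions of `HoodieDeltaStreamer$Config` — note the downloads above are 0.11.0 bundles while the upgrade target is 0.11.1. A sketch of pinning both bundles to one version through a shared variable (the URLs follow the standard Maven Central layout; verify the artifact names for your Spark/Scala combination):

```shell
#!/bin/sh
# Pin both Hudi bundles to a single version so every JVM in the job
# deserializes the same class definitions. 0.11.1 here is the assumed target.
HUDI_VERSION=0.11.1
BASE=https://repo1.maven.org/maven2/org/apache/hudi

SPARK_BUNDLE="$BASE/hudi-spark3.2-bundle_2.12/$HUDI_VERSION/hudi-spark3.2-bundle_2.12-$HUDI_VERSION.jar"
UTILS_BUNDLE="$BASE/hudi-utilities-bundle_2.12/$HUDI_VERSION/hudi-utilities-bundle_2.12-$HUDI_VERSION.jar"

# Downloads commented out in this sketch; uncomment to fetch.
# curl "$SPARK_BUNDLE" --output hudi-spark3-bundle.jar
# curl "$UTILS_BUNDLE" --output hudi-utilities-bundle.jar
echo "$SPARK_BUNDLE"
echo "$UTILS_BUNDLE"
```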
Was able to successfully run the job by:
1. Downgrading from Spark 3.2.1 to 3.1.2
2. Using Hadoop version 3.2.0
3. Using the hudi-utilities bundle exclusively in the Deltastreamer
4. Exclusively...
Since we've switched to BULK_INSERT / INSERT we haven't seen this issue; the support ticket can be closed.
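For reference, the operation switch mentioned above corresponds to the following write config (property and flag names from Hudi's write configs and the Deltastreamer CLI; verify the exact spelling for your version):

```
# Choose the write operation (UPSERT is Hudi's default);
# BULK_INSERT / INSERT avoid the upsert path where we hit the issue.
--hoodie-conf hoodie.datasource.write.operation=bulk_insert
# or, via Deltastreamer's operation flag:
--op BULK_INSERT
```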