Sagar Lakshmipathy
Refer: https://github.com/apache/hudi/pull/10202 ### Change Logs Update SQL Queries compatibility matrix. Redshift Spectrum supports Hudi MOR RO tables but doesn't support Snapshot and Incremental queries. StarRocks supports MOR RO and Snapshot...
### Change Logs Fixed all code examples and refs for deprecated configs in the docs in current and 0.14.0 ### Impact Doc change ### Risk level (write none, low medium...
### Backend VL (Velox) ### Bug description [Expected behavior] Faster query runs compared to OSS Spark [actual behavior] OSS Spark runs in half the time taken by Gluten+Velox Spark. ###...
## What is the purpose of the pull request - Fixes [392](https://github.com/apache/incubator-xtable/issues/392) ## Brief change log - Fixed spark, delta, iceberg versions to follow the current pom.xml ## Verify this...
Steps to reproduce: Note: This issue only happens when the `hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` is added to the classpath, and it's important for BigQuery integration as Iceberg tables. ``` spark-shell \...
Code block:
```
val icebergSourceClientProvider = new IcebergSourceClientProvider()
icebergSourceClientProvider.init(spark.sparkContext.hadoopConfiguration, Collections.emptyMap())
val icebergSourcePerTableConfig = PerTableConfigImpl.builder()
  .tableName(hudiTableName)
  .namespace(namespaceArray)
  .targetTableFormats(Arrays.asList(TableFormat.DELTA))
  .tableBasePath(hudiBasePath)
  .icebergCatalogConfig(icebergCatalogConfig)
  .syncMode(SyncMode.INCREMENTAL)
  .build()
oneTableClient.sync(icebergSourcePerTableConfig, icebergSourceClientProvider)
```
Error:
```
java.lang.NoSuchMethodError: org.apache.spark.sql.delta.actions.AddFile.<init>(Ljava/lang/String;Lscala/collection/immutable/Map;JJZLjava/lang/String;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/delta/actions/DeletionVectorDescriptor;)V
org.apache.xtable.delta.DeltaDataFileUpdatesExtractor.createAddFileAction(DeltaDataFileUpdatesExtractor.java:118)
org.apache.xtable.delta.DeltaDataFileUpdatesExtractor.lambda$applyDiff$3(DeltaDataFileUpdatesExtractor.java:99)
...
```
Data engineers rely heavily on Python to build data pipelines. It's important to support calling the Java objects from Python (through Py4J or similar tools/libraries). Right...
To address [256](https://github.com/onetable-io/onetable/issues/256) and similar environment-setup issues, a Docker playground will help.
According to https://github.com/onetable-io/onetable/issues/262, the docs need to clarify: 1. DB names appropriately, i.e. instead of `onetable_synced_db` call it `onetable_synced_delta_db` (same for Hudi and Iceberg as well) 2. Command to create the...
## *Important Read* https://github.com/onetable-io/onetable/issues/257 ## What is the purpose of the pull request Adds a Docker playground with Spark and copies the bundled jar. This improvement lets users run...