
[Bug][Spark]: `overwrite` mode does not work with GarDataSource

Open acezen opened this issue 7 months ago • 0 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

When GraphAr writes a DataFrame in overwrite mode through GarDataSource, for example:

dataFrame.write
  .mode("overwrite")
  .option("header", "true")
  .option("fileFormat", fileType)
  .option(
    GeneralParams.offsetStartChunkIndexKey,
    offsetStartChunkIndex.get
  )
  .format("com.alibaba.graphar.datasources.GarDataSource")
  .save(outputPrefix)

Spark raises the following error:

org.apache.spark.sql.AnalysisException: Table gar file:/tmp/test1/edge/person_knows_person/ordered_by_source/offset does not support truncate in batch mode.;
OverwriteByExpression RelationV2[_graphArOffset#339] gar file:/tmp/test1/edge/person_knows_person/ordered_by_source/offset, true, [header=true, fileFormat=csv, _graphar_offset_start_chunk_index=0, path=/tmp/test1/edge/person_knows_person/ordered_by_source/offset/], true
+- LogicalRDD [_graphArOffset#216], false
  at org.apache.spark.sql.errors.QueryCompilationErrors$.unsupportedTableOperationError(QueryCompilationErrors.scala:801)
  at org.apache.spark.sql.errors.QueryCompilationErrors$.unsupportedTruncateInBatchModeError(QueryCompilationErrors.scala:821)
  at org.apache.spark.sql.execution.datasources.v2.TableCapabilityCheck$.$anonfun$apply$1(TableCapabilityCheck.scala:61)
  at org.apache.spark.sql.execution.datasources.v2.TableCapabilityCheck$.$anonfun$apply$1$adapted(TableCapabilityCheck.scala:40)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:253)
  at org.apache.spark.sql.execution.datasources.v2.TableCapabilityCheck$.apply(TableCapabilityCheck.scala:40)
  at org.apache.spark.sql.execution.datasources.v2.TableCapabilityCheck$.apply(TableCapabilityCheck.scala:32)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$39(CheckAnalysis.scala:560)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$39$adapted(CheckAnalysis.scala:560)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
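
For context, the `TableCapabilityCheck` frames in the trace suggest that Spark rejects the plan because the table behind GarDataSource does not advertise a truncate-capable write. A likely cause, assuming the standard DataSource V2 rules apply here, is that a full overwrite in batch mode requires the table to report `TRUNCATE` (or `OVERWRITE_BY_FILTER`) and its write builder to implement `SupportsTruncate`. The sketch below shows that shape against the Spark 3.x connector interfaces. `TruncatableTable` is a hypothetical name, not GraphAr's actual class, and the real write logic is omitted.

```scala
import java.util

import org.apache.spark.sql.connector.catalog.{SupportsWrite, Table, TableCapability}
import org.apache.spark.sql.connector.write.{LogicalWriteInfo, SupportsTruncate, WriteBuilder}
import org.apache.spark.sql.types.StructType

// Hypothetical table used only to illustrate the capability declaration.
class TruncatableTable(tableSchema: StructType) extends Table with SupportsWrite {

  override def name(): String = "truncatable-example"

  override def schema(): StructType = tableSchema

  // Advertising TRUNCATE alongside BATCH_WRITE is what the capability
  // check looks for when mode("overwrite") replaces the whole table.
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_WRITE, TableCapability.TRUNCATE)

  override def newWriteBuilder(info: LogicalWriteInfo): WriteBuilder =
    new SupportsTruncate {
      // Called by Spark when existing data must be replaced. A real
      // builder would record this and clear the target before writing.
      override def truncate(): WriteBuilder = this
      // Building the actual batch write is intentionally left out, so
      // this sketch passes analysis but cannot write data as written.
    }
}
```

Whether GarDataSource should declare `TRUNCATE` on its table, or handle overwrite some other way, is a design decision for the fix; the sketch only shows what the analyzer appears to expect.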

Expected Behavior

Writing with overwrite mode should work as expected.

Minimal Reproducible Example

NA

Environment

  • Operating system: macOS M2
  • GraphAr version: 0.11.0

Link to GraphAr Logs

No response

Further Information

No response

acezen, Jan 19 '24 07:01