java.lang.IllegalStateException: Path hdfs://cluster/user/app/user/test.tfrecords already exists. SaveMode: ErrorIfExists.
Hi,
I am running my script with spark-connector_2.11-1.10.0.jar. It writes data into HDFS using dataframe.write.format("tfrecords").save("hdfs://cluster/user/app/user/test.tfrecords ").
I run it as: spark-submit --jars jars/spark-connector_2.11-1.10.0.jar
I got the following exception:
py4j.protocol.Py4JJavaError: An error occurred while calling o68.save.
: java.lang.IllegalStateException: Path hdfs://cluster/user/app/user/test.tfrecords already exists. SaveMode: ErrorIfExists.
at org.tensorflow.spark.datasources.tfrecords.DefaultSource.saveDistributed(DefaultSource.scala:107)
at org.tensorflow.spark.datasources.tfrecords.DefaultSource.createRelation(DefaultSource.scala:71)
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:469)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)
Could you help me check what the problem is?
Note that the path hdfs://cluster/user/app/user/test.tfrecords does not exist in my HDFS.