
[PPML] write encrypted JSON file will raise "Bad arguments" Exception

Open PatrickkZ opened this issue 1 year ago • 0 comments

This Exception only happens when the DataFrame's record count is less than the partition count. For example, if a DataFrame df contains only 3 rows and we configure --master 'local[n]' or call df.repartition(n) with n > 3, some partitions will be empty, and attempting to encrypt an empty partition raises the Exception.
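The empty-partition situation described above can be illustrated without Spark at all. The sketch below (hypothetical helper names, plain Scala) distributes rows round-robin across partitions, mirroring what df.repartition(n) does to a 3-row DataFrame when n > 3:

```scala
// Minimal sketch, no Spark required: distributing 3 rows round-robin
// into 5 partitions leaves 2 partitions with no rows at all, which is
// the condition that triggers the encryption failure described above.
object EmptyPartitionDemo {
  // Round-robin assignment: row at index idx goes to partition idx % n.
  def partition[A](rows: Seq[A], numPartitions: Int): Seq[Seq[A]] =
    (0 until numPartitions).map { i =>
      rows.zipWithIndex.collect { case (row, idx) if idx % numPartitions == i => row }
    }

  def main(args: Array[String]): Unit = {
    val parts = partition(Seq("r1", "r2", "r3"), 5)
    println(parts.count(_.isEmpty)) // prints 2: two partitions receive no rows
  }
}
```

Each empty partition still gets a writer task, so the encrypting compressor is invoked on a partition that never buffered any data.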

Stack trace:

java.lang.IllegalArgumentException: Bad arguments
        at javax.crypto.Cipher.doFinal(Cipher.java:2222) ~[?:1.8.0_292]
        at com.intel.analytics.bigdl.ppml.crypto.BigDLEncrypt.doFinal(BigDLEncrypt.scala:165) ~[classes/:?]
        at com.intel.analytics.bigdl.ppml.crypto.BigDLEncryptCompressor.compress(BigDLEncryptCompressor.scala:72) ~[classes/:?]
        at org.apache.hadoop.io.compress.CompressorStream.compress(CompressorStream.java:81) ~[hadoop-common-2.7.7.jar:?]
        at org.apache.hadoop.io.compress.CompressorStream.finish(CompressorStream.java:92) ~[hadoop-common-2.7.7.jar:?]
        at org.apache.hadoop.io.compress.CompressionOutputStream.close(CompressionOutputStream.java:60) ~[hadoop-common-2.7.7.jar:?]
        at org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:106) ~[hadoop-common-2.7.7.jar:?]
        at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:320) ~[?:1.8.0_292]
        at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:149) ~[?:1.8.0_292]
        at java.io.OutputStreamWriter.close(OutputStreamWriter.java:233) ~[?:1.8.0_292]
        at com.fasterxml.jackson.core.json.WriterBasedJsonGenerator.close(WriterBasedJsonGenerator.java:999) ~[jackson-core-2.11.4.jar:2.11.4]
        at org.apache.spark.sql.catalyst.json.JacksonGenerator.close(JacksonGenerator.scala:239) ~[spark-catalyst_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.sql.execution.datasources.json.JsonOutputWriter.close(JsonOutputWriter.scala:58) ~[spark-sql_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.releaseResources(FileFormatDataWriter.scala:58) ~[spark-sql_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:75) ~[spark-sql_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:280) ~[spark-sql_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1473) ~[spark-core_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:286) ~[spark-sql_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:210) ~[spark-sql_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) ~[spark-core_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.scheduler.Task.run(Task.scala:131) ~[spark-core_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497) ~[spark-core_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) [spark-core_2.12-3.1.2.jar:3.1.2]
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500) [spark-core_2.12-3.1.2.jar:3.1.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]

In BigDLEncryptCompressor.scala:72, this.lv2Buffer can be null when the partition is empty, which causes the java.lang.IllegalArgumentException: Bad arguments.
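The "Bad arguments" message comes from javax.crypto.Cipher's own argument check: passing a null input buffer to doFinal, which is what effectively happens when lv2Buffer was never filled for an empty partition, fails before any encryption is attempted. A minimal standalone sketch (the AES key and transformation are illustrative, not taken from BigDL):

```scala
import javax.crypto.Cipher
import javax.crypto.spec.SecretKeySpec

object BadArgumentsDemo {
  // Returns the exception message produced when doFinal is called with a
  // null input buffer, reproducing the first frame of the stack trace above.
  def badArgsMessage(): String = {
    val key = new SecretKeySpec(Array.fill[Byte](16)(1), "AES")
    val cipher = Cipher.getInstance("AES/ECB/PKCS5Padding")
    cipher.init(Cipher.ENCRYPT_MODE, key)
    try {
      // Null input, as for a partition whose buffer was never allocated.
      cipher.doFinal(null: Array[Byte], 0, 0)
      "no exception"
    } catch {
      case e: IllegalArgumentException => e.getMessage // "Bad arguments"
    }
  }

  def main(args: Array[String]): Unit = println(badArgsMessage())
}
```

A guard that skips (or short-circuits) doFinal when no bytes were buffered would avoid hitting this check for empty partitions.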

@qiuxin2012 please take a look

PatrickkZ, Jul 28 '22