spark-redshift
org.apache.hadoop.fs.s3.S3Exception ResponseCode: 404, ResponseStatus: Not Found, XML
Hi guys,
I'm using Kafka, Spark Streaming 1.6.2, and spark-redshift. It works well at first, but then it suddenly starts failing with the errors shown in the log further below.
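For context, the write in each micro-batch is roughly the following (a minimal sketch, assuming the standard spark-redshift DataFrame writer; the JDBC URL, table name, and bucket are placeholders, not my real values):

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Sketch of the failing write path: spark-redshift stages Avro files
// under `tempdir` on S3 and then issues a Redshift COPY from there.
def exportToRedshift(df: DataFrame): Unit = {
  df.write
    .format("com.databricks.spark.redshift")
    .option("url", "jdbc:redshift://host:5439/db?user=USER&password=PASS") // placeholder
    .option("dbtable", "events")                                           // placeholder
    .option("tempdir", "s3n://my-temp-bucket/temp_spark_streaming_data/temp_events") // s3n://, matching the stack trace
    .mode(SaveMode.Append)
    .save()
}
```

Here is the output from the failing batches: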
```
========= 2016-09-28 08:47:00 =========
16/09/28 08:47:04 WARN Utils$: The S3 bucket ******** does not have an object lifecycle configuration to ensure cleanup of temporary files. Consider configuring tempdir to point to a bucket with an object lifecycle policy that automatically deletes files after an expiration period. For more information, see https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
finished export to redshift
========= 2016-09-28 08:48:00 =========
16/09/28 08:48:01 WARN Utils$: The S3 bucket ******** does not have an object lifecycle configuration to ensure cleanup of temporary files. Consider configuring tempdir to point to a bucket with an object lifecycle policy that automatically deletes files after an expiration period. For more information, see https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
finished export to redshift
========= 2016-09-28 08:49:00 =========
16/09/28 08:49:01 WARN Utils$: The S3 bucket ******* does not have an object lifecycle configuration to ensure cleanup of temporary files. Consider configuring tempdir to point to a bucket with an object lifecycle policy that automatically deletes files after an expiration period. For more information, see https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
[Stage 58:> (0 + 2) / 4]
16/09/28 08:49:05 ERROR SparkHadoopMapRedUtil: Error committing the output of task: attempt_201609280849_0058_m_000000_0
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.ServiceException: Service Error Message. -- ResponseCode: 404, ResponseStatus: Not Found, XML Error Message: <Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>temp_spark_streaming_data/temp_events/49b2d38f-bf9b-4512-85b3-6f6c60629bfc/_temporary/0/_temporary/attempt_201609280849_0058_m_000000_0/part-r-00000-6eed0a45-8732-4524-a342-0cdca3ccb43b.avro</Key><RequestId>84F26D842580A8C9</RequestId><HostId>8UlMGpqB1svcQl4/QhKwnt+NsOWGOgKYJes/W8RffrM+H+tKPuHCuhCWnQ8nmzu8inRr6UWlIMU=</HostId></Error>
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:470)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.copy(Jets3tNativeFileSystemStore.java:326)
	at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at org.apache.hadoop.fs.s3native.$Proxy20.copy(Unknown Source)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.rename(NativeS3FileSystem.java:713)
	at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:435)
	at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:415)
	at org.apache.spark.mapred.SparkHadoopMapRedUtil$.performCommit$1(SparkHadoopMapRedUtil.scala:98)
	at org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:124)
	at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitTask(WriterContainer.scala:219)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.org$apache$spark$sql$execution$datasources$DefaultWriterContainer$$commitTask$1(WriterContainer.scala:278)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply$mcV$sp(WriterContainer.scala:265)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:260)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:260)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1277)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:266)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:148)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:148)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.jets3t.service.ServiceException: Service Error Message. -- ResponseCode: 404, ResponseStatus: Not Found, XML Error Message: <Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>temp_spark_streaming_data/temp_events/49b2d38f-bf9b-4512-85b3-6f6c60629bfc/_temporary/0/_temporary/attempt_201609280849_0058_m_000000_0/part-r-00000-6eed0a45-8732-4524-a342-0cdca3ccb43b.avro</Key><RequestId>84F26D842580A8C9</RequestId><HostId>8UlMGpqB1svcQl4/QhKwnt+NsOWGOgKYJes/W8RffrM+H+tKPuHCuhCWnQ8nmzu8inRr6UWlIMU=</HostId></Error>
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:407)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:277)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestPut(RestStorageService.java:1143)
	at org.jets3t.service.impl.rest.httpclient.RestStorageService.copyObjectImpl(RestStorageService.java:2117)
	at org.jets3t.service.StorageService.copyObject(StorageService.java:898)
	at org.jets3t.service.StorageService.copyObject(StorageService.java:943)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.copy(Jets3tNativeFileSystemStore.java:323)
	... 26 more
16/09/28 08:49:05 ERROR Utils: Aborting task
java.lang.RuntimeException: Failed to commit task
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.org$apache$spark$sql$execution$datasources$DefaultWriterContainer$$commitTask$1(WriterContainer.scala:283)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply$mcV$sp(WriterContainer.scala:265)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:260)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:260)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1277)
	at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:266)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:148)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:148)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
```
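As far as I understand, the lifecycle warning at the top is only about cleaning up old temp files. Attaching the policy it asks for would look something like this (a sketch using the AWS SDK for Java; the bucket name and prefix are placeholders, not my real values):

```scala
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.BucketLifecycleConfiguration

// Sketch only: expire objects under the spark-redshift tempdir prefix
// after one day, so abandoned temp files get cleaned up automatically.
val s3 = new AmazonS3Client() // uses the default credential chain
val rule = new BucketLifecycleConfiguration.Rule()
  .withId("expire-spark-redshift-temp")
  .withPrefix("temp_spark_streaming_data/") // placeholder prefix
  .withExpirationInDays(1)
  .withStatus(BucketLifecycleConfiguration.ENABLED)
s3.setBucketLifecycleConfiguration("my-temp-bucket", // placeholder bucket
  new BucketLifecycleConfiguration().withRules(rule))
```

That wouldn't explain the 404 on task commit, though.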
Anyone know why this happened?
+1
Anyone got a solution for this? Can you help me with at least a workaround?
Any updates on this issue?