Spark History Server unable to read rolling logs from COS using stocator
When Spark History Server (SHS) is configured to use stocator to read rolling event log files, it fails. The configurations used are listed below (SHS reads its configuration from spark-defaults.conf).
Non-Working Configuration
spark.hadoop.fs.cos.hbdevshivcos.endpoint https://s3.us-south.cloud-object-storage.appdomain.cloud
spark.hadoop.fs.cos.hbdevshivcos.access.key <hmac-access-key>
spark.hadoop.fs.cos.hbdevshivcos.secret.key <hmac-secret-key>
spark.hadoop.fs.stocator.cos.impl com.ibm.stocator.fs.cos.COSAPIClient
spark.hadoop.fs.cos.impl com.ibm.stocator.fs.ObjectStoreFileSystem
spark.hadoop.fs.stocator.scheme.list cos
spark.hadoop.fs.stocator.cos.scheme cos
spark.history.fs.logDirectory cos://hbdev-shiv.hbdevshivcos/spark-events
spark.eventLog.dir cos://hbdev-shiv.hbdevshivcos/spark-events
Working Configuration (started with $ start-history-server --properties-file shs.properties; content of shs.properties below)
spark.hadoop.fs.s3a.endpoint=https://s3.us-south.cloud-object-storage.appdomain.cloud
spark.hadoop.fs.s3a.access.key=<testaccesskey>
spark.hadoop.fs.s3a.secret.key=<testsecretkey>
spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
spark.hadoop.fs.s3a.scheme.list=s3a
spark.hadoop.fs.s3a.scheme=s3a
spark.history.fs.logDirectory=s3a://hbdev-shiv/spark-events
spark.eventLog.dir=s3a://hbdev-shiv/spark-events
The difference between the s3a and COS (stocator) implementations when reading rolling log files needs to be investigated and fixed. More details to follow...
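
As a starting point for that investigation, it may help to confirm that the objects in the bucket have the layout SHS expects for rolling logs. The sketch below is only a diagnostic aid, not part of Spark or stocator. It assumes the naming convention of Spark's rolling event log writer (an eventlog_v2_<appId> directory holding an appstatus_<appId> marker and events_<index>_<appId> files). The function name check_rolling_layout and the sample keys are hypothetical, and the key list would have to come from whatever bucket listing tool is at hand.

```python
from collections import defaultdict

# Assumed naming convention of Spark's rolling event log writer (Spark 3.0+):
#   <logDir>/eventlog_v2_<appId>/appstatus_<appId>[.inprogress]
#   <logDir>/eventlog_v2_<appId>/events_<index>_<appId>[.<codec>]
ROLLING_DIR_PREFIX = "eventlog_v2_"


def check_rolling_layout(keys, log_dir="spark-events"):
    """Group object keys by rolling-log directory and report what each one contains.

    `keys` is a plain list of object keys (hypothetical input, e.g. copied
    from a bucket listing). Keys outside `log_dir` and single-file event
    logs are ignored.
    """
    report = defaultdict(lambda: {"appstatus": False, "event_files": []})
    prefix = log_dir.rstrip("/") + "/"
    for key in keys:
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].split("/")
        # A rolling log entry is exactly <eventlog_v2_dir>/<file>.
        if len(parts) != 2 or not parts[0].startswith(ROLLING_DIR_PREFIX):
            continue
        app_dir, name = parts
        if name.startswith("appstatus_"):
            report[app_dir]["appstatus"] = True
        elif name.startswith("events_"):
            report[app_dir]["event_files"].append(name)
    return dict(report)


if __name__ == "__main__":
    sample_keys = [
        "spark-events/eventlog_v2_app-1/appstatus_app-1",
        "spark-events/eventlog_v2_app-1/events_1_app-1",
        "spark-events/eventlog_v2_app-1/events_2_app-1",
        "spark-events/app-2",  # single-file (non-rolling) log, ignored
    ]
    for app_dir, info in check_rolling_layout(sample_keys).items():
        print(app_dir, info)
```

If the layout is complete in the bucket but SHS still shows nothing over cos://, the problem is more likely in how the directory listing or file status is returned through stocator than in how the logs were written.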