How to configure the filesystem implementation in spark configuration

Open nikosheng opened this issue 1 year ago • 2 comments

[Issue] Unable to read files in OCI Object Storage from Databricks; the read fails with this error:

org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "oci"

[Details] I have configured the oci-hdfs-connector on a Databricks cluster and set the auth properties in the Spark config:

fs.oci.client.custom.clientfs.impl com.oracle.bmc.hdfs.BmcFilesystem
fs.oci.client.auth.tenantId ocid1.tenancy.oc1..aaaaaxxxx
fs.oci.client.auth.userId ocid1.user.oc1..aaaaaaxxxx
fs.oci.client.auth.fingerprint a8:5b:41:70:63:31:b2:8f:5xxx
spark.master local[*, 4]
spark.databricks.cluster.profile singleNode
fs.oci.client.auth.bucketnamespace OCIxxx
fs.oci.client.hostname https://objectstorage.us-phoenix-1.oraclecloud.com
fs.oci.client.auth.pemfilepath /dbfs/FileStore/oci/oci_private.pem

I would love to know how to replace or override the Spark config so that it includes BmcFilesystem and Databricks recognizes the oci scheme in a notebook.

nikosheng avatar Feb 26 '24 04:02 nikosheng

Did you put oci-hdfs-connector-3.3.x.x.xx.jar or oci-hdfs-full-3.3.x.x.xx.jar under your $SPARK_HOME/jars directory?

yanhaizhongyu avatar Mar 13 '24 16:03 yanhaizhongyu

@yanhaizhongyu Yes, and I have fixed the issue. It was a configuration problem in the Databricks Spark config. Thanks for your help.

nikosheng avatar Mar 15 '24 02:03 nikosheng
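
The thread does not record the final working configuration, so what follows is only a sketch of one likely fix, not the author's confirmed solution. It rests on two assumptions: that Hadoop resolves a scheme through the `fs.<scheme>.impl` key (so `fs.oci.impl` here, rather than the `fs.oci.client.custom.clientfs.impl` key in the original post), and that Spark only forwards a property to the Hadoop configuration when it carries the `spark.hadoop.` prefix. The helper names `to_spark_conf` and `render_cluster_config` are hypothetical; the property values are the redacted ones from the post.

```python
# Hadoop-level properties. Values are the redacted ones from the original post.
HADOOP_PROPS = {
    # Assumption: standard Hadoop fs.<scheme>.impl lookup for the "oci" scheme.
    "fs.oci.impl": "com.oracle.bmc.hdfs.BmcFilesystem",
    "fs.oci.client.hostname": "https://objectstorage.us-phoenix-1.oraclecloud.com",
    "fs.oci.client.auth.tenantId": "ocid1.tenancy.oc1..aaaaaxxxx",
    "fs.oci.client.auth.userId": "ocid1.user.oc1..aaaaaaxxxx",
    "fs.oci.client.auth.fingerprint": "a8:5b:41:70:63:31:b2:8f:5xxx",
    "fs.oci.client.auth.pemfilepath": "/dbfs/FileStore/oci/oci_private.pem",
}


def to_spark_conf(hadoop_props):
    """Prefix Hadoop keys with 'spark.hadoop.' so Spark passes them to Hadoop.

    Keys that already start with 'spark.' are left unchanged.
    """
    return {
        (key if key.startswith("spark.") else "spark.hadoop." + key): value
        for key, value in hadoop_props.items()
    }


def render_cluster_config(spark_conf):
    """Render 'key value' lines, the layout used in the post's Spark config."""
    return "\n".join(f"{key} {value}" for key, value in spark_conf.items())


if __name__ == "__main__":
    # Paste the printed lines into the cluster's Spark config field.
    print(render_cluster_config(to_spark_conf(HADOOP_PROPS)))
```

If these assumptions hold, a notebook read such as `spark.read.text("oci://<bucket>@<namespace>/<path>")` should then resolve the oci scheme, provided the connector jar is on the cluster classpath as asked about above.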