[BUG]: "Exists" does not work on Synapse
Describe the bug
We are trying to use the new Exists feature in .NET for Apache Spark 2.0 on Azure Synapse. Unfortunately, the corresponding function returns "False" for every path, regardless of whether it points to a valid file or not.
To Reproduce
In our tests, the file is stored directly in the primary data lake of the Synapse workspace.
The file can be read without any problems using:
spark.Read().Schema(schema).Parquet(sourcepath)
We use the following code to check if the file exists or not:
FileSystem fs = FileSystem.Get(spark.SparkContext.HadoopConfiguration());
bool fileExists = fs.Exists(sourcepath);
We are using Apache Spark Version 3.1 with .NET for Apache Spark 2.0.0.
Expected behavior
If the path points to an existing file, the function should return "True".
Where is the file stored? Are you using an abfss path?
I tested it on my own instance and that seemed to work fine:
Is there something that we are doing differently? I did try an http path but that failed to read for me as well.
We tried this again, also on different environments. Our code looks exactly like yours, but unfortunately without success.
Are there any basic settings we need to configure on the cluster?
No, nothing. I just created a brand-new cluster pointing to an ADLS 2.0 storage account. This is the cluster config:
Sorry for the late reply. These are exactly the same settings we use. We tried 2 different clusters and see the same problem. Any ideas what else we can check to find out why this is not working? Are any special permissions needed on the lake?
It does not work in Scala either.
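The thread does not show the Scala code that was tried. For comparison, here is a minimal sketch of what the equivalent check might look like against the Hadoop API directly. It assumes a Synapse notebook or job where a Spark session is available. The `abfss://` URI is a hypothetical placeholder that must be replaced with a real container, account, and file path. The second variant resolves the file system from the path itself, which may help show whether the default file system is the one being queried.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

// In a Synapse notebook a session already exists; getOrCreate() returns it.
val spark = SparkSession.builder().getOrCreate()
val conf = spark.sparkContext.hadoopConfiguration

// Hypothetical placeholder: substitute the real container, account, and file path
// before running. The angle-bracket form is not a usable URI.
val sourcePath = new Path("abfss://<container>@<account>.dfs.core.windows.net/<folder>/<file>.parquet")

// Variant 1: ask the default file system (fs.defaultFS), mirroring the C# snippet above.
// Hadoop may reject a path whose scheme or authority differs from the default file system.
val defaultFs = FileSystem.get(conf)
println(s"default FS: ${defaultFs.getUri}, exists: ${defaultFs.exists(sourcePath)}")

// Variant 2: let the path select the file system that matches its own scheme and authority.
val pathFs = sourcePath.getFileSystem(conf)
println(s"path FS: ${pathFs.getUri}, exists: ${pathFs.exists(sourcePath)}")
```

Comparing the two printed file system URIs with the URI used in the successful `Parquet` read could help narrow down whether the check and the read target the same storage account. This is only a diagnostic idea, not a confirmed cause.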