
[BUG]: "Exists" does not work on Synapse

Open TEitelberg opened this issue 3 years ago • 5 comments

Describe the bug

We are trying to use the new Exists feature in .NET for Apache Spark 2.0 on Azure Synapse. Unfortunately, the corresponding function returns "False" for every path, regardless of whether it points to a valid file or not.

To Reproduce

In our tests, the file is stored directly in the Synapse primary data lake.

The file can be read without any problems using:

spark.Read().Schema(schema).Parquet(sourcepath)

We use the following code to check if the file exists or not:

FileSystem fs = FileSystem.Get(spark.SparkContext.HadoopConfiguration());
bool fileExists = fs.Exists(sourcepath);
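For reference, here is a minimal sketch of a repro that puts the read and the existence check side by side. The app name and the value of `sourcepath` are placeholders and not taken from our environment, and the schema is left to be inferred for brevity; the API calls are the same ones quoted above.

```csharp
using System;
using Microsoft.Spark.Hadoop.Fs;
using Microsoft.Spark.Sql;

class ExistsRepro
{
    static void Main()
    {
        // Hypothetical app name, used only for this sketch.
        SparkSession spark = SparkSession.Builder().AppName("exists-repro").GetOrCreate();

        // Placeholder: replace with the path of a file known to exist in the lake.
        string sourcepath = "<path-to-existing-parquet-file>";

        // Reading the file works in the reported setup.
        DataFrame df = spark.Read().Parquet(sourcepath);
        Console.WriteLine($"Rows read: {df.Count()}");

        // Same check as in the report; in the reported setup it returns False for every path.
        FileSystem fs = FileSystem.Get(spark.SparkContext.HadoopConfiguration());
        bool fileExists = fs.Exists(sourcepath);
        Console.WriteLine($"Exists: {fileExists}");

        spark.Stop();
    }
}
```

Running both steps against the same path in one job makes it easy to confirm that the read succeeds while the check still reports that the file is missing.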

We are using Apache Spark Version 3.1 with .NET for Apache Spark 2.0.0.

Expected behavior

If the path points to an existing file, the function should return "True".

TEitelberg, Sep 15 '21 07:09

Where is the file stored? Are you using an abfss path?

I tested it on my own instance and that seemed to work fine:

image

Is there something that we are doing differently? I did try an HTTP path, but that failed to read for me as well.

GoEddie, Oct 15 '21 07:10

We tried this again, also on different environments. Our code looks exactly like yours, but unfortunately without success.

Are there any basic settings we need to configure on the cluster?

TEitelberg, Oct 20 '21 07:10

No, nothing. I just created a brand new cluster pointing to an ADLS 2.0 storage account. This is the cluster config:

image

GoEddie, Oct 20 '21 07:10

Sorry for the late reply. These are exactly the same settings we use. We tried 2 different clusters and see the same problem on both. Any ideas what else we can check to find out why this is not working? Are any special permissions on the lake needed?

TEitelberg, Oct 28 '21 18:10

It does not work in Scala either.

artmasa, Apr 29 '22 22:04