Load profile.json exception

Open: cic1988 opened this issue 1 year ago • 1 comment

Hello experts,

I followed the protocol example to build the reference server. The server generates the presigned URL when the table/query endpoint is called.

Assume that my table_url is profile.json#share.schema.table.

Calling df = delta_sharing.load_as_pandas(table_url, limit=3) loads the data fine, but load_as_spark fails.

I used the following code:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Delta Share Demo") \
    .config('spark.jars', 'packages/haddop-azure-3.3.6.jar,packages/delta-sharing-spark_2.12-0.6.4.jar') \
    .getOrCreate()

...

import delta_sharing
df = delta_sharing.load_as_spark(table_url)
df.limit(2).select("path").show()

The error shows:

java.lang.RuntimeException: delta-sharing:/profile.json%23share.schema.table/123/25169076 is not a Parquet file. Expected magic number at tail, but found [0, 20, 14, 55]
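
For context, a plain Parquet file (one without an encrypted footer) ends with the 4-byte magic number PAR1, i.e. the bytes [80, 65, 82, 49], so the reader is complaining that the last four bytes it received were something else. A minimal sketch of that tail check in Python follows; the helper names tail_magic and looks_like_parquet are hypothetical and not part of delta-sharing or Spark.

```python
import io

# 4-byte marker that closes a plain (unencrypted-footer) Parquet file.
PARQUET_MAGIC = b"PAR1"


def tail_magic(stream, size=4):
    """Return the last `size` bytes of a seekable binary stream."""
    stream.seek(0, io.SEEK_END)
    length = stream.tell()
    stream.seek(max(0, length - size))
    return stream.read(size)


def looks_like_parquet(stream):
    """True if the stream ends with the Parquet magic number."""
    return tail_magic(stream) == PARQUET_MAGIC


# Illustrative payload that ends with the bytes reported in the error.
reported_tail = io.BytesIO(b"some payload" + bytes([0, 20, 14, 55]))
print(looks_like_parquet(reported_tail))
```

Running the same check against the bytes actually returned by one of the presigned URLs (fetching them is not shown here) might help tell whether the file itself is not Parquet or whether the Spark connector is reading a truncated or wrong byte range.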

Have you seen the error before?

cic1988, Feb 05 '24 06:02

@cic1988 Sorry, I haven't seen it before. Is this still happening? Do you have a full stack trace?

linzhou-db, Feb 29 '24 22:02