
pyreadstat.read_sav won't work on databricks

Open marcmaxson opened this issue 2 months ago • 2 comments


When I run df, meta = pyreadstat.read_sav("/Volumes/sandbox/schema/folder/myfile.sav"), I get "Unknown Error".

The error occurs with versions 1.3.1, 1.3.0, 1.2.9, and 1.2.8; version 1.2.7 works fine.

To Reproduce: Databricks (serverless), environment version 4.

File example: Tried with many SAV files; any file reproduces the error.

Expected behavior: The file loads.

Setup Information:

How did you install pyreadstat?

  • Tested with !pip install pyreadstat==1.3.1 in a notebook, and with the Databricks environment GUI (version 4).

Platform: linux

Python Version: Python 3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0] on linux


marcmaxson Nov 03 '25 18:11

Very strange. If I understand correctly, those files open fine on a local machine but fail on Databricks?

If so, unfortunately I cannot reproduce this, as I don't have access to Databricks.

The error suggests the problem comes from Readstat. Version 1.2.8 did include an update of the Readstat sources, so the problem may lie there, but I cannot tell without being able to reproduce it. You may also want to report this to Readstat, in case somebody there can reproduce it.

BTW, have you tried copying one of those files from /Volumes to your home directory? Maybe it has something to do with the way the Volume is mounted.
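A minimal sketch of that check, in case it helps: copy the file to local temporary storage first and read the copy, so the mounted path is taken out of the picture. The helper names copy_to_local and read_sav_from_local_copy are made up for illustration, and this assumes the notebook can write to the default temp directory.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_local(src: str) -> Path:
    """Copy src into a fresh temporary directory and return the new path."""
    local_dir = Path(tempfile.mkdtemp(prefix="sav_local_"))
    dst = local_dir / Path(src).name
    shutil.copyfile(src, dst)
    return dst


def read_sav_from_local_copy(src: str):
    """Read a SAV file through a local copy instead of the mounted path."""
    # Imported lazily so copy_to_local can be used without pyreadstat installed.
    import pyreadstat

    local_path = copy_to_local(src)
    return pyreadstat.read_sav(str(local_path))
```

If read_sav_from_local_copy("/Volumes/sandbox/schema/folder/myfile.sav") fails with the same error, the mount is probably not the cause.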

ofajardo Nov 04 '25 11:11

I don't think it is related to /Volumes or file locations, as I tried it both in my workspace and in Unity Catalog. We're freezing at version 1.2.7 (and have been for a while now) until this is resolved.
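For anyone pinning the same way, here is a rough sketch of a guard that flags an environment running one of the versions reported as failing in this thread. The set of versions comes only from the report above, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported in this thread as failing on Databricks.
# 1.2.7 was reported to work. This list is not exhaustive or authoritative.
REPORTED_FAILING = {"1.2.8", "1.2.9", "1.3.0", "1.3.1"}


def is_reported_failing(ver: str) -> bool:
    """Return True if ver is one of the versions reported as failing."""
    return ver in REPORTED_FAILING


def installed_pyreadstat_version():
    """Return the installed pyreadstat version string, or None if not installed."""
    try:
        return version("pyreadstat")
    except PackageNotFoundError:
        return None


installed = installed_pyreadstat_version()
if installed is not None and is_reported_failing(installed):
    print(f"pyreadstat {installed} is one of the versions reported as failing here")
```

The pin itself is just pip install pyreadstat==1.2.7, in the same style as the install command in the report.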

marcmaxson Nov 17 '25 16:11