soda-spark
Fails to install on Azure Databricks Cluster
Library installation attempted on the driver node of cluster 0531-095737-pc8ifbl4 and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message:

org.apache.spark.SparkException: Process List(/databricks/python/bin/pip, install, soda-spark, --disable-pip-version-check) exited with code 1.

    ERROR: Command errored out with exit status 1:
     command: /databricks/python3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-hk_a28h0/sasl_22bdc11526b24a309f12b898eb2ce262/setup.py'"'"'; __file__='"'"'/tmp/pip-install-hk_a28h0/sasl_22bdc11526b24a309f12b898eb2ce262/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-0uq392_j
     cwd: /tmp/pip-install-hk_a28h0/sasl_22bdc11526b24a309f12b898eb2ce262/
     Complete output (29 lines):
     running bdist_wheel
     running build
     running build_py
     creating build
     creating build/lib.linux-x86_64-3.8
     creating build/lib.linux-x86_64-3.8/sasl
     copying sasl/__init__.py -> build/lib.linux-x86_64-3.8/sasl
     running egg_info
     writing sasl.egg-info/PKG-INFO
     writing dependency_links to sasl.egg-info/dependency_links.txt
     writing requirements to sasl.egg-info/requires.txt
     writing top-level names to sasl.egg-info/top_level.txt
     reading manifest file 'sasl.egg-info/SOURCES.txt'
     reading manifest template 'MANIFEST.in'
     writing manifest file 'sasl.egg-info/SOURCES.txt'
     copying sasl/saslwrapper.cpp -> build/lib.linux-x86_64-3.8/sasl
     copying sasl/saslwrapper.h -> build/lib.linux-x86_64-3.8/sasl
     copying sasl/saslwrapper.pyx -> build/lib.linux-x86_64-3.8/sasl
     running build_ext
     building 'sasl.saslwrapper' extension
     creating build/temp.linux-x86_64-3.8
     creating build/temp.linux-x86_64-3.8/sasl
     x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -Isasl -I/databricks/python3/include -I/usr/include/python3.8 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-3.8/sasl/saslwrapper.o
     In file included from sasl/saslwrapper.cpp:629:
     sasl/saslwrapper.h:22:10: fatal error: sasl/sasl.h: No such file or directory
        22 | #include <sasl/sasl.h>
           |          ^~~~~~~~~~~~~
     compilation terminated.
     error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    ERROR: Failed building wheel for sasl
    ERROR: Command errored out with exit status 1:
     command: /databricks/python3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-hk_a28h0/sasl_22bdc11526b24a309f12b898eb2ce262/setup.py'"'"'; __file__='"'"'/tmp/pip-install-hk_a28h0/sasl_22bdc11526b24a309f12b898eb2ce262/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-_6sr1coa/install-record.txt --single-version-externally-managed --compile --install-headers /databricks/python3/include/site/python3.8/sasl
     cwd: /tmp/pip-install-hk_a28h0/sasl_22bdc11526b24a309f12b898eb2ce262/
     Complete output (29 lines):
     running install
     running build
     running build_py
     creating build
     creating build/lib.linux-x86_64-3.8
     creating build/lib.linux-x86_64-3.8/sasl
     copying sasl/__init__.py -> build/lib.linux-x86_64-3.8/sasl
     running egg_info
     writing sasl.egg-info/PKG-INFO
     writing dependency_links to sasl.egg-info/dependency_links.txt
     writing requirements to sasl.egg-info/requires.txt
     writing top-level names to sasl.egg-info/top_level.txt
     reading manifest file 'sasl.egg-info/SOURCES.txt'
     reading manifest template 'MANIFEST.in'
     writing manifest file 'sasl.egg-info/SOURCES.txt'
     copying sasl/saslwrapper.cpp -> build/lib.linux-x86_64-3.8/sasl
     copying sasl/saslwrapper.h -> build/lib.linux-x86_64-3.8/sasl
     copying sasl/saslwrapper.pyx -> build/lib.linux-x86_64-3.8/sasl
     running build_ext
     building 'sasl.saslwrapper' extension
     creating build/temp.linux-x86_64-3.8
     creating build/temp.linux-x86_64-3.8/sasl
     x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -Isasl -I/databricks/python3/include -I/usr/include/python3.8 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-3.8/sasl/saslwrapper.o
     In file included from sasl/saslwrapper.cpp:629:
     sasl/saslwrapper.h:22:10: fatal error: sasl/sasl.h: No such file or directory
        22 | #include <sasl/sasl.h>
           |          ^~~~~~~~~~~~~
     compilation terminated.
     error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
     ----------------------------------------
    ERROR: Command errored out with exit status 1: /databricks/python3/bin/python -u -c '...' install --record /tmp/pip-record-_6sr1coa/install-record.txt --single-version-externally-managed --compile --install-headers /databricks/python3/include/site/python3.8/sasl Check the logs for full command output.
Hi @sachinwadhwa, that is annoying; this should not happen.
The build expects the SASL headers to be present on the machine, but they are not. I think `sasl` is a dependency of `soda-sql-spark` (which in turn is a dependency of `soda-spark`). A proper solution would be to make that dependency optional in `soda-sql-spark`: depending on the connection method used, it may or may not be required. In `soda-spark` we do not need `sasl`, so we could then exclude that dependency.
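For illustration only, making the dependency optional would mean moving `sasl` out of the hard requirements and into a packaging extra. A minimal hypothetical sketch — the extra name "hive" and the version pins are invented, not the actual `soda-sql-spark` metadata:

```python
# Hypothetical sketch: declaring sasl as an optional extra instead of a hard
# dependency, so only users who need a SASL-based connection compile it.
install_requires = [
    "pyspark",  # always needed
]
extras_require = {
    # Installed only on demand, e.g. `pip install "soda-sql-spark[hive]"`.
    "hive": ["sasl>=0.2.1", "thrift-sasl>=0.4"],
}
```

With that layout, a plain `pip install soda-sql-spark` would no longer try to build the `sasl` C extension at all.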
However, that is a long route to a solution. @vijaykiran: I expect other users have run into the same problem — do you know if this has happened before? A short-term workaround is to install `libsasl2-dev` on the cluster: `sudo apt-get install libsasl2-dev`.
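On Databricks this workaround is usually applied via a cluster-scoped init script, so the header package exists before library installation runs. A minimal sketch, assuming a notebook context where `dbutils` is available; the DBFS path is an example, not a required location:

```python
# Build a cluster init script that installs the Cyrus SASL development
# headers (libsasl2-dev) needed to compile the sasl wheel.
init_script = "\n".join([
    "#!/bin/bash",
    "set -e",
    "apt-get update",
    "apt-get install -y libsasl2-dev",
])

# In a Databricks notebook you would then persist the script and reference it
# in the cluster's init-script settings (not executed here):
# dbutils.fs.put("dbfs:/init-scripts/install-libsasl2.sh", init_script, True)
print(init_script)
```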
@vijaykiran: How is this issue progressing? We are running into the same problem.
Anything new on this? I would like to use Soda in Databricks, but this issue and the workaround make it not really usable.
@bombercorny It seems that this package is, or soon will be, deprecated in favor of soda-core. I suggest using the `soda-core-spark-df` or `soda-core-spark` packages with Databricks, depending on your use case.