Geostats Functions in Spark Connect

Open james-willis opened this issue 5 months ago • 3 comments

I don't think the stats functions are compatible with Spark Connect today. I tried this with Spark 3.5:

(python) ➜  python git:(graphframes-0.9.0) ✗ export SPARK_REMOTE=local
(python) ➜  python git:(graphframes-0.9.0) ✗ pytest -v tests/stats

and every test that wasn't skipped (for checkpointing) gave this kind of _jvm error:

self = <pyspark.sql.connect.session.SparkSession object at 0x16fd17df0>, name = '_jvm'

    def __getattr__(self, name: str) -> Any:
        if name in ["_jsc", "_jconf", "_jvm", "_jsparkSession"]:
>           raise PySparkAttributeError(
                error_class="JVM_ATTRIBUTE_NOT_SUPPORTED", message_parameters={"attr_name": name}
E               pyspark.errors.exceptions.base.PySparkAttributeError: [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jvm` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session.

../../../../.local/share/virtualenvs/python-GYLC1Bm8/lib/python3.10/site-packages/pyspark/sql/connect/session.py:692: PySparkAttributeError
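
For reference, a minimal sketch of how a caller could detect a Spark Connect session up front and fail with a clearer message than the raw `_jvm` error. This is not what the stats functions currently do. The helper names `is_spark_connect_session` and `require_jvm` are hypothetical, and the check only assumes that Connect sessions live under the `pyspark.sql.connect` module, as the traceback above shows:

```python
def is_spark_connect_session(session) -> bool:
    """Return True if `session` looks like a Spark Connect session.

    Assumption: Connect sessions are defined under `pyspark.sql.connect`
    (the traceback shows pyspark.sql.connect.session.SparkSession).
    """
    return type(session).__module__.startswith("pyspark.sql.connect")


def require_jvm(session, feature: str = "this function"):
    """Hypothetical guard: return the JVM gateway or raise a readable error."""
    if is_spark_connect_session(session):
        raise NotImplementedError(
            f"{feature} needs the JVM gateway (`_jvm`), "
            "which is not available on Spark Connect"
        )
    # Only reached for a classic (non-Connect) session.
    return session._jvm
```

A JVM-dependent function could call `require_jvm(spark, "the stats functions")` before touching `_jvm`, so Connect users get an explicit "not supported" message.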

james-willis, Jul 15 '25 21:07