koalas
apply does not work properly with databricks-connect
Hi, when I run my code from PyCharm with databricks-connect against a Spark 2.4.5 cluster, I get
AttributeError: 'DataFrame' object has no attribute 'mapInPandas'
I believe this is because, in the apply function, the check
should_use_map_in_pandas = LooseVersion(pyspark.__version__) >= "3.0"
does not work as expected with databricks-connect: pyspark.__version__ returns the version of the databricks-connect package, not the Spark version of the cluster.
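To illustrate the idea, here is a minimal sketch of checking for the method itself instead of comparing version strings. The helper supports_map_in_pandas and the two stand-in classes are hypothetical names for illustration only; they are not part of Koalas or PySpark, and this is not necessarily how the library would choose to fix it.

```python
# Hypothetical helper (not part of Koalas): detect the capability directly
# instead of trusting pyspark.__version__, which may describe the client
# package rather than the cluster.
def supports_map_in_pandas(df):
    """Return True if the DataFrame object exposes a callable mapInPandas."""
    return callable(getattr(df, "mapInPandas", None))


# Stand-ins for pyspark.sql.DataFrame so the sketch runs without a cluster.
class Spark24DataFrame:
    """Mimics a Spark 2.4 DataFrame, which has no mapInPandas."""


class Spark30DataFrame:
    """Mimics a Spark 3.0 DataFrame, which does have mapInPandas."""

    def mapInPandas(self, func, schema):
        return self


print(supports_map_in_pandas(Spark24DataFrame()))
print(supports_map_in_pandas(Spark30DataFrame()))
```

With a real session, the same check could be applied to the Spark DataFrame object before choosing the code path, assuming the client-side class reflects what the cluster actually supports.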
I think we do not plan to support it in DBConnect in the near future.
cc @juliuszsompolski @youngbink
I'm not sure I understand what you mean. Are you saying DataFrame.apply is not expected to work in DBConnect, and that I should move my code to a notebook every time it uses apply?
@kismsu we don't yet officially support Koalas with DB Connect (for any versions). It seems like this particular issue might be avoided with 7.1, but it could have other issues until we officially support it.
Ok, I see. This is good to know. We're actually trying to move to 7.1 but have to sort out some infrastructure issues first. Thanks.