koalas
Issues converting series to 'object' dtype
Is it possible to convert a Koalas Series to the object dtype? I tried the following, but got the error shown below. Is there a way to do this?
>>> import databricks.koalas as ks
>>> ks_ser = ks.Series([1, 2, 3])
>>> ks_ser_obj = ks_ser.astype('object')
>>> assert ks_ser_obj.dtype == 'object'
>>> ks_ser_obj.to_pandas()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/env/lib/python3.8/site-packages/databricks/koalas/series.py", line 1490, in to_pandas
    return self._to_internal_pandas().copy()
  File "/env/lib/python3.8/site-packages/databricks/koalas/series.py", line 6059, in _to_internal_pandas
    return self._kdf._internal.to_pandas_frame[self.name]
  File "/env/lib/python3.8/site-packages/databricks/koalas/utils.py", line 545, in wrapped_lazy_property
    setattr(self, attr_name, fn(self))
  File "/env/lib/python3.8/site-packages/databricks/koalas/internal.py", line 933, in to_pandas_frame
    sdf = self.to_internal_spark_frame
  File "/env/lib/python3.8/site-packages/databricks/koalas/utils.py", line 545, in wrapped_lazy_property
    setattr(self, attr_name, fn(self))
  File "/env/lib/python3.8/site-packages/databricks/koalas/internal.py", line 921, in to_internal_spark_frame
    zip(self.column_labels, self.data_spark_columns, self.data_spark_column_names)
  File "/env/lib/python3.8/site-packages/databricks/koalas/utils.py", line 545, in wrapped_lazy_property
    setattr(self, attr_name, fn(self))
  File "/env/lib/python3.8/site-packages/databricks/koalas/internal.py", line 845, in data_spark_column_names
    return self.spark_frame.select(self.data_spark_columns).columns
  File "/env/lib/python3.8/site-packages/pyspark/sql/dataframe.py", line 1669, in select
    jdf = self._jdf.select(self._jcols(*cols))
  File "/env/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/env/lib/python3.8/site-packages/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.AnalysisException: cannot resolve '`0`' due to data type mismatch: cannot cast bigint to array<string>;
'Project [cast(0#18L as array<string>) AS __none__#25]
+- Project [__index_level_0__#17L, 0#18L, monotonically_increasing_id() AS __natural_order__#21L]
   +- LogicalRDD [__index_level_0__#17L, 0#18L], false
Yeah, the object dtype is not supported in Koalas.
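If the data is small enough to collect to the driver, one possible workaround is to do the cast on the pandas side instead, after calling to_pandas() on the original (uncast) Series. The snippet below is only a rough sketch under that assumption: to_pandas_object is a hypothetical helper name, not part of Koalas, and a plain pandas Series stands in for the Koalas one so the example runs without a Spark session.

```python
import pandas as pd


def to_pandas_object(ser):
    """Collect a Series locally and cast it to object dtype in pandas.

    Hypothetical helper for illustration only; it is not a Koalas API.
    """
    # A Koalas Series exposes to_pandas(); a plain pandas Series does not,
    # so it is used as-is.
    pser = ser.to_pandas() if hasattr(ser, "to_pandas") else ser
    # The cast happens in pandas, so Spark never sees the object dtype.
    return pser.astype("object")


# Stand-in for the Koalas Series from the question (ks.Series([1, 2, 3])).
pser_obj = to_pandas_object(pd.Series([1, 2, 3]))
print(pser_obj.dtype)  # object
```

The result is a pandas Series, not a Koalas one, so this only helps when the data fits in memory on the driver.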