Takuya UESHIN
In that case, I'd also suspect `ARROW_PRE_0_15_IPC_FORMAT` is not set properly. Could you try:

```py
import os
os.environ.get('ARROW_PRE_0_15_IPC_FORMAT', 'None')
```

and

```py
from pyspark.sql.functions import udf

@udf('string')
def check(x):
    return...
```
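For reference, a minimal sketch of how that variable could be set; the value `'1'` and the executor-side configuration are assumptions about the setup (the usual Spark 2.4.x / PyArrow >= 0.15.0 compatibility setting), not something confirmed in this thread:

```py
import os

# Assumed value: Spark 2.3.x/2.4.x with PyArrow >= 0.15.0 typically needs this set to '1'
# before the Spark session and UDFs are created.
os.environ['ARROW_PRE_0_15_IPC_FORMAT'] = '1'

# On a real cluster the Python workers on the executors need it as well, e.g. by setting
# spark.executorEnv.ARROW_PRE_0_15_IPC_FORMAT=1 in the Spark configuration when
# launching the application (this part depends on the deployment).
```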
Also, what happens with a PyArrow older than `0.15.0`, like `0.14.1`?
Hi @amueller, Seems like `Series.sample` supports the `frac` parameter now.
- https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.Series.sample.html

For #1893, it's currently blocked by a performance concern (https://github.com/databricks/koalas/pull/1893#discussion_r521869698). Could you kindly advise us if you have...
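For example, a quick sketch of the `frac` usage; the data and fraction below are only illustrative:

```py
import databricks.koalas as ks

kser = ks.Series([1, 2, 3, 4, 5])

# Sample roughly 40% of the rows; random_state makes the result reproducible.
kser.sample(frac=0.4, random_state=1)
```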
Thanks for escalating the issue here! It seems related to Spark UDT and Arrow. Actually, we have a workaround to convert from/to pandas DataFrame/Series including Spark UDT objects, but...
Hi @vkrot-exos, thanks for the suggestion! It sounds like a good idea. Would you mind submitting a PR to modify the error message? Thanks!
Does the function `toposort` have a return type annotation? If not, Koalas collects some amount of data into the driver to infer the return type. See also:
- https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.groupby.GroupBy.apply.html
- ...
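As a rough sketch of what the annotation could look like — assuming a recent Koalas version that supports named type hints; the `toposort` body, column names, and types below are placeholders, not the actual function from this thread:

```py
import databricks.koalas as ks

kdf = ks.DataFrame({'group': ['a', 'a', 'b'], 'value': [3, 1, 2]})

# With an explicit return type hint, Koalas can build the schema from the annotation
# instead of collecting sample data into the driver to infer it.
def toposort(pdf) -> ks.DataFrame['group': str, 'value': int]:
    # pdf is a pandas DataFrame holding one group; sorting is just a stand-in body.
    return pdf.sort_values('value')

kdf.groupby('group').apply(toposort)
```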
FYI: for the read path, it was resolved in #1695.
@shakirshakeelzargar Unfortunately, Spark doesn't support such operations, and neither does Koalas, at least so far. If the file is small enough, you can use pandas and convert it to Koalas....
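A minimal sketch of that approach; the file path and the `read_excel` reader here are hypothetical stand-ins for whatever pandas reader handles the file:

```py
import pandas as pd
import databricks.koalas as ks

# Read the small file with pandas first (hypothetical path and reader).
pdf = pd.read_excel('/path/to/small_file.xlsx')

# Convert the pandas DataFrame into a Koalas DataFrame for further processing.
kdf = ks.from_pandas(pdf)
```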
Seems like pandas doesn't support it either:

```py
>>> import pandas as pd
>>> pdf = pd.DataFrame({'A': [1, 1, 2, 2], 'B': ['x', 'x', 'x', 'y']}, columns=['A', 'B'])
>>>...
```
Ah, I see. We might want to support it. cc @HyukjinKwon