Han Wang

63 comments by Han Wang

Thank you, and also Dask needs to be tested.

Oh, by the way, when you test on Dask, please make sure you have test cases with multiple group keys. I remember that caused some trouble in my projects.

@saulpw @cpcloud `NULLS FIRST` and `NULLS LAST` are part of the SQL syntax that all major backends support, so I think this should be implemented. https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#order_by_clause https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-order-by/ https://duckdb.org/docs/sql/query_syntax/orderby Also notice different backends...
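For illustration only, here is a minimal sketch of what the two clauses do. It uses Python's built-in `sqlite3` module purely because it runs without any setup; SQLite is not one of the backends linked above, and the table `t` and column `a` are hypothetical. It assumes the bundled SQLite is version 3.30.0 or newer, which is when `NULLS FIRST` / `NULLS LAST` were added there.

```python
import sqlite3

# Hypothetical in-memory table with one nullable column.
# Assumption: the bundled SQLite is >= 3.30.0 (NULLS FIRST/LAST support).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(2,), (None,), (1,)])

# Same ascending sort, with the position of NULLs stated explicitly.
nulls_first = [row[0] for row in conn.execute(
    "SELECT a FROM t ORDER BY a ASC NULLS FIRST")]
nulls_last = [row[0] for row in conn.execute(
    "SELECT a FROM t ORDER BY a ASC NULLS LAST")]

print(nulls_first)  # [None, 1, 2]
print(nulls_last)   # [1, 2, None]
conn.close()
```

Spelling out the null position in the query removes any dependence on a backend's default null ordering.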

So look at this code:

```python
import pandas as pd

df1 = pd.DataFrame([[0.0, None], [pd.NA, "abc"], [float("nan"), "def"]], columns=["a", "b"])
df2 = df1.copy()
print(df1.merge(df2, on="a"))
print(df1.merge(df2, on="b"))
```

And the...

Ah, good catch. So in pyarrow, the [dictionary](https://arrow.apache.org/docs/python/generated/pyarrow.dictionary.html#pyarrow.dictionary) is the categorical type. But the implementation can be very hard. Converting categorical to string may be a more practical way? I...

We need to add this from [triad](https://github.com/fugue-project/triad/blob/master/triad/utils/pyarrow.py), and then on Fugue.

And then we also need to add tons of unit tests and make it work for all backends.

We will try to solve it in https://github.com/fugue-project/fugue/issues/296