Fix: various improvements related to string columns.
This PR fixes several (hopefully) small issues that pop up when using string columns with vaex, now that it is primarily Arrow-based.
List of new / updated unit tests:
- using vaex.from_arrow_table(pa_table, as_numpy=True) makes string columns be dtype 'O' in vaex
- selections made on arrow/string columns
- df.x.fillna() converts string type to 'O' type. This is not captured by df.is_string(df.x) but is captured by df.x.dtype
- df.func.where(df.x == 'string_value', 'new_value', 'df.x') casts the resulting expression to dtype 'O'. This is captured by df.is_string(...)
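For anyone trying to reproduce the cases listed above, here is a rough sketch that puts the two checks (df.is_string(...) and .dtype) side by side, since they can disagree. The helper name string_column_report and the demo function are hypothetical and not part of vaex; the fill value 'missing' and passing df.x (rather than the quoted 'df.x') as the fallback in where are assumptions made for illustration. What it reports will depend on the vaex version installed.

```python
def string_column_report(df, expression):
    """Collect both string checks for one column or expression.

    Hypothetical helper: `df` is expected to behave like a vaex DataFrame,
    i.e. support df[name], df.is_string(expr) and expr.dtype.
    """
    # Accept either a column name or an expression object.
    expr = df[expression] if isinstance(expression, str) else expression
    return {
        "is_string": bool(df.is_string(expr)),
        "dtype": str(expr.dtype),
    }


def demo():
    """Run the report on the fillna and where cases (needs vaex and pyarrow)."""
    # Imported lazily so the helper above has no hard dependency.
    import pyarrow as pa
    import vaex

    df = vaex.from_arrow_table(pa.table({"x": ["a", None, "c"]}))
    filled = df.x.fillna("missing")  # assumed fill value
    replaced = df.func.where(df.x == "a", "new_value", df.x)  # assumed fallback
    return {
        "original": string_column_report(df, "x"),
        "fillna": string_column_report(df, filled),
        "where": string_column_report(df, replaced),
    }
```

Calling demo() before and after this change should make any mismatch between the two checks visible for each case.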
Also see https://github.com/vaexio/vaex/issues/718