
Fix: various improvements related to string columns.

Open · JovanVeljanoski opened this issue 3 years ago • 1 comment

This PR fixes several (hopefully) small issues that come up when using string columns in vaex, now that it is primarily Arrow-based.

List of new / updated unit tests:

  • using vaex.from_arrow_table(pa_table, as_numpy=True) gives string columns dtype 'O' in vaex
  • selections made on arrow/string columns
  • df.x.fillna() converts string type to 'O' type. This is not captured by df.is_string(df.x) but is captured by df.x.dtype
  • df.func.where(df.x == 'string_value', 'new_value', 'df.x') casts the resulting expression to dtype 'O'. This is captured by df.is_string(...)

JovanVeljanoski, Oct 13 '20 17:10

Also see https://github.com/vaexio/vaex/issues/718

JovanVeljanoski, Oct 14 '20 07:10