duckplyr
Support factors
I know this is a known issue, but I feel it could do with an issue of its own for tracking purposes.
Replication:
```r
> duckplyr:::duckdb_rel_from_df(iris)
Error in duckplyr:::duckdb_rel_from_df(iris) :
  Can't convert factor columns to relational. Affected column: `Species`.
```
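Until factors are supported, one possible stopgap is to convert factor columns to character before handing the data frame over. This is a minimal base-R sketch under that assumption; `factors_to_character()` is a hypothetical helper, not part of duckplyr, and nothing in this thread confirms how duckplyr treats the converted data.

```r
# Hypothetical helper (not part of duckplyr): convert every factor column
# to character and leave all other columns untouched.
factors_to_character <- function(df) {
  is_fct <- vapply(df, is.factor, logical(1))
  if (any(is_fct)) {
    df[is_fct] <- lapply(df[is_fct], as.character)
  }
  df
}

iris_chr <- factors_to_character(iris)
```

Note that this drops the level set and its ordering, so it is not a substitute for real factor support.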
Factor support could be implemented via enum types, no? Or, am I missing something obvious?
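To illustrate why enum types look like a natural fit: an R factor is stored as integer codes plus a vector of levels, which has the same shape as a DuckDB ENUM (a fixed set of string values). The sketch below uses only base R; the helper `enum_ddl()` and the type name `species` are hypothetical and are not part of duckplyr or duckdb.

```r
# A factor is integer codes plus a character vector of levels.
species <- iris$Species
codes <- as.integer(species)  # 1-based positions into the levels
lvls <- levels(species)

# Hypothetical helper: build a CREATE TYPE statement for a matching ENUM.
# Single quotes inside a level are doubled, as in SQL string literals.
enum_ddl <- function(name, lvls) {
  quoted <- paste0("'", gsub("'", "''", lvls, fixed = TRUE), "'")
  paste0("CREATE TYPE ", name, " AS ENUM (", paste(quoted, collapse = ", "), ");")
}

ddl <- enum_ddl("species", lvls)

# Round trip: codes and levels are enough to rebuild the factor.
rebuilt <- factor(lvls[codes], levels = lvls)
```

This only shows the data-model correspondence; it says nothing about how the conversion would be wired into duckplyr's relational backend.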
After some digging, it looks like a missing feature in duckdb made this complicated, but that has now been resolved! Hopefully factors can be enabled soon.
While we can add support for factors, enabling it will almost certainly expose test failures that were previously hidden, because we fell back as soon as factors were present.
Action items:
- Enable factor support
- Fix internal tests
- Run revdepchecks and fix any failures
@krlmlr - Any chance this could make it into the 1.0.0 release? If it's a case of undoing https://github.com/tidyverse/duckplyr/commit/8da062f7915bf5134785b6684f0c396c56dd09d6, testing, and then running revdep checks (as opposed to needing a good understanding of internals), then I'd be happy to take a look and make a PR to that effect.
Thanks. It's unlikely to make it into 1.0.0: to be useful, we'll also need support for functions that accept factors as an argument. We'd also need support from the duckdb R package.
Happy to revisit after 1.0.0 is out. We also need to check the original trigger for https://github.com/duckdb/duckdb/issues/8561 .
I wanted to bump this now that 1.0.0 is out - any timeline on when factors may be supported for duckdb + duckplyr?