vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Otherwise we leave an inconsistent file that may lead other processes to think the file was properly generated. This can happen when someone presses Ctrl-C.
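To illustrate the failure mode described above, here is a minimal sketch of one common way to avoid leaving a half-written file behind: write to a temporary file and only move it into place once the write has completed. This is not taken from the PR itself. The helper name `write_atomically` and its parameters are hypothetical, and the actual change in Vaex may work differently.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write bytes to `path` so other processes never see a partial file.

    Hypothetical helper for illustration only; not part of the Vaex API.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so that os.replace
    # stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Only a fully written file is moved to the final name.
        os.replace(tmp_path, path)
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl-C).
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, an interrupted write leaves either the previous file or no file at the final path, so a reader cannot mistake a truncated file for a finished one.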
Handles the issues raised in: https://github.com/vaexio/vaex/issues/1725
- [x] Added unit-tests
- [ ] Unit tests pass
TODO: do the same for strings
This should speed up computations, since we don't go in and out of numpy. We also use the binner more consistently throughout. Also, this should allow for GrouperLimited, and Binner...
The bug report [1545](https://github.com/vaexio/vaex/issues/1545) described a broader issue that mainly boiled down to the fact that `vaex.dataframe.DataFrame.dropna()` did not consider hidden original columns when the parameter `column_names` is unspecified. This is...
Should replace #1476 which makes Vaex more compatible with Pandas. cc @AlenkaF
This PR is a proposed correction for the `category_values()` function on the dataframe class. When using `df.category_values(column)`, an error occurs while accessing a list item that doesn't exist (list `_category`, item...
Fixes #1456 I left some comments in the #development Slack channel, as it looks like `min()`, `max()` and `minmax()` were using different approaches. I'm not sure which is more efficient/robust...
Small changes throughout the codebase aimed at reducing various types of warnings that can be safely avoided.