vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
**Description** I am building an app using vaex v4.9.1 and Python 3.9. I use the `limits` function to get the min and max combos for two axes like so: ``` limits...
Credit to Hotaru, who discovered this problem and alerted us via Slack. If a name has a math symbol (like a minus sign) it sometimes causes problems. This PR exposes...
This will align the implementation with those in other libraries, xref https://github.com/data-apis/dataframe-api/issues/80. Cc @maartenbreddels, @honno
**Description** ArrowInvalid: Failed casting from large_string to string. Code: `vaex.from_csv("/data/transactions.csv",convert=True,chunk_size=10000000)` When I tried to call the from_csv function to convert the csv file to hdf5, each small hdf5 file was...
I experience extreme CPU and memory usage when using `df.apply` to apply `sklearn.preprocessing.MultiLabelBinarizer` ("mlb") to an array-of-string column, to the point that the terminal is no longer responsive (ssh no longer...
Thank you for reaching out and helping us improve Vaex! Before you submit a new Issue, please read through the [documentation](https://docs.vaex.io/en/latest/). Also, make sure you search through the Open and...
Hey @JovanVeljanoski, Hope you are doing well. Currently I'm working on a script where I'm joining two vaex dataframes, and while doing that my kernel dies. The two...
**Description** After upgrading from vaex v4.7.0 This call:
```python
import vaex
from numpy import datetime64

vx = vaex.from_dict({
    'dt': [
        datetime64('2016-12-01T10:00:00'),
        datetime64('2016-12-01T11:00:00')
    ],
    'tmp_idx': ['111_223.5', '111_223.5']
})
...
```