vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Hi, I noticed that the tutorials mention performance may be better if multiple smaller files are combined into one larger file. I could open many files and then save...
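One possible way to do the combining step described in this question is sketched below. It assumes the inputs share the same columns and are in a format that `vaex.open_many` can read. The helper name `combine_to_hdf5` and its parameters are hypothetical, not part of vaex.

```python
def combine_to_hdf5(paths, out_path):
    """Hypothetical helper: merge several vaex-readable files into one HDF5 file."""
    # Imported lazily so the helper can be defined even where vaex is not installed.
    import vaex

    # open_many returns a single DataFrame that concatenates all the inputs.
    df = vaex.open_many([str(p) for p in paths])
    # Write the concatenated rows to one HDF5 file.
    df.export_hdf5(str(out_path))
    # Reopen the combined file so the caller works against the single file.
    return vaex.open(str(out_path))
```

Usage would look like `combined = combine_to_hdf5(["part1.hdf5", "part2.hdf5"], "all.hdf5")`, where the file names are placeholders. Whether the single file is actually faster depends on the data and storage, so it is worth measuring on the real workload.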
```python
>>> df = vaex.from_items(("__", np.asarray([0])))
>>> df.get_column_names()
[]
```
Same goes for seemingly any string after the double underscores
```python
>>> df = vaex.from_items(("__foo", np.asarray([0])))
>>> df.get_column_names()
[]
```
...
Dear Maarten, I can't open a FITS file (attached in the message) using:
```python
import vaex
vaex.open('test.fits')
```
I get the following error
```txt
ERROR:MainThread:vaex:error opening 'test.fits'
---------------------------------------------------------------------------
OSError Traceback...
```
Closes https://github.com/vaexio/vaex/issues/1587 - [x] implemented unit test - [ ] test passes
**Description** The following code
```
import pandas as pd
import vaex

p_df = pd.DataFrame({"A": ["abc"] * 100})
df = vaex.from_pandas(p_df)
f_df = df[df["A"] == "abc"]
f_df[99]  # Works fine.
f_df[-1]...
```
Closes https://github.com/vaexio/vaex/issues/2153 - [x] unit test - [ ] tests pass
**Description** Files with a large number of columns seem to freeze during conversion to HDF5. On a test file with 5000 columns and 10 rows, conversion to arrow...
**Description** Since vaex provides all these great struct operations, it would be great if we could create structs in vaex directly via massive dataframes.
**Additional context**
```
import pyarrow as...
```
The interchange protocol had some spec changes recently (https://github.com/data-apis/dataframe-api/pull/74), so this PR mainly updates vaex to conform with them.
* `Column.size()`
* Resolves https://github.com/vaexio/vaex/issues/2093 (was useful so I could test...
Hey there Vaex team, I'm working on a task that uses vaex, with a dataset of 3 million rows and 7 columns. I made an...