[BUG-REPORT] converting massive CSV (50GB) stalls

Open mfouesneau opened this issue 10 months ago • 0 comments

Description

I have multiple massive CSV files (~50GB) that I would like to convert to a more efficient format.

Following the documentation, I tried:

vaex.open('file.csv', convert='file.hdf5', progress=True)

After many hours with no visible progress, the HDF5 file is still only a few bytes in size.

I also tried the older approach:

vaex.open('file.csv').export_hdf5('file.hdf5', progress=True)

This quickly creates a 7 KB file, but nothing happens afterwards either.

In both cases, the file contains /table/columns but no column definitions.
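
To tell a true stall apart from a very slow write, the size of the output file can be sampled over time from a separate process. The following is a minimal sketch using only the Python standard library; `sample_file_size` and `is_stalled` are hypothetical helper names and are not part of vaex.

```python
import os
import time


def sample_file_size(path, interval=60.0, samples=5):
    """Return a list of file sizes in bytes, sampled every `interval` seconds.

    Hypothetical helper for diagnosing a stalled export; not a vaex function.
    """
    sizes = []
    for i in range(samples):
        # A file that does not exist yet is recorded as 0 bytes.
        sizes.append(os.path.getsize(path) if os.path.exists(path) else 0)
        # Sleep between samples, but not after the last one.
        if i < samples - 1:
            time.sleep(interval)
    return sizes


def is_stalled(sizes):
    """Return True when at least two samples were taken and all are equal."""
    return len(sizes) >= 2 and len(set(sizes)) == 1
```

For example, `is_stalled(sample_file_size('file.hdf5'))` could be run while the conversion is in progress. A constant size is only a hint: if the writer preallocates the file and fills it in place, the size may stay fixed while data is still being written.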

Software information

  • Vaex version (import vaex; vaex.__version__):
{'vaex': '4.17.0',
'vaex-core': '4.17.1',
'vaex-viz': '0.5.4',
'vaex-hdf5': '0.14.1',
'vaex-server': '0.9.0',
'vaex-astro': '0.9.3',
'vaex-jupyter': '0.8.2',
'vaex-ml': '0.18.3'}
  • Vaex was installed via: pip
  • OS: linux

mfouesneau, Aug 20 '23 16:08