vaex
Unable to open gzipped CSVs [BUG-REPORT]
I am trying to use vaex.open() on a gzipped CSV and am getting this error:
OSError: Cannot open ./folder/filename.csv.gz nobody knows how to read it.
Software information
- Vaex version (import vaex; vaex.__version__): 4.9
- Vaex was installed via: pip
Additional information
Pandas is able to read gzipped CSVs without issue. Do you plan on adding support at any point? A more descriptive error for this use case would also be helpful: 'nobody knows how to read it' is a little vague, although I understand you are likely using it as an umbrella error for many unsupported formats.
This same issue arises when using vaex.open_many(). However, opening the file using vaex.from_csv and specifying compression='GZIP' was successful.
This is somewhat related to: https://github.com/vaexio/vaex/issues/1879
In a nutshell, for compressed files (csv, json) you need to use the right method and specify the compression type. Essentially, there is currently no better way to open the file than what you've found.
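A minimal sketch of that workaround, in case it helps anyone landing here. The helper names csv_compression and open_csv and the suffix table are hypothetical, not part of vaex. The sketch assumes that extra keyword arguments to vaex.from_csv are forwarded to pandas.read_csv, and it uses the lowercase compression names from the pandas documentation.

```python
from pathlib import Path

# Hypothetical lookup table (not part of vaex): file suffix -> pandas compression name.
_SUFFIX_TO_COMPRESSION = {".gz": "gzip", ".bz2": "bz2", ".xz": "xz", ".zip": "zip"}


def csv_compression(path):
    """Return the pandas compression name for a CSV path, or None if uncompressed."""
    return _SUFFIX_TO_COMPRESSION.get(Path(path).suffix.lower())


def open_csv(path):
    """Open a possibly compressed CSV as a vaex DataFrame (hypothetical helper)."""
    import vaex  # imported lazily so csv_compression works without vaex installed

    compression = csv_compression(path)
    if compression is None:
        # Uncompressed file: let vaex pick the reader itself.
        return vaex.open(path)
    # Compressed file: use the CSV reader explicitly and name the compression.
    # Assumption: the extra keyword argument is passed through to pandas.read_csv.
    return vaex.from_csv(path, compression=compression)
```

With this, open_csv("./folder/filename.csv.gz") would take the vaex.from_csv branch, while a plain .csv path would still go through vaex.open.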
@maartenbreddels maybe we can consider adding the csv reader to the list of openers to try as the final fallback?
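Until something like that exists in the library, the fallback idea can be approximated on the user side. This is only a rough sketch: open_with_fallback is a hypothetical helper, not a vaex function, and it assumes that a failed opener raises OSError, as in the traceback reported above.

```python
def open_with_fallback(path, openers):
    """Try each opener in order and return the first result that does not raise OSError."""
    errors = []
    for opener in openers:
        try:
            return opener(path)
        except OSError as exc:
            # Remember why this opener failed so the final error is descriptive.
            name = getattr(opener, "__name__", repr(opener))
            errors.append(f"{name}: {exc}")
    raise OSError(f"Cannot open {path}; tried: " + "; ".join(errors))
```

Calling open_with_fallback(path, [vaex.open, vaex.from_csv]) would try the regular openers first and the CSV reader last. Since pandas infers compression from the file extension by default, the last step should handle a .csv.gz file, though I have not verified this across vaex versions.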
I think we need to explore the option mentioned in https://github.com/vaexio/vaex/issues/1879#issuecomment-1033691134 first
I believe this is now possible in the new release, thanks to @maartenbreddels.
Please re-open if needed.