Support for compressed CSV files

Open Enchufa2 opened this issue 7 years ago • 4 comments

Most CSV readers (notably the base R readers) transparently support compressed CSV files (gzip, bzip2, ...). This would be a nice addition to MonetDBLite::monetdb.read.csv, because big CSVs are commonly gzipped.

Enchufa2 avatar Jun 06 '18 13:06 Enchufa2

Yes, we removed this feature because it adds a dependency, which is a pain, especially on Windows.

hannes avatar Jun 06 '18 14:06 hannes

What about allowing an external command to be specified? Would that be possible? For example, data.table::fread lets you read a compressed CSV as follows:

data.table::fread("zcat somefile.csv.gz")

zcat is invoked and its output feeds the reader.

Enchufa2 avatar Jun 06 '18 14:06 Enchufa2
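
For readers unfamiliar with the external-command idea, here is a minimal base R sketch of the same mechanism: a pipe() connection runs a decompressor and hands its standard output to a CSV reader. The helper name read_csv_cmd is hypothetical and is not part of MonetDBLite or data.table. The sketch also assumes a gzip binary is available on the PATH, which is the kind of platform dependency discussed above.

```r
# Hypothetical helper: run an external command and parse its standard
# output as CSV. pipe() starts the command; read.csv opens the connection,
# reads it to the end, and closes it again.
read_csv_cmd <- function(cmd, ...) {
  utils::read.csv(pipe(cmd), ...)
}

# Demo data: write a small gzip-compressed CSV to a temporary file.
cmd_demo_path <- tempfile(fileext = ".csv.gz")
cmd_demo_con <- gzfile(cmd_demo_path, "w")
utils::write.csv(head(mtcars), cmd_demo_con, row.names = FALSE)
close(cmd_demo_con)

# Assumption: `gzip` is installed; `gzip -dc` decompresses to stdout.
# shQuote() protects the path from shell interpretation.
cmd_demo_df <- read_csv_cmd(paste("gzip -dc", shQuote(cmd_demo_path)))
```

An import function built along these lines would presumably need to write the command output to a temporary file, or read it in chunks, before handing it to the database; that part is left out here.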

That sounds pretty good. I am unlikely to implement this at the moment. Happy to review a PR though.

hannes avatar Jun 07 '18 06:06 hannes

Another quick and portable option would be to rely on the R.utils package, which implements gunzip and the like on top of base R (by efficiently copying from a gzfile, bzfile, ... connection to a file connection). But this would mean adding R.utils as a dependency (or porting just the relevant code to MonetDBLite). What do you think?

Enchufa2 avatar Jun 07 '18 11:06 Enchufa2
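
The connection-copying approach mentioned in the last comment can be sketched with base R alone. This is only an illustration under stated assumptions, not the R.utils implementation: the helper name decompress_to_temp and the chunk size are hypothetical. It relies on the documented behavior that gzfile(), when opened for reading, also handles bzip2- and xz-compressed files as well as uncompressed ones.

```r
# Hypothetical helper (not part of MonetDBLite or R.utils): copy a
# compressed file to a temporary plain file using only base R connections.
decompress_to_temp <- function(path, chunk_size = 1024 * 1024) {
  out_path <- tempfile(fileext = ".csv")
  src <- gzfile(path, open = "rb")    # decompresses transparently on read
  dst <- file(out_path, open = "wb")  # plain binary output
  on.exit({
    close(src)
    close(dst)
  }, add = TRUE)
  repeat {
    # Read up to chunk_size decompressed bytes; an empty result means EOF.
    chunk <- readBin(src, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break
    writeBin(chunk, dst)
  }
  out_path
}

# Demo data: a small gzip-compressed CSV in a temporary file.
copy_demo_gz <- tempfile(fileext = ".csv.gz")
copy_demo_con <- gzfile(copy_demo_gz, "w")
utils::write.csv(head(iris), copy_demo_con, row.names = FALSE)
close(copy_demo_con)

# Decompress, then read the plain file with any CSV reader.
copy_demo_plain <- decompress_to_temp(copy_demo_gz)
copy_demo_df <- utils::read.csv(copy_demo_plain)
```

The resulting plain file path is what would then be passed as the files argument of monetdb.read.csv, and the temporary file could be deleted with unlink() once the import finishes. The cost of this design is extra disk space for the uncompressed copy; the benefit is that no external binary or additional package is needed.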