Colin Dellow
The nature of the Parquet format is such that you can't really update an existing file. (Wellllll, maybe it's technically possible to do surgery and add new row groups and...
> So if you have lots of CSV files then (3) is good, although how would this work with xargs for example?

Wouldn't it just work? :) e.g.: `find -type...
Is there a command line flag or clasp.json setting that needs to be enabled? With clasp 2.3.2, I still get a zero exit code if I push code that explicitly...
Hm, interestingly, this does not fail when added as a test case. That made me think it was something specific to the python test harness, but now the same query...
While it's overwhelmingly likely that this issue is in my code, it's also unclear whether 18.04 is a supported target for arrow/parquet-cpp yet -- http://mail-archives.apache.org/mod_mbox/arrow-dev/201806.mbox/%[email protected]%3E talks about adding it to...
The intent was that it doesn't uncompress the Parquet file fully into memory, although I don't think that I specifically tested that. The library itself doesn't create any buffers for...
Separate threads would make sense - I think someone else mentioned that since each column is compressed separately, there's a lot of opportunity for parallelism. If we did add that,...
BTW - this repo is largely inactive now. In fact, I see that you have a PR from a few months ago that I failed to notice :( Would you...
Great, invite sent!
I'd like to support this more seamlessly in the future, either by supporting a glob or, like Hive, taking a directory to query. There are some internal design things I'd...