Ben Chambers

Results 83 comments of Ben Chambers

My latest PR (#495) should write multiple files. @kevinjnguyen once that goes in, would you be able to verify that everything works with the Python client support?

Also, the instructions for running integration tests using Docker seem broken. `make test/int/docker-up` produces the following errors:

```
kaskada | /bin/wren: /lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /bin/wren)
kaskada exited...
```

Also, once the local binaries are running, things still fail because Pulsar is required as well.

As an example, if we had a simple "as-of" join that produced 3 copies of a struct, this would be the difference between: ``` without dictionaries = [{a: 5, b:...
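To make the dictionary idea concrete, here is a minimal pure-Python sketch of dictionary encoding. It is only an illustration under assumptions: the field names (`id`, `name`), the values, and the helpers `dictionary_encode` / `dictionary_decode` are hypothetical and not taken from the codebase or from the truncated example above.

```python
import json


def dictionary_encode(values):
    """Split values into (dictionary, indices).

    Each distinct value is stored once in `dictionary`; `indices` holds,
    for every input position, the slot of its value in `dictionary`.
    """
    dictionary = []
    positions = {}  # canonical JSON text -> slot in `dictionary`
    indices = []
    for value in values:
        # Dicts are not hashable, so use a canonical JSON string as the key.
        key = json.dumps(value, sort_keys=True)
        if key not in positions:
            positions[key] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[key])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original list from the dictionary and the indices."""
    return [dictionary[i] for i in indices]


# Hypothetical struct repeated three times, as a join might produce.
row = {"id": 1, "name": "x"}
plain = [row, dict(row), dict(row)]

dictionary, indices = dictionary_encode(plain)
```

With three copies of the same struct, the encoded form keeps the struct once plus three small integers, which is where the savings would come from.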

@kerinin This would likely make it much faster when working with datasets that have been split into many files. We may want to consider prioritizing this work.

Hmm... actually, that raises an interesting point. We currently generate these files as part of prepare. But there is no reason we do it for prepare vs when the file...

We've also talked about supporting `unnest` and possibly `nest`, similar to BigQuery (and others). This would require some of the "bag semantics" work to support multiple simultaneous values.
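For reference, the general semantics of `unnest` and `nest` can be sketched in plain Python. This is not a proposed API: the function signatures, column names, and data below are assumptions for illustration only, and dropping rows with empty collections is one possible choice (it mirrors a cross join against the unnested array).

```python
def unnest(rows, column):
    """Emit one output row per element of the list stored in `column`.

    Rows whose list is empty produce no output rows.
    """
    out = []
    for row in rows:
        for item in row[column]:
            new_row = dict(row)  # copy so the input row is left untouched
            new_row[column] = item
            out.append(new_row)
    return out


def nest(rows, key, column):
    """Inverse sketch: collect the `column` values of rows sharing `key`."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[key], []).append(row[column])
    return [{key: k, column: values} for k, values in grouped.items()]


# Hypothetical input: one row with two elements, one with none.
rows = [
    {"user": "u1", "tags": ["a", "b"]},
    {"user": "u2", "tags": []},
]

flat = unnest(rows, "tags")
renested = nest(flat, "user", "tags")
```

Note that `flat` holds two rows for the same `user`, which is the "multiple simultaneous values" situation that bag semantics would need to handle.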

In the shortest term, we could even support collections but leave them as "opaque" columns. Basically -- read them in, and if they are plumbed through to the output, write...

This also seems like a partial duplicate of #367. These should possibly be merged.

I think (as #367 identified) this would also require some support for generic types. Specifically, I think we would need something like. The specific methods / behaviors are TBD, but the...