Wes McKinney

Results: 203 comments by Wes McKinney

+1, the more easily reproducible the performance numbers the better (e.g. providing a Dockerfile would be ideal). This way users can validate performance on various hardware configurations. I also recommend...
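For illustration, a minimal benchmark Dockerfile along those lines might look like this (base image, file names, and the `run_benchmarks.py` script are all hypothetical placeholders, not from any actual project):

```dockerfile
# Hypothetical reproducible-benchmark image; pinning the base image and
# dependencies makes numbers comparable across hardware configurations.
FROM python:3.11-slim
WORKDIR /bench
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "run_benchmarks.py"]
```

Anyone can then `docker build` and `docker run` the same environment on their own machine and compare results.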

While you're at it, it would be nice to plot a course to `conda install weld` and get all the Python things in a single `import weld` statement. This probably...

You can look at what we did in Apache Arrow with manylinux1: https://github.com/wesm/arrow/blob/master/python/manylinux1/build_arrow.sh and https://github.com/wesm/arrow/blob/master/python/setup.py#L210 so all the shared libs (built with CMake) get bundled in the wheel. Probably possible...

conda is the easiest way since you can package `libweld` (the shared libraries) and `weld-python` (the Python package and C extensions) as separate components.
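Roughly, that split would mean two conda recipes, with the Python package depending on the library package (these `meta.yaml` fragments are hypothetical sketches, not actual Weld recipes; names and versions are invented):

```yaml
# Hypothetical meta.yaml for the shared-library package
package:
  name: libweld
  version: "0.1.0"
requirements:
  build:
    - cmake
---
# Hypothetical meta.yaml for the Python bindings, pinned to libweld
package:
  name: weld-python
  version: "0.1.0"
requirements:
  run:
    - libweld ==0.1.0
    - python
```

This way the C++ library can be rebuilt or upgraded independently of the Python bindings.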

There seems to be some GitHub snafu right now, so all the Apache git mirrors on GitHub are down at the moment.

I'm very interested in the subgraph compiler problem for Arrow. It might be that we need to define a slightly higher level Arrow analytics IR that lowers to Weld DSL...

The notion of "arbitrary input data formats" is potentially a rathole. Beyond non-nullable tensor-like memory (i.e. the NumPy ndarray model), packed record / row-oriented tables (similar to Spark's Tungsten "off...
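To make the row-versus-column distinction concrete, here is a small standard-library illustration (the record layout is invented for the example) of the same records in a packed row-oriented buffer versus one contiguous buffer per column:

```python
import struct
from array import array

# Row-oriented: each record packed back-to-back as (int64 id, float64 value),
# similar in spirit to an "off-heap" packed record format.
row_fmt = struct.Struct("<qd")
rows = b"".join(row_fmt.pack(i, i * 1.5) for i in range(3))

# Column-oriented (the NumPy ndarray model): one contiguous buffer per field.
ids = array("q", range(3))
values = array("d", (i * 1.5 for i in range(3)))

# Reading field "value" of record 1 from each layout:
_, v_row = row_fmt.unpack_from(rows, 1 * row_fmt.size)  # strided access
v_col = values[1]                                       # direct index
assert v_row == v_col == 1.5
```

Scanning a single field touches every record's bytes in the row layout but only one contiguous buffer in the columnar layout, which is much of why supporting "arbitrary input formats" efficiently is so hard.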

@sursu indeed one of my primary motivations in developing the Apache Arrow project (which has more or less been my primary focus since sometime in 2015) is to develop next-generation...

We could definitely have a mutating append and write into resizable buffers (with growth factor 1.5 or 2). Something we can experiment with.
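As a rough sketch of what such a mutating append might look like (illustrative Python only, not the Arrow implementation; the class and method names are invented):

```python
class GrowableBuffer:
    """Append-only byte buffer that grows by a fixed factor (here 2),
    giving amortized O(1) appends at the cost of some slack capacity."""

    def __init__(self, initial_capacity=64, growth_factor=2):
        self._buf = bytearray(initial_capacity)
        self._size = 0
        self._growth = growth_factor

    def append(self, data: bytes) -> None:
        needed = self._size + len(data)
        if needed > len(self._buf):
            # Grow geometrically until the new data fits (one reallocation).
            new_cap = len(self._buf)
            while new_cap < needed:
                new_cap = int(new_cap * self._growth)
            self._buf.extend(bytearray(new_cap - len(self._buf)))
        self._buf[self._size:needed] = data
        self._size = needed

    def view(self) -> bytes:
        """Return only the written portion, not the slack capacity."""
        return bytes(self._buf[:self._size])
```

With factor 2 each element is copied at most twice on average; 1.5 wastes less memory but reallocates more often.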