Matthew Powers
@boonware - I wouldn't expect reading multiple files into a pandas DataFrame via delta-rs to provide low latency queries. Query engines that are optimized to read multiple files in parallel...
@wjones127 - yea, I could go either way on this too. My opinion on `import deltalake as dl` being the "best" import pattern is weakly held. I think this import...
@xianwill - The steps you've outlined sound good to me, and I think small file compaction via `OPTIMIZE` would be a great addition to this library. I also agree that the first...
@wjones127 - oh, wow, looks like amazing progress is being made, so exciting!
@wjones127 - looks like #607 was merged!! Will it be relatively easy to expose this functionality via the Python bindings?
@alfonsorr - that organization sounds good to me. Sounds similar to what we have. `mrpowers.bebe` is for the Bebe typed and `org.apache.spark.sql` is for the Bebe functions. Unless you have...
@zero323 - we're building a project to [expose the Spark functions that are in the SQL API but not in the Scala API](https://github.com/MrPowers/bebe/issues/16). We'd also like to expose the "missing"...
@nchammas - we've added the functions that are in the SQL API, but missing in the Scala API, to this project. [weekday](https://spark.apache.org/docs/latest/api/sql/index.html#weekday) is an example of a function that was...
@nchammas - I think we'll need to get this on PyPI so users can add it as a regular project dependency, right? [pyspark](https://pypi.org/project/pyspark/) and [spark-testing-base](https://pypi.org/project/spark-testing-base/) are in PyPI so it must...
@alfonsorr - good questions. Feel free to update the list and just add something like "won't add" to the functions that shouldn't get implemented. This list was generated by a...