Jay Chia

Results 70 issues of Jay Chia

This PR adds support for GCS as a datastore using the GCS Python client (no optimizations on top of the vanilla client). All of the patterns adopted here were taken...

When the `packaging` module is unavailable, fall back to `distutils`, following the pattern in this PR: https://github.com/ray-project/ray/pull/28315

## Why are these changes needed?

Python versions < 3.10 will otherwise...

stale

Arrow2 already supports Parquet FixedLenByteArray -> Decimal conversion. This PR adds support for Parquet (variable-length) ByteArray -> Decimal conversion, reusing most of the logic from the FixedLenByteArray conversion.
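To illustrate what a ByteArray -> Decimal conversion involves, here is a minimal Python sketch. It is not the arrow2 Rust code. It assumes the Parquet convention that a byte-array-backed DECIMAL holds the unscaled integer in big-endian two's complement, with the scale coming from the column's type metadata. The helper name `decode_parquet_decimal` is hypothetical.

```python
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode one byte-array-backed DECIMAL value (hypothetical helper)."""
    # Interpret the bytes as a big-endian two's-complement unscaled integer.
    # The same call works for fixed-length and variable-length inputs.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)

    # Build the Decimal from (sign, digits, exponent) so that the result is
    # exact and does not depend on the current decimal context precision.
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))
```

Because the decoding only depends on the bytes and the scale, the variable-length case can share its logic with the fixed-length case, which matches the reuse described in the PR.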

This PR adds logic to parse any Parquet fields with `Unknown` LogicalType as the Arrow `Null` DataType.

I am running into an issue where reading Parquet int96 timestamps into arrow2 `timestamp[ns]` arrays can silently overflow and produce wrong results. This issue was also noted in pyarrow/arrow-cpp [ARROW-12096](https://issues.apache.org/jira/browse/ARROW-12096)....
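The overflow can be reproduced outside arrow2 with a small Python sketch. It assumes the commonly used int96 layout: 8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number. The function name `int96_to_epoch_nanos` and the choice to raise `OverflowError` are assumptions made for illustration, not arrow2 or pyarrow behavior.

```python
import struct

JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588  # Julian day number of 1970-01-01
NANOS_PER_DAY = 86_400 * 1_000_000_000
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def int96_to_epoch_nanos(raw: bytes) -> int:
    """Convert a 12-byte int96 timestamp to nanoseconds since the Unix epoch."""
    if len(raw) != 12:
        raise ValueError("an int96 timestamp must be exactly 12 bytes")

    # "<qi": little-endian int64 (nanoseconds of day), then int32 (Julian day).
    nanos_of_day, julian_day = struct.unpack("<qi", raw)

    # Python integers are arbitrary precision, so this sum cannot wrap around.
    total = (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day

    # A timestamp[ns] value is stored as int64, so reject anything that would
    # not fit instead of letting it wrap silently.
    if not INT64_MIN <= total <= INT64_MAX:
        raise OverflowError("timestamp does not fit in a 64-bit nanosecond value")
    return total
```

A 64-bit nanosecond count only covers roughly the years 1677 to 2262, so dates outside that window are the ones that trigger the check.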

# Code Pull Requests

This PR introduces the DaftDataFrameEngine, which is a DataFrameEngine implementation that is backed by [Daft](https://github.com/Eventual-Inc/Daft/tree/main/daft). This has several advantages:

1. Daft is easy to use locally...

### Which new backend would you like to see in Ibis?

Hi! I would like to explore building an Ibis backend for Daft (www.getdaft.io). I am one of the...

feature
new backend

This would allow for access to publicly available datasets without valid S3 credentials: handy for public demos hosted on Google Colab! I did not see anything available here: https://lancedb.github.io/lance/read_and_write.html#s3-configuration

Video URL: https://www.youtube.com/watch?v=ol6IQUbyeDo&t=1s&ab_channel=PyData

# Contents

- 0:13 - Speaker Bios
- 1:02 - What is Daft?
- 2:05 - Defining "Complex Data"
- 4:28 - Summary: 3 things that make data "Complex"
- 6:01 -...