Cory Grinstead

Results 188 comments of Cory Grinstead

@FounderHy can you provide an example of when this would be needed? the `metric()` function exists on the concrete implementations `LanceVectorQueryBuilder` and `LanceHybridQueryBuilder`

@eddyxu is this issue still relevant? Can you also remove me as an assignee?

FWIW, most of the time we access `DataType` through `SchemaRef` and `FieldRef`, which are already arc'd

note: this does work for _some_ datasets.

```py
df = daft.read_parquet("hf://datasets/universalmind303/daft-docs")
df.show()
╭──────────────┬─────────────┬──────────────┬─────────────┬─────────╮
│ sepal_length ┆ sepal_width ┆ petal_length ┆ petal_width ┆ species │
│ --- ┆ --- ┆ ---...
```

When I originally introduced https://github.com/Eventual-Inc/Daft/pull/2701, I tested on a few other datasets, and it seemed to work on all of the ones I tested at the time. I know...

So on further debugging, this looks like a problem with the Hugging Face APIs. If I run the same query multiple times, it'll eventually go through and return the dataframe....
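A minimal sketch of the "run it again until it goes through" behavior described above, assuming the failures are transient. The `retry` helper, its parameters, and the `flaky_read` stand-in are hypothetical and are not part of Daft; in practice `flaky_read` would wrap the actual read call.

```python
import time


def retry(fn, attempts=5, base_delay=0.0, exceptions=(Exception,)):
    """Call fn() until it succeeds or `attempts` is exhausted (hypothetical helper)."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions as err:
            last_error = err
            # Exponential backoff between tries; skipped after the final attempt.
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    raise last_error


# Stand-in for a remote read that fails twice before succeeding.
calls = {"count": 0}


def flaky_read():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "dataframe"


result = retry(flaky_read, exceptions=(ConnectionError,))
```

Restricting `exceptions` to the error type that is actually transient keeps genuine failures (a bad path, a schema error) from being retried silently.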

Have confirmed that it is an issue with Hugging Face's APIs. I suspect it's a CDN/caching issue, as it often responds with the header `x-cache: Error from cloudfront` and a different...

opened up an [issue](https://github.com/huggingface/datasets/issues/7685) on the datasets repo

> Do you know when such requests are submitted? Are they done only for Parquet footer requests?

for parquet specifically, we do a lot of range requests. Any projection pushdown...
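For context on why reading Parquet involves range requests at all, here is a rough sketch based on the general Parquet file layout, not on Daft's actual reader. A Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`, so a reader typically fetches the last 8 bytes first and then the footer metadata itself. The function names below are hypothetical.

```python
import struct

MAGIC = b"PAR1"


def tail_range(file_size):
    """Inclusive byte range of the last 8 bytes: footer length + magic."""
    return (file_size - 8, file_size - 1)


def footer_range(file_size, tail):
    """Given the 8-byte tail, return the inclusive byte range of the footer metadata."""
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet file: missing trailing magic bytes")
    (footer_len,) = struct.unpack("<I", tail[:4])
    start = file_size - 8 - footer_len
    return (start, file_size - 9)


def range_header(start, end):
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


# Fake file for illustration: magic, 100 data bytes, 20 metadata bytes, length, magic.
blob = MAGIC + b"x" * 100 + b"m" * 20 + struct.pack("<I", 20) + MAGIC

t_start, t_end = tail_range(len(blob))
f_start, f_end = footer_range(len(blob), blob[t_start : t_end + 1])
footer_bytes = blob[f_start : f_end + 1]
header = range_header(f_start, f_end)
```

After the footer is parsed, each column chunk a query needs is located by offset, so selecting a subset of columns usually translates into further byte-range reads rather than one full download.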

still is failing on some datasets as seen in #4907

```
df = daft.read_parquet("hf://datasets/stanfordnlp/snli/")
df.write_parquet("./out")
```