Cory Grinstead
@FounderHy can you provide an example of when this would be needed? the `metric()` function exists on the concrete implementations `LanceVectorQueryBuilder` and `LanceHybridQueryBuilder`
@eddyxu is this issue still relevant? Also, can you remove me as an assignee?
FWIW, most of the time we access `DataType` through `SchemaRef` and `FieldRef`, which are already arc'd
note: this does work for _some_ datasets. ```py df = daft.read_parquet("hf://datasets/universalmind303/daft-docs") df.show() ╭──────────────┬─────────────┬──────────────┬─────────────┬─────────╮ │ sepal_length ┆ sepal_width ┆ petal_length ┆ petal_width ┆ species │ │ --- ┆ --- ┆ ---...
When I originally introduced https://github.com/Eventual-Inc/Daft/pull/2701 I tested on a few other datasets and it seemed to work on all of the ones I tested at the time. I know...
so on further debugging, this is looking like a problem with the Hugging Face APIs. If I run the same query multiple times, it'll eventually go through and return the dataframe....
Have confirmed that it is an issue with the Hugging Face APIs. I suspect it's a CDN/caching issue, as it often responds with the header `x-cache: Error from cloudfront` and a different...
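Since rerunning the same query eventually succeeds, a client-side retry with exponential backoff is one way to paper over the intermittent failures. The following is only a minimal sketch: `retry`, `attempts`, `base_delay`, and the injectable `sleep` are hypothetical names chosen for illustration, not part of Daft or the Hugging Face client.

```python
import time


def retry(fn, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn() until it succeeds or `attempts` calls have failed.

    Hypothetical helper: waits base_delay * 2**i seconds after the
    i-th failure, and re-raises the last error once attempts run out.
    """
    last_err = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as err:  # real code should catch only transient errors
            last_err = err
            if i < attempts - 1:
                sleep(base_delay * 2 ** i)
    raise last_err
```

Usage would look like `retry(lambda: daft.read_parquet(path).collect())`, with the caveat that catching every `Exception` is too broad for anything beyond a quick workaround.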
opened up an [issue](https://github.com/huggingface/datasets/issues/7685) on the datasets repo
> Do you know when such requests are submitted? Are they done only for Parquet footer requests?

for parquet specifically, we do a lot of range requests. Any projection pushdown...
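To make the footer case concrete: a Parquet file ends with a 4-byte little-endian footer length followed by the `PAR1` magic, so a reader typically fetches the last 8 bytes first and then issues a second range request for the footer itself. The sketch below only illustrates that arithmetic; `footer_length` and `footer_range_header` are hypothetical helpers, not Daft's actual reader code.

```python
import struct

PARQUET_MAGIC = b"PAR1"


def footer_length(tail: bytes) -> int:
    """Decode the footer length from the final 8 bytes of a Parquet file."""
    if len(tail) != 8 or tail[4:] != PARQUET_MAGIC:
        raise ValueError("not the tail of a Parquet file")
    return struct.unpack("<I", tail[:4])[0]


def footer_range_header(file_size: int, footer_len: int) -> str:
    """Build an HTTP Range header value covering only the footer metadata.

    HTTP byte ranges are inclusive on both ends, and the trailing
    8 bytes (length + magic) are excluded.
    """
    start = file_size - 8 - footer_len
    end = file_size - 8 - 1
    return f"bytes={start}-{end}"
```

The first request can use the suffix form `Range: bytes=-8`; column chunk reads driven by the footer then add further range requests on top of these two.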
this is still failing on some datasets, as seen in #4907
```
df = daft.read_parquet("hf://datasets/stanfordnlp/snli/")
df.write_parquet("./out")
```