friendlier error messages in nearest API
Currently in the nearest parameter dict:
- if the `column` exists but is not, say, a list column, then tokio panics and the error looks very scary
- if the `q` is a list/array but the dimensionality doesn't match the vectors, also a scary-looking tokio panic error
These are actually schema errors, so they should just be caught in Python.
The second case is tricky, since pyarrow defaults to ListType rather than FixedSizeListType for vector columns, so the vector dimension is not recorded in the schema.
Is there another way besides, say, sampling the first 10 values in the embeddings column and checking that they have the same dimension as `q`?
    import lance
    import numpy as np
    import pandas as pd
    import pyarrow as pa
    import pyarrow.dataset

    df = pd.DataFrame({"a": [5], "b": [10]})
    tbl = pa.Table.from_pandas(df)
    ds = lance.write_dataset(tbl, "/tmp/test.lance")
    ds.to_table(nearest={'column': 'a', 'q': np.random.randn(128), 'k': 10})
ValueError: LanceError(IO): KNNFlatExec node: query column a is not a vector
Resolved with #1336.