weaviate-python-client

feature: allow 1D numpy array

Open tibor-reiss opened this issue 1 year ago • 8 comments

Fixes #1002

This is the simplest fix that allows a 1D numpy array to be passed in (as well as anything else that the function util.get_vector can convert). If the input is invalid, a TypeError is raised, which is later converted to a WeaviateInvalidInputError.

Note that passing in anything other than types.VECTOR, e.g. a numpy array, will cause mypy to report a type error.
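
To illustrate the error flow described above, here is a minimal sketch, not the client's actual code. `coerce_vector`, `validate_vector` and `InvalidInputError` are hypothetical names; `InvalidInputError` only stands in for WeaviateInvalidInputError, and the duck-typed `.tolist()` check is an assumption about how array-like input could be detected without importing numpy.

```python
from typing import Any, List


class InvalidInputError(Exception):
    """Hypothetical stand-in for the client's WeaviateInvalidInputError."""


def coerce_vector(vector: Any) -> List[float]:
    """Return a flat list of floats, or raise TypeError for unsupported input."""
    # numpy arrays expose .tolist(); for a 1D array it returns a flat Python list.
    if hasattr(vector, "tolist"):
        vector = vector.tolist()
    if not isinstance(vector, (list, tuple)):
        raise TypeError(f"unsupported vector type: {type(vector).__name__}")
    # Nested lists (e.g. from a 2D array) and non-numeric entries are rejected.
    if any(isinstance(x, bool) or not isinstance(x, (int, float)) for x in vector):
        raise TypeError("vector must be one-dimensional and contain only numbers")
    return [float(x) for x in vector]


def validate_vector(vector: Any) -> List[float]:
    """Convert the low-level TypeError into the user-facing input error."""
    try:
        return coerce_vector(vector)
    except TypeError as exc:
        raise InvalidInputError(str(exc)) from exc
```

With this shape, callers only ever see the input error, while the original TypeError remains available as the exception's `__cause__`.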

tibor-reiss avatar Jul 18 '24 19:07 tibor-reiss

Hey, thank you for your contribution! Would you be able to add a test here: integration/test_batch_v4.py? Numpy is available as a test dependency

dirkkul avatar Jul 19 '24 04:07 dirkkul

To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.

beep boop - the Weaviate bot 👋🤖

PS:
Are you already a member of the Weaviate Slack channel?

weaviate-git-bot avatar Jul 19 '24 10:07 weaviate-git-bot

agree with the CLA

tibor-reiss avatar Jul 21 '24 06:07 tibor-reiss

Hi @dirkkul, I think it should be fine now. The two tests that failed previously (a DB issue and a gRPC timeout) were not related to this PR.

tibor-reiss avatar Jul 26 '24 18:07 tibor-reiss

@dirkkul friendly reminder: can this branch be merged?

tibor-reiss avatar Sep 11 '24 19:09 tibor-reiss

Hey, I think you are mixing two things up:

  • how users supply vectors
  • how we handle them internally (vector bytes, packing, etc.)

We want user-supplied vectors that are numpy arrays to be converted to Python lists. So if you do

                _BatchObject(
                    collection=self.name,
                    vector=_get_vector_v4(obj.vector),
                    uuid=str(obj.uuid if obj.uuid is not None else uuid_package.uuid4()),
                    properties=cast(dict, obj.properties),
                    tenant=self._tenant,
                    references=obj.references,
                    index=idx,
                )

it should take care of things.
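
As a rough, self-contained sketch of that separation (user input is normalized once, at the point where the batch object is built), assuming hypothetical names throughout: `BatchItem`, `normalize_vector` and `make_batch_item` are illustrative only and are not the client's `_BatchObject` or `_get_vector_v4`.

```python
import uuid as uuid_package
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class BatchItem:
    """Hypothetical internal record; it only ever holds plain Python lists."""
    collection: str
    vector: Optional[List[float]]
    uuid: str
    index: int


def normalize_vector(vector: Any) -> Optional[List[float]]:
    """Turn user-supplied input (list, tuple, numpy-like array) into a list."""
    if vector is None:
        return None
    # numpy arrays expose .tolist(); plain sequences are iterated directly.
    if hasattr(vector, "tolist"):
        vector = vector.tolist()
    return [float(x) for x in vector]


def make_batch_item(collection: str, vector: Any, index: int,
                    uuid: Optional[str] = None) -> BatchItem:
    # Conversion happens here, so later internal steps (e.g. packing into
    # bytes) never need to know what type the user originally passed in.
    return BatchItem(
        collection=collection,
        vector=normalize_vector(vector),
        uuid=str(uuid if uuid is not None else uuid_package.uuid4()),
        index=index,
    )
```

The point of the design is that the user-facing type can be permissive while everything downstream works with a single, predictable representation.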

dirkkul avatar Sep 26 '24 05:09 dirkkul

Hi @dirkkul, thanks for your patience, fingers crossed I got it now...

tibor-reiss avatar Sep 26 '24 18:09 tibor-reiss

Friendly ping @dirkkul

tibor-reiss avatar Nov 01 '24 06:11 tibor-reiss