testing: Weaviate impl

Open emekaokoli19 opened this issue 1 year ago • 4 comments

/claim #74

@dhruv-anand-aintech Please take a look


:rocket: This description was created by Ellipsis for commit 5e2a59276706d3d3f00abdabb684d17ddd615d1e

Summary:

Enhances Weaviate export and import functionalities with support for multiple connection types, batch processing, and integration of OpenAI API key, with import functionality needing completion.

Key points:

  • Added support for local and cloud connections in ExportWeaviate and ImportWeaviate.
  • Integrated OpenAI API key handling in Weaviate connections.
  • Implemented batch processing for exporting data to Parquet files in ExportWeaviate.
  • Prepared structure for importing data from Parquet files in ImportWeaviate, but actual data upsert code is commented out.
  • Environment variables are used for connection details, and progress is visualized using tqdm.

Generated with :heart: by ellipsis.dev

emekaokoli19, May 02 '24 08:05

@emekaokoli19 did you test the newest version of the code with a weaviate instance?

dhruv-anand-aintech, May 06 '24 07:05

> @emekaokoli19 did you test the newest version of the code with a weaviate instance?

@dhruv-anand-aintech I am sorry for the late reply. I was out for a few days, but I am back now. Yes, the code was tested with a Weaviate instance, and it works.

emekaokoli19, May 13 '24 10:05

@emekaokoli19 you need to make sure that it works for any import command, e.g.:

import_vdf \
    --max_num_rows 1000 \
    --hf_dataset aintech/vdf_prefix-cache \
    weaviate

Here it asks me for an index name, but it should be using the name from the VDF_META.json file (see the other DBs' implementations).
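A rough sketch of taking the index name from the metadata file instead of prompting. It assumes VDF_META.json has a top-level "indexes" mapping keyed by index name; that layout, and the helper names `load_vdf_meta` and `index_names_from_meta`, are assumptions for illustration and should be checked against the other DBs' import code.

```python
import json
from pathlib import Path


def load_vdf_meta(dataset_dir):
    """Read VDF_META.json from a dataset directory (hypothetical helper)."""
    with open(Path(dataset_dir) / "VDF_META.json") as f:
        return json.load(f)


def index_names_from_meta(meta):
    """Return the index names recorded in the metadata.

    Assumes a top-level "indexes" mapping of index name -> namespace entries.
    Returns an empty list when the key is missing, so the caller can decide
    whether to fall back to prompting the user.
    """
    return list(meta.get("indexes", {}).keys())


if __name__ == "__main__":
    # Made-up sample metadata, only to show the call shape.
    sample_meta = {"indexes": {"example-index": []}}
    print(index_names_from_meta(sample_meta))
```

With something like this, the importer would only need to ask for an index name when the metadata does not provide one.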

It also fails because openai_api_key is not set. This is not a required API key for importing datasets. You need to make sure the code works without it being set (using the .get("openai_api_key", "") syntax).
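A minimal sketch of the optional-key pattern, assuming the connection arguments arrive as a plain dict. The helper name `build_headers` is hypothetical, and passing the key as an `X-OpenAI-Api-Key` header is an assumption about how the connection is configured, not a description of the PR's code.

```python
def build_headers(args):
    """Build extra request headers, adding the OpenAI key only when supplied."""
    # args["openai_api_key"] would raise KeyError when the key is absent;
    # .get() with a default returns "" instead, so the import can proceed.
    openai_api_key = args.get("openai_api_key", "")
    headers = {}
    if openai_api_key:
        # Assumed header name; only sent when a key was actually provided.
        headers["X-OpenAI-Api-Key"] = openai_api_key
    return headers


if __name__ == "__main__":
    print(build_headers({}))
    print(build_headers({"openai_api_key": "sk-placeholder"}))
```

Treating the empty string the same as a missing key keeps the import path working whether the variable is unset or set to an empty value.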

Let's have a pair programming session to finish off this task. Please message me on Discord to set a time.

dhruv-anand-aintech, May 15 '24 11:05