Sweep: Export from Pinecone fails with a data type error
Details
Fetching namespaces: 0% 0/1 [02:54<?, ?it/s]
Error: ("Could not convert '1719697028.0' with type str: tried to convert to double", 'Conversion failed for column created_at with type object')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/vdf_io/export_vdf_cli.py", line 89, in main
    run_export(span)
  File "/usr/local/lib/python3.10/dist-packages/vdf_io/export_vdf_cli.py", line 149, in run_export
    export_obj = slug_to_export_func[args["vector_database"]](args)
  File "/usr/local/lib/python3.10/dist-packages/vdf_io/export_vdf/pinecone_export.py", line 164, in export_vdb
    pinecone_export.get_data()
  File "/usr/local/lib/python3.10/dist-packages/vdf_io/export_vdf/pinecone_export.py", line 481, in get_data
    index_meta = self.get_data_for_index(index_name)
  File "/usr/local/lib/python3.10/dist-packages/vdf_io/export_vdf/pinecone_export.py", line 575, in get_data_for_index
    total_size += self.save_vectors_to_parquet(
  File "/usr/local/lib/python3.10/dist-packages/vdf_io/export_vdf/vdb_export_cls.py", line 87, in save_vectors_to_parquet
    df.to_parquet(parquet_file)
  File "/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py", line 2970, in to_parquet
    return to_parquet(
  File "/usr/local/lib/python3.10/dist-packages/pandas/io/parquet.py", line 483, in to_parquet
    impl.write(
  File "/usr/local/lib/python3.10/dist-packages/pandas/io/parquet.py", line 189, in write
    table = self.api.Table.from_pandas(df, **from_pandas_kwargs)
  File "pyarrow/table.pxi", line 3874, in pyarrow.lib.Table.from_pandas
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/pandas_compat.py", line 624, in dataframe_to_arrays
    arrays[i] = maybe_fut.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/pandas_compat.py", line 598, in convert_column
    raise e
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/pandas_compat.py", line 592, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 340, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 86, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: ("Could not convert '1719697028.0' with type str: tried to convert to double", 'Conversion failed for column created_at with type object')
Exporting fluidaigpt-dev: 0% 0/1 [02:56<?, ?it/s]
Final Step: Fetching vectors: 100% 14404/14404 [02:39<00:00, 90.24it/s]
Branch
No response
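To resolve the data type error during export from Pinecone, modify the save_vectors_to_parquet method in /src/vdf_io/export_vdf/vdb_export_cls.py so that the created_at column holds a single numeric type before the DataFrame is written. The traceback shows that the column has dtype object and contains the string '1719697028.0', which pyarrow cannot convert while building a double column. Add the following code before calling df.to_parquet(parquet_file):
    if 'created_at' in df.columns:
        df['created_at'] = df['created_at'].astype(float)
This casts every value in the created_at column to float. Note that astype(float) raises an error if any value cannot be parsed as a number.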
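If the column might also hold values that are not numeric at all, a more tolerant variant is possible. The following is only a sketch: the helper name coerce_numeric_column and the sample data are made up for illustration and are not part of vdf_io. It uses pandas.to_numeric with errors="coerce", which turns unparseable values into NaN instead of raising.

```python
import pandas as pd


def coerce_numeric_column(df: pd.DataFrame, column: str = "created_at") -> pd.DataFrame:
    """Return a copy of df with `column` converted to a numeric dtype.

    Hypothetical helper, not part of vdf_io. Values that cannot be parsed
    as numbers become NaN rather than raising an error.
    """
    if column not in df.columns:
        return df
    out = df.copy()
    out[column] = pd.to_numeric(out[column], errors="coerce")
    return out


# Made-up sample mirroring the failure: floats and numeric strings mixed
# in one object-dtype column, plus one value that is not a number.
sample = pd.DataFrame(
    {
        "id": ["a", "b", "c"],
        "created_at": [1719697028.0, "1719697028.0", "not-a-number"],
    }
)

cleaned = coerce_numeric_column(sample)
print(cleaned["created_at"].dtype)
print(cleaned)
```

Whether silently turning bad values into NaN is acceptable depends on how created_at is used downstream; if the original strings need to survive the export, casting the whole column with astype(str) would be the alternative.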
What type is the created_at column in your original index?
